Image Sequence Descriptors

Image sequence descriptors control how data is loaded and processed by our tools.

Overview

Skip Robotics tooling that works with image data uses image sequence descriptors to define where the data comes from. At a high level, these descriptors follow the format of a URI.

Image Source URIs

URIs are used to specify a raw image source (images on disk, a USB camera, etc.). The scheme portion of the URI defines the type of source. The path portion of the URI defines where to find the data. Additional options can be added via query strings at the end of the URI.

As an example, the following descriptor gets an image stream from the UVC camera at /dev/video4 at 10 frames per second:

uvc:///dev/video4?fps=10

Each image source URI starts with a scheme specifier. The schemes described below are uvc, aravis, file, rosbag, and video.

A data path or device identifier follows the scheme name. For live image sources this is the identifier of a physical device. For previously recorded data, it is the path to the data's location on disk.

In addition, each image source supports optional parameters specified as URL query arguments. The details and options for each supported image source are described below.
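The scheme / path / query structure described above can be sketched with Python's standard URL parser. This is only an illustration of the layout, not the parser used by the Skip Robotics tools; the function name parse_source_uri and the returned dictionary keys are hypothetical. Flag-style options that have no value (such as stereo or tare_clock in the examples on this page) are kept with an empty string as their value.

```python
from urllib.parse import urlsplit, parse_qs


def parse_source_uri(uri):
    """Split an image source URI into scheme, location, and options.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(uri)
    # keep_blank_values=True keeps flag-style options such as "tare_clock".
    query = parse_qs(parts.query, keep_blank_values=True)
    options = {key: values[-1] for key, values in query.items()}
    # Device ids (aravis) and relative paths (rosbag) start right after "//",
    # so the standard parser reports part of them as the network location.
    location = parts.netloc + parts.path
    return {"scheme": parts.scheme, "location": location, "options": options}


print(parse_source_uri("uvc:///dev/video4?fps=10"))
# {'scheme': 'uvc', 'location': '/dev/video4', 'options': {'fps': '10'}}
```

Joining the network location and the path is a choice made for this sketch so that absolute paths, relative paths, and device ids all come back as a single location string.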

uvc / aravis

The uvc and aravis image sequence drivers support the same set of URL query parameter options:

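The device identifiers used in the base URI differ between the uvc:// and aravis:// sequence types. The examples on this page use a device path such as /dev/video4 for uvc:// and an aravis camera id such as FLIR-1E10015FA3A4-015FA3A4 for aravis://.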

file

File-based image sequences are used to turn a folder full of individual images into a stream. Both monocular and stereo datasets are supported.

The file image sequence URI consists of a filesystem path to a directory, followed by a regular expression for the set of files to process: /path/to/directory/REGEX_EXPRESSION. Image files whose names match the regular expression are turned into an ordered stream using the lexical order of the filenames.

For example, the following sequence descriptor will reference all of the png files in a directory:

file:///path/to/data/image_.*\.png

Note that the regular expression at the end of the path uses .* as a wildcard and \. to escape the dot before the file extension. Regular expressions are only allowed in the filename at the end of the path. Patterns are parsed by the C++ standard library regex implementation (std::regex) and follow its format.
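The matching and ordering rule can be sketched in a few lines of Python. This is an approximation: the tools use the C++ standard library regex parser, whose syntax differs from Python's re module in places, although the simple pattern shown here reads the same in both. The function name select_files is hypothetical, and requiring the whole filename to match is an assumption of this sketch.

```python
import re


def select_files(filenames, pattern):
    """Return the filenames matching `pattern`, in lexical order.

    Hypothetical helper; assumes the pattern must match the whole filename.
    """
    rx = re.compile(pattern)
    return sorted(name for name in filenames if rx.fullmatch(name))


names = ["image_010.png", "notes.txt", "image_002.png", "image_002.jpg"]
print(select_files(names, r"image_.*\.png"))
# ['image_002.png', 'image_010.png']

# Lexical order is not numeric order: without zero padding,
# frame 10 sorts ahead of frame 2.
print(sorted(["image_2.png", "image_10.png"]))
# ['image_10.png', 'image_2.png']
```

Because the stream order is lexical, zero-padded frame numbers keep the frames in capture order.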

To support stereo image sequences, a ?stereo parameter can be added. In this case the regex should refer to all image files for both cameras. The filenames must contain either a left or right identifier string to be associated with the correct camera. The specific identifiers used for the left/right association can be set explicitly by passing ?left_id=X&right_id=Y with two distinct strings for X and Y. For example,

file:///path/to/data/image_.*\.png?stereo

would match image names of the form image_left_000.png, image_right_000.png, etc. The above expression is shorthand for the more explicit form:

file:///path/to/data/image_.*\.png?left_id=left&right_id=right
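The left/right association can be pictured as a substring test on each filename. The following Python sketch is an illustration under assumptions, not the tool's implementation: the function name split_stereo is hypothetical, and treating a filename that contains both identifiers (or neither) as an error is a choice made here.

```python
def split_stereo(filenames, left_id="left", right_id="right"):
    """Split matched filenames into left and right camera streams.

    Hypothetical helper; each stream keeps the lexical filename order.
    """
    left, right = [], []
    for name in sorted(filenames):
        has_left = left_id in name
        has_right = right_id in name
        if has_left == has_right:
            # Assumption of this sketch: ambiguous names are rejected.
            raise ValueError(f"cannot assign {name!r} to a camera")
        (left if has_left else right).append(name)
    return left, right


names = [
    "image_right_001.png",
    "image_left_000.png",
    "image_right_000.png",
    "image_left_001.png",
]
left, right = split_stereo(names)
print(left)   # ['image_left_000.png', 'image_left_001.png']
print(right)  # ['image_right_000.png', 'image_right_001.png']
```

This also shows why the two identifiers must be distinct strings: otherwise no filename could be assigned to exactly one camera.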

rosbag

The first portion of the URI points to the location of the bag file on disk. A topic query parameter defines which image topic in the rosbag will be read. Note that only uncompressed image topics are supported at this time. Example:

rosbag://relative/path/to/log.bag?topic=/camera1

video

The URI should point to a video file on disk (.avi, .mpeg, etc.). No additional parameters are supported. The file format must be one supported by the version of OpenCV shipped with the Linux distribution.

video:///absolute/path/to/video.avi

Actions

Actions are commands which can be layered on top of raw image sources. These provide basic pre-processing functionality for manipulating image streams.

The following actions are currently supported (the examples below use stereo and blur):

Additional Examples

Create a stereo stream out of two monocular camera sources:

stereo(uvc:///dev/video0?fps=10, uvc:///dev/video1?fps=10)

Create a stereo stream as above, but enforce frame timestamp synchronization to within 1 ms; if the streams fail to synchronize for more than 10 frames, the sequence reports a failure:

stereo<desync_sec=0.001, skip_max=10>(uvc:///dev/video0?fps=10, uvc:///dev/video1?fps=10)
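One way to picture the two options is the Python sketch below. It is not the actual synchronization logic, which this page does not describe. It assumes frames are paired when their timestamps differ by at most desync_sec, that the older frame is dropped otherwise, and that skip_max bounds the number of consecutive drops. The function name pair_frames is hypothetical.

```python
def pair_frames(left, right, desync_sec=0.001, skip_max=10):
    """Pair two ascending lists of timestamps (in seconds).

    Hypothetical sketch of one possible synchronization policy.
    """
    pairs = []
    i = j = skipped = 0
    while i < len(left) and j < len(right):
        delta = left[i] - right[j]
        if abs(delta) <= desync_sec:
            pairs.append((left[i], right[j]))
            i += 1
            j += 1
            skipped = 0  # a successful pair resets the skip counter
            continue
        skipped += 1
        if skipped > skip_max:
            raise RuntimeError("stereo streams failed to synchronize")
        # Drop whichever frame is older and try again.
        if delta < 0:
            i += 1
        else:
            j += 1
    return pairs


left = [0.0, 0.1, 0.2]
right = [0.0004, 0.05, 0.1003, 0.2002]
print(len(pair_frames(left, right)))  # 3 (the frame at 0.05 is dropped)
```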

Blur a stream from an aravis camera source with a 0.75 pixel Gaussian kernel:

blur<sigma=0.75>(aravis://FLIR-1E10015FA3A4-015FA3A4?fps=30&tare_clock)

In this example FLIR-1E10015FA3A4-015FA3A4 is the aravis camera id.
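The examples above share a common shape: an action name, optional options in angle brackets, and one or more inputs in parentheses. The Python sketch below parses that shape. The grammar is inferred from these examples only and may not cover everything the real parser accepts; the function names are hypothetical, and allowing one action to wrap another is an assumption based on the statement that actions can be layered.

```python
import re

# name, optional <options>, then (inputs); inferred from the examples only.
_ACTION = re.compile(r"^\s*(\w+)\s*(?:<([^>]*)>)?\s*\((.*)\)\s*$", re.S)


def split_top_level(text):
    """Split on commas that are not nested inside (...) or <...>."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch in "(<":
            depth += 1
        elif ch in ")>":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(text[start:i].strip())
            start = i + 1
    parts.append(text[start:].strip())
    return parts


def parse_descriptor(text):
    """Return a nested dict for an action, or the URI string for a source."""
    match = _ACTION.match(text)
    if match is None:
        return text.strip()  # a plain image source URI
    name, raw_options, raw_inputs = match.groups()
    options = {}
    for item in (raw_options or "").split(","):
        if item.strip():
            key, _, value = item.partition("=")
            options[key.strip()] = value.strip()
    inputs = [parse_descriptor(part) for part in split_top_level(raw_inputs)]
    return {"action": name, "options": options, "inputs": inputs}


parsed = parse_descriptor(
    "stereo<desync_sec=0.001, skip_max=10>"
    "(uvc:///dev/video0?fps=10, uvc:///dev/video1?fps=10)"
)
print(parsed["action"], parsed["options"])
# stereo {'desync_sec': '0.001', 'skip_max': '10'}
```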