Step 1: Understanding the Data to Provide
dataspan.ai supports all commonly used image formats, as well as the most popular associated image metadata formats (annotation files) that describe bounding boxes and segmentation masks. Here’s a quick description of the data to supply and the formats supported by dataspan.ai.
Foreground Dataset: The "What"
The foreground Dataset contains the objects or elements to be inserted into the background images. For instance, if you're training a model to detect defects on lumber, your foreground Dataset might include images of resin blemishes and knots. Typically, you should provide 5 to 15 images of each defect type. These images are usually part of an object detection or segmentation Dataset, where each defect is marked with a bounding box or segmentation mask.
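Before importing, it can help to check how many images you actually have per defect type. The following is a minimal sketch, assuming the annotations are in COCO JSON and already loaded into a Python dictionary. It is not part of dataspan.ai; the function names `images_per_category` and `outside_typical_range` are hypothetical, and the 5 to 15 range is the typical guidance above rather than a hard limit.

```python
from collections import defaultdict


def images_per_category(coco: dict) -> dict:
    """Count distinct images containing each category in a COCO-style dict."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    images_seen = defaultdict(set)
    for ann in coco["annotations"]:
        # Several annotations in one image still count as a single image.
        images_seen[ann["category_id"]].add(ann["image_id"])
    return {names[cid]: len(images_seen.get(cid, ())) for cid in names}


def outside_typical_range(counts: dict, low: int = 5, high: int = 15) -> dict:
    """Return the categories whose image count falls outside [low, high]."""
    return {name: n for name, n in counts.items() if not low <= n <= high}


# Hypothetical example data: two defect types on lumber.
example = {
    "categories": [{"id": 1, "name": "knot"}, {"id": 2, "name": "resin"}],
    "annotations": [
        {"image_id": 1, "category_id": 1},
        {"image_id": 1, "category_id": 1},  # second knot in the same image
        {"image_id": 2, "category_id": 1},
        {"image_id": 3, "category_id": 1},
        {"image_id": 4, "category_id": 1},
        {"image_id": 5, "category_id": 1},
        {"image_id": 6, "category_id": 1},
        {"image_id": 1, "category_id": 2},
        {"image_id": 2, "category_id": 2},
    ],
}

counts = images_per_category(example)
flagged = outside_typical_range(counts)
print(counts)
print(flagged)
```

In this example the "resin" category is flagged because it appears in only two images, so it would be worth collecting a few more before import.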
Bounding Boxes and Segmentation Masks
These are essential for defining the precise location of the foreground objects (such as defects) within the provided images. dataspan.ai supports several popular formats for this metadata:
Pascal VOC XML: Suitable for bounding boxes.
YOLOv5: Supports bounding boxes and can also handle polygons in certain variants.
COCO JSON: This is the most versatile format, supporting segmentation masks, bounding boxes, and polygons.
Image Masks: dataspan.ai can also handle other common formats, such as grayscale or multi-class color-coded mask files that act as segmentation masks.
Background Dataset: The "Where"
The background Dataset consists of the images into which the foreground elements (such as defects) will be inserted. It should include a large number of images that represent the environment in which those objects appear. For example, if your Project involves detecting lumber defects, the background Dataset would consist of images of unblemished lumber.
Tip: Import the images of the foreground objects (the What to insert) as a separate Dataset from the background canvases (the Where to insert). The background images can themselves be organized into separate folders, for example one each for oak, pine, and rosewood.