Export FiftyOne Datasets

Key Directives

ALWAYS follow these rules:

1. Load and understand the dataset first

set_context(dataset_name="my-dataset")
dataset_summary(name="my-dataset")

2. Confirm export settings with user

Before exporting, present:

Dataset name and sample count
Available label fields and their types
Proposed export format
Export directory path

3. Match format to label types

Different formats support different label types:

| Format | Label Types |
|--------|-------------|
| COCO | detections, segmentations, keypoints |
| YOLO (v4, v5) | detections |
| VOC | detections |
| CVAT | classifications, detections, polylines, keypoints |
| CSV | all (custom fields) |
| Image Classification Directory Tree | classification |
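The format table can be encoded as a small lookup so that a proposed format is checked before the user is asked to confirm. This is only a sketch: `FORMAT_LABEL_TYPES` and `formats_for` are hypothetical helper names, not part of FiftyOne, and the entries simply mirror the table.

```python
# Hypothetical helper: mirrors the format / label-type table in this section.
FORMAT_LABEL_TYPES = {
    "COCO": {"detections", "segmentations", "keypoints"},
    "YOLOv4": {"detections"},
    "YOLOv5": {"detections"},
    "VOC": {"detections"},
    "CVAT": {"classifications", "detections", "polylines", "keypoints"},
    "Image Classification Directory Tree": {"classification"},
}


def formats_for(label_type: str) -> list[str]:
    """Return the formats from the table that list the given label type.

    CSV is not in the lookup because the table marks it as accepting all
    (custom) fields, so it is always appended at the end.
    """
    matches = [
        name for name, types in FORMAT_LABEL_TYPES.items() if label_type in types
    ]
    return sorted(matches) + ["CSV"]
```

For example, `formats_for("keypoints")` yields COCO, CVAT, and CSV, which can be offered to the user as candidates.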

4. Use absolute paths

Always use absolute paths for export directories:

params={
    "export_dir": {"absolute_path": "/path/to/export"}
}
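When the user supplies a relative or `~`-prefixed path, it can be normalized before it is placed in the params. A minimal sketch using only the standard library; `export_dir_param` is a hypothetical helper name.

```python
from pathlib import Path


def export_dir_param(path: str) -> dict:
    """Build an export_dir value that holds an absolute path.

    Expands "~" and resolves relative segments against the current working
    directory. The directory does not need to exist yet.
    """
    absolute = Path(path).expanduser().resolve()
    return {"absolute_path": str(absolute)}
```

The returned dict has the same shape as the `export_dir` value shown above.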

5. Warn about overwriting

Check whether the export directory exists before exporting. If it does, ask the user whether to overwrite it.
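One way to run this check is a small standard-library helper. This is a sketch under the assumption that the export target is on a filesystem visible to the Python process; `needs_overwrite_confirmation` is a hypothetical name.

```python
from pathlib import Path


def needs_overwrite_confirmation(export_dir: str) -> bool:
    """Return True if exporting to export_dir could clobber existing content.

    A non-empty directory or an existing file at that path both count.
    An empty or missing directory does not.
    """
    path = Path(export_dir)
    if path.is_dir():
        return any(path.iterdir())
    return path.exists()
```

If it returns True, pause and ask the user before proceeding.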

Complete Workflow

Step 1: Load Dataset and Understand Content

# Set context
set_context(dataset_name="my-dataset")

# Get dataset summary to see fields and label types
dataset_summary(name="my-dataset")

Identify:

Total sample count
Media type (images, videos, point clouds)
Available label fields and their types (Detections, Classifications, etc.)

Step 2: Get Export Operator Schema

# Discover export parameters dynamically
get_operator_schema(operator_uri="@voxel51/io/export_samples")

Step 3: Present Export Options to User

Before exporting, confirm with the user:

Dataset: my-dataset (5,000 samples)
Media type: image

Available label fields:
  - ground_truth (Detections)
  - predictions (Detections)

Export options:
  - Format: COCO (recommended for detections)
  - Export directory: /path/to/export
  - Label field: ground_truth

Proceed with export?

Step 4: Execute Export

Export media and labels:

execute_operator(
    operator_uri="@voxel51/io/export_samples",
    params={
        "export_type": "MEDIA_AND_LABELS",
        "dataset_type": "COCO",
        "export_dir": {"absolute_path": "/path/to/export"},
        "label_field": "ground_truth"
    }
)

Export labels only (no media copy):

execute_operator(
    operator_uri="@voxel51/io/export_samples",
    params={
        "export_type": "LABELS_ONLY",
        "dataset_type": "COCO",
        "labels_path": {"absolute_path": "/path/to/labels.json"},
        "label_field": "ground_truth"
    }
)

Export media only (no labels):

execute_operator(
    operator_uri="@voxel51/io/export_samples",
    params={
        "export_type": "MEDIA_ONLY",
        "export_dir": {"absolute_path": "/path/to/media"}
    }
)

Step 5: Verify Export

After export, verify the output:

ls -la /path/to/export

Report the exported file count and directory structure to the user.
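For a count that is easier to report than raw `ls` output, the export directory can be summarized by file extension. A minimal standard-library sketch; `summarize_export` is a hypothetical helper name.

```python
from collections import Counter
from pathlib import Path


def summarize_export(export_dir: str) -> Counter:
    """Count exported files by lowercase extension, walking subdirectories."""
    counts = Counter()
    for entry in Path(export_dir).rglob("*"):
        if entry.is_file():
            counts[entry.suffix.lower() or "(no extension)"] += 1
    return counts
```

Comparing the media-file count against the dataset's sample count is a quick sanity check.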

Supported Export Formats

Detection Formats

| Format | dataset_type Value | Label Types | Labels-Only |
|--------|----------------------|-------------|-------------|
| COCO | "COCO" | detections, segmentations, keypoints | Yes |
| YOLOv4 | "YOLOv4" | detections | Yes |
| YOLOv5 | "YOLOv5" | detections | No |
| VOC | "VOC" | detections | Yes |
| KITTI | "KITTI" | detections | Yes |
| CVAT Image | "CVAT Image" | classifications, detections, polylines, keypoints | Yes |
| CVAT Video | "CVAT Video" | frame labels | Yes |
| TF Object Detection | "TF Object Detection" | detections | No |

Classification Formats

| Format | dataset_type Value | Media Type | Labels-Only |
|--------|----------------------|------------|-------------|
| Image Classification Directory Tree | "Image Classification Directory Tree" | image | No |
| Video Classification Directory Tree | "Video Classification Directory Tree" | video | No |
| TF Image Classification | "TF Image Classification" | image | No |

Segmentation Formats

| Format | dataset_type Value | Label Types | Labels-Only |
|--------|----------------------|-------------|-------------|
| Image Segmentation | "Image Segmentation" | segmentation | Yes |

General Formats

| Format | dataset_type Value | Best For | Labels-Only |
|--------|----------------------|----------|-------------|
| CSV | "CSV" | Custom fields, spreadsheet analysis | Yes |
| GeoJSON | "GeoJSON" | Geolocation data | Yes |
| FiftyOne Dataset | "FiftyOne Dataset" | Full dataset backup with all metadata | Yes |

Note: Formats with "Labels-Only: No" require export_type: "MEDIA_AND_LABELS" (cannot export labels without media).
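The note above can be enforced before calling the operator. The sketch below hard-codes the formats that the tables in this section mark as "Labels-Only: No"; `MEDIA_REQUIRED_FORMATS` and `check_export_type` are hypothetical names, and the set should be rechecked against the operator schema if the tables change.

```python
# Formats marked "Labels-Only: No" in the format tables of this section.
MEDIA_REQUIRED_FORMATS = {
    "YOLOv5",
    "TF Object Detection",
    "Image Classification Directory Tree",
    "Video Classification Directory Tree",
    "TF Image Classification",
}


def check_export_type(dataset_type: str, export_type: str) -> None:
    """Raise ValueError if LABELS_ONLY is requested for a format that needs media."""
    if export_type == "LABELS_ONLY" and dataset_type in MEDIA_REQUIRED_FORMATS:
        raise ValueError(
            f'{dataset_type} cannot be exported as LABELS_ONLY; '
            'use "MEDIA_AND_LABELS" instead'
        )
```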

Export Type Options

| export_type Value | Description |
|---------------------|-------------|
| "MEDIA_AND_LABELS" | Export both media files and labels |
| "LABELS_ONLY" | Export labels only (use labels_path instead of export_dir) |
| "MEDIA_ONLY" | Export media files only (no labels) |
| "FILEPATHS_ONLY" | Export CSV with filepaths only |

Target Options

Export from different sources:

| target Value | Description |
|----------------|-------------|
| "DATASET" | Export entire dataset (default) |
| "CURRENT_VIEW" | Export current filtered view |
| "SELECTED_SAMPLES" | Export selected samples only |
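The export type and target options combine into a single params dict. The sketch below assembles one using only the keys shown in this document; `build_export_params` is a hypothetical helper, and it deliberately rejects `FILEPATHS_ONLY` because this document does not show which path parameter that export type expects.

```python
from typing import Optional

# Which destination key each export type uses, per the examples in this document.
PATH_KEYS = {
    "MEDIA_AND_LABELS": "export_dir",
    "MEDIA_ONLY": "export_dir",
    "LABELS_ONLY": "labels_path",
}


def build_export_params(
    export_type: str,
    path: str,
    dataset_type: Optional[str] = None,
    label_field: Optional[str] = None,
    target: str = "DATASET",
) -> dict:
    """Assemble a params dict for the export operator."""
    if export_type not in PATH_KEYS:
        raise ValueError(f"Export type not covered by this sketch: {export_type}")
    params = {
        "target": target,
        "export_type": export_type,
        PATH_KEYS[export_type]: {"absolute_path": path},
    }
    if dataset_type is not None:
        params["dataset_type"] = dataset_type
    if label_field is not None:
        params["label_field"] = label_field
    return params
```

The result can be passed as `params` to `execute_operator`, after confirming the key names with `get_operator_schema`.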

Common Use Cases

Use Case 1: Export to COCO Format

For training with frameworks that use COCO format:

set_context(dataset_name="my-dataset")

execute_operator(
    operator_uri="@voxel51/io/export_samples",
    params={
        "export_type": "MEDIA_AND_LABELS",
        "dataset_type": "COCO",
        "export_dir": {"absolute_path": "/path/to/coco_export"},
        "label_field": "ground_truth"
    }
)

Output structure:

coco_export/
├── data/
│   ├── image1.jpg
│   └── image2.jpg
└── labels.json
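To verify a COCO export, the top-level lists in `labels.json` can be counted with the standard library. A minimal sketch, assuming the layout shown above; `coco_counts` is a hypothetical helper name.

```python
import json
from pathlib import Path


def coco_counts(export_dir: str) -> dict:
    """Count images, annotations, and categories in <export_dir>/labels.json."""
    with open(Path(export_dir) / "labels.json", encoding="utf-8") as f:
        coco = json.load(f)
    return {
        key: len(coco.get(key, []))
        for key in ("images", "annotations", "categories")
    }
```

The `images` count should match the number of exported samples.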

Use Case 2: Export to YOLO Format

For training YOLOv5/v8 models:

set_context(dataset_name="my-dataset")

execute_operator(
    operator_uri="@voxel51/io/export_samples",
    params={
        "export_type": "MEDIA_AND_LABELS",
        "dataset_type": "YOLOv5",
        "export_dir": {"absolute_path": "/path/to/yolo_export"},
        "label_field": "ground_truth"
    }
)

Output structure:

yolo_export/
├── images/
│   └── train/
│       └── image1.jpg
├── labels/
│   └── train/
│       └── image1.txt
└── dataset.yaml
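A quick check on this layout is to list images that have no matching label file. The sketch assumes the `images/<split>` and `labels/<split>` layout shown above; `images_missing_labels` is a hypothetical helper name. An image without objects may legitimately lack a label file, so treat the result as a prompt for review rather than an error.

```python
from pathlib import Path


def images_missing_labels(export_dir: str, split: str = "train") -> list[str]:
    """List image file names in images/<split> with no .txt in labels/<split>."""
    root = Path(export_dir)
    label_stems = {p.stem for p in (root / "labels" / split).glob("*.txt")}
    return sorted(
        p.name
        for p in (root / "images" / split).iterdir()
        if p.is_file() and p.stem not in label_stems
    )
```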

Use Case 3: Export Filtered View

Export only a subset of samples:

# Set context
set_context(dataset_name="my-dataset")

# Filter samples in the App
set_view(tags=["validated"])

# Export the filtered view
execute_operator(
    operator_uri="@voxel51/io/export_samples",
    params={
        "target": "CURRENT_VIEW",
        "export_type": "MEDIA_AND_LABELS",
        "dataset_type": "COCO",
        "export_dir": {"absolute_path": "/path/to/validated_export"},
        "label_field": "ground_truth"
    }
)

Use Case 4: Export Labels Only

When media should stay in place:

set_context(dataset_name="my-dataset")

execute_operator(
    operator_uri="@voxel51/io/export_samples",
    params={
        "export_type": "LABELS_ONLY",
        "dataset_type": "COCO",
        "labels_path": {"absolute_path": "/path/to/annotations.json"},
        "label_field": "ground_truth"
    }
)

Use Case 5: Export for Classification Training

For image classification datasets:

set_context(dataset_name="my-classification-dataset")

execute_operator(
    operator_uri="@voxel51/io/export_samples",
    params={
        "export_type": "MEDIA_AND_LABELS",
        "dataset_type": "Image Classification Directory Tree",
        "export_dir": {"absolute_path": "/path/to/classification_export"},
        "label_field": "ground_truth"
    }
)

Output structure:

classification_export/
├── cat/
│   ├── cat1.jpg
│   └── cat2.jpg
└── dog/
    ├── dog1.jpg
    └── dog2.jpg
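With one folder per class, the class balance of the export can be read straight from the directory tree. A minimal standard-library sketch; `samples_per_class` is a hypothetical helper name.

```python
from pathlib import Path


def samples_per_class(export_dir: str) -> dict[str, int]:
    """Map each class subdirectory name to the number of files it contains."""
    return {
        class_dir.name: sum(1 for f in class_dir.iterdir() if f.is_file())
        for class_dir in sorted(Path(export_dir).iterdir())
        if class_dir.is_dir()
    }
```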

Use Case 6: Export to CSV

For analysis in spreadsheets:

set_context(dataset_name="my-dataset")

execute_operator(
    operator_uri="@voxel51/io/export_samples",
    params={
        "export_type": "LABELS_ONLY",
        "dataset_type": "CSV",
        "labels_path": {"absolute_path": "/path/to/data.csv"},
        "csv_fields": ["filepath", "ground_truth.detections.label"]
    }
)
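After a CSV export, the header and first rows can be previewed before handing the file to the user. This sketch assumes the exported file starts with a header row; `csv_preview` is a hypothetical helper name, and the exact column names depend on the export.

```python
import csv
from itertools import islice


def csv_preview(csv_path: str, limit: int = 5) -> tuple[list[str], list[dict]]:
    """Return the header and up to `limit` rows of a CSV file."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        rows = list(islice(reader, limit))
        return list(reader.fieldnames or []), rows
```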

Use Case 7: Export FiftyOne Dataset (Full Backup)

For complete dataset backup including all metadata:

set_context(dataset_name="my-dataset")

execute_operator(
    operator_uri="@voxel51/io/export_samples",
    params={
        "export_type": "MEDIA_AND_LABELS",
        "dataset_type": "FiftyOne Dataset",
        "export_dir": {"absolute_path": "/path/to/backup"}
    }
)

Output structure:

backup/
├── metadata.json
├── samples.json
├── data/
│   └── ...
├── annotations/
├── brain/
└── evaluations/

Python SDK Alternative

For more control, guide users to use the Python SDK directly:

import fiftyone as fo
import fiftyone.types as fot

# Load dataset
dataset = fo.load_dataset("my-dataset")

# Export to COCO format
dataset.export(
    export_dir="/path/to/export",
    dataset_type=fot.COCODetectionDataset,
    label_field="ground_truth",
)

# Export labels only
dataset.export(
    labels_path="/path/to/labels.json",
    dataset_type=fot.COCODetectionDataset,
    label_field="ground_truth",
)

# Export a filtered view
view = dataset.match_tags("validated")
view.export(
    export_dir="/path/to/validated",
    dataset_type=fot.YOLOv5Dataset,
    label_field="ground_truth",
)

Python SDK dataset types:

fot.COCODetectionDataset - COCO format
fot.YOLOv4Dataset - YOLOv4 format
fot.YOLOv5Dataset - YOLOv5 format
fot.VOCDetectionDataset - Pascal VOC format
fot.KITTIDetectionDataset - KITTI format
fot.CVATImageDataset - CVAT image format
fot.CVATVideoDataset - CVAT video format
fot.TFObjectDetectionDataset - TensorFlow Object Detection format
fot.ImageClassificationDirectoryTree - Classification folder structure
fot.VideoClassificationDirectoryTree - Video classification folders
fot.TFImageClassificationDataset - TensorFlow classification format
fot.ImageSegmentationDirectory - Segmentation masks
fot.CSVDataset - CSV format
fot.GeoJSONDataset - GeoJSON format
fot.FiftyOneDataset - Native FiftyOne format
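When moving between the operator and the SDK, the operator's `dataset_type` strings can be mapped to these class names. The pairing below is inferred by matching the format tables in this document to the list above, so treat it as an assumption; `SDK_CLASS_NAMES` and `sdk_class_name` are hypothetical names. The sketch stores class names as strings so that it runs without FiftyOne installed.

```python
# Operator dataset_type strings mapped to fiftyone.types class names.
SDK_CLASS_NAMES = {
    "COCO": "COCODetectionDataset",
    "YOLOv4": "YOLOv4Dataset",
    "YOLOv5": "YOLOv5Dataset",
    "VOC": "VOCDetectionDataset",
    "KITTI": "KITTIDetectionDataset",
    "CVAT Image": "CVATImageDataset",
    "CVAT Video": "CVATVideoDataset",
    "TF Object Detection": "TFObjectDetectionDataset",
    "Image Classification Directory Tree": "ImageClassificationDirectoryTree",
    "Video Classification Directory Tree": "VideoClassificationDirectoryTree",
    "TF Image Classification": "TFImageClassificationDataset",
    "Image Segmentation": "ImageSegmentationDirectory",
    "CSV": "CSVDataset",
    "GeoJSON": "GeoJSONDataset",
    "FiftyOne Dataset": "FiftyOneDataset",
}


def sdk_class_name(dataset_type: str) -> str:
    """Look up the fiftyone.types class name for an operator dataset_type."""
    try:
        return SDK_CLASS_NAMES[dataset_type]
    except KeyError:
        raise ValueError(f"Unknown dataset_type: {dataset_type!r}") from None
```

With FiftyOne installed, `getattr(fot, sdk_class_name("COCO"))` resolves the name to the class to pass as `dataset_type` in `dataset.export()`.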

Exporting to Hugging Face Hub

For complete HF Hub export documentation, see HF-HUB-EXPORT.md.

Quick reference:

| Method | Use Case |
|--------|----------|
| push_to_hub() | Personal accounts, simple upload |
| Manual upload | Organizations, private org repos |

Quick start:

import fiftyone as fo
from fiftyone.utils.huggingface import push_to_hub

# Load the dataset to upload
dataset = fo.load_dataset("my-dataset")

# Personal account
push_to_hub(dataset, repo_name="my-dataset", private=False)

# With options
push_to_hub(
    dataset,
    repo_name="my-dataset",
    description="My dataset description",
    license="apache-2.0",
    private=True,
)

IMPORTANT: Always generate a dataset card and get the user's approval for it before uploading. See HF-HUB-EXPORT.md for complete documentation, including authentication setup, dataset card workflow, parameters reference, use cases, and troubleshooting.

Troubleshooting

Error: "Export directory already exists"

Add "overwrite": true to params
Or specify a different export directory

Error: "Label field not found"

Use dataset_summary() to see available label fields
Verify the field name spelling

Error: "Unsupported label type for format"

Check that the export format supports your label type:
COCO: detections, segmentations, keypoints
YOLO: detections only
Classification formats: classification labels only

Error: "Permission denied"

Verify write permissions for the export directory
Check that the parent directory exists

Export is slow

Large datasets take time; consider exporting a view first
Export to local disk rather than network drives
For labels only, use LABELS_ONLY export type

Best Practices

Understand your data first - Use dataset_summary() to know what fields and label types exist
Match format to purpose - Use COCO/YOLO for training, CSV for analysis, FiftyOne Dataset for backups
Confirm with user - Present export settings before executing
Export filtered views - Only export what's needed rather than entire datasets
Verify after export - Check exported file counts match expectations
Use labels_path for LABELS_ONLY - When exporting labels only, use labels_path, not export_dir

Resources

FiftyOne Export Guide
Supported Export Formats
FiftyOne I/O Plugin
FiftyOne Hugging Face Integration
Hugging Face Hub Documentation

