Formats

Delivered to your pipeline.

Format-native means training-ready on arrival — no conversion step, no preprocessing overhead on your side. Format is specified and agreed at scoping, before capture begins.

Supported formats

Three native formats, one sample structure

We support the primary standards used in Physical AI research and development.

LeRobot v3.0

Hugging Face ecosystem · PyTorch-native

The Hugging Face LeRobot format is the emerging standard for robot learning datasets. Load directly with the datasets library — no custom data loaders required. Compatible with the full HF ecosystem: hub upload, versioning, model training pipelines.

LeRobot sample structure · illustrative

from datasets import load_dataset

ds = load_dataset("gaitlabs/kitchen-manipulation-001")

# Frame structure (one sample per frame within an episode):
{
  "observation.image":       tensor(H, W, 3),     # RGB frame
  "observation.depth":       tensor(H, W),         # metric depth
  "observation.state":       tensor(14),            # joint positions
  "annotation.hand_pose":    tensor(42),            # MANO keypoints
  "annotation.segmentation": tensor(H, W),         # instance mask
  "annotation.qa_score":     float,                # clip-level score
  "annotation.contact":      int,                  # 0=none 1=pre 2=contact 3=grasp
  "action":                  tensor(7)
}
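As a minimal sketch of how the per-frame annotation fields above might be used downstream, the snippet below picks out grasp frames that pass a quality threshold. It works on plain dictionaries rather than a loaded dataset; the `ContactPhase` enum, the `grasp_frames` helper, and the 0.9 threshold are illustrative assumptions, not part of any delivered schema or library.

```python
from enum import IntEnum


class ContactPhase(IntEnum):
    """Contact codes as listed in the sample structure: 0=none 1=pre 2=contact 3=grasp."""
    NONE = 0
    PRE = 1
    CONTACT = 2
    GRASP = 3


def grasp_frames(frames, min_qa=0.9):
    """Return indices of frames labelled GRASP whose clip-level QA score is >= min_qa.

    `frames` is any iterable of dicts keyed like the illustrative sample above.
    The threshold is a hypothetical default, not a recommended value.
    """
    return [
        i
        for i, frame in enumerate(frames)
        if ContactPhase(frame["annotation.contact"]) is ContactPhase.GRASP
        and frame["annotation.qa_score"] >= min_qa
    ]


# Hand-written stand-in frames (only the two fields the filter reads).
frames = [
    {"annotation.contact": 0, "annotation.qa_score": 0.94},
    {"annotation.contact": 2, "annotation.qa_score": 0.94},
    {"annotation.contact": 3, "annotation.qa_score": 0.94},
    {"annotation.contact": 3, "annotation.qa_score": 0.71},
]
print(grasp_frames(frames))  # [2]
```

The same predicate could be applied to a loaded dataset, since it only reads two keys from each frame.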

RLDS · TF-native

TensorFlow · JAX · Reverb-compatible

RLDS (Reinforcement Learning Datasets) is designed for TensorFlow and JAX training pipelines. Native integration with tf.data, Reverb replay buffers, and distributed training. Used by major embodied AI research groups for large-scale pretraining.

RLDS sample structure · illustrative

import tensorflow_datasets as tfds

ds = tfds.load("gaitlabs_kitchen_001")

# Episode metadata:
{
  "episode_id":           str,
  "task":                 str,    # "manipulation/pick_place"
  "embodiment":           str,    # "custom_rig_v2"
  "environment":          str,    # "kitchen"
  "qa_score":             float,  # 0.94
  "n_frames":             int,
  "annotation_layers":    list,   # ["depth", "pose", "seg", "contact"]
  "rights_cleared":       bool,
  "gdpr_compliant":       bool
}
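The episode metadata above lends itself to simple pre-training filters. The following is a rough sketch, assuming episodes are available as plain dictionaries with the illustrative keys shown; `select_episodes`, its default threshold, and the example records are hypothetical and only demonstrate the idea.

```python
def select_episodes(episodes, min_qa=0.9, required_layers=("depth", "contact")):
    """Return the IDs of episodes that are cleared for use and meet the filters.

    An episode is kept when it is rights-cleared, flagged GDPR-compliant,
    scores at least `min_qa`, and carries every layer in `required_layers`.
    """
    required = set(required_layers)
    return [
        ep["episode_id"]
        for ep in episodes
        if ep["rights_cleared"]
        and ep["gdpr_compliant"]
        and ep["qa_score"] >= min_qa
        and required.issubset(ep["annotation_layers"])
    ]


# Hand-written example records using the illustrative metadata keys.
episodes = [
    {"episode_id": "ep_a", "qa_score": 0.94, "rights_cleared": True,
     "gdpr_compliant": True, "annotation_layers": ["depth", "pose", "seg", "contact"]},
    {"episode_id": "ep_b", "qa_score": 0.82, "rights_cleared": True,
     "gdpr_compliant": True, "annotation_layers": ["depth", "pose", "seg", "contact"]},
    {"episode_id": "ep_c", "qa_score": 0.97, "rights_cleared": True,
     "gdpr_compliant": True, "annotation_layers": ["depth", "pose"]},
]
print(select_episodes(episodes))  # ['ep_a']
```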

Open X-Embodiment · OXE

Cross-embodiment training · VLA pretraining

The Open X-Embodiment format enables training across multiple robot embodiments and data sources. Designed for foundation model pretraining — combine GaitLabs egocentric human data with robot demonstrations for richer pretraining signal.

Open X-Embodiment sample structure · illustrative

{
  "dataset_name":    "gaitlabs_kitchen_001",
  "embodiment":      "human",
  "modality":        ["vision", "depth", "proprioception"],
  "task_type":       "manipulation",
  "language_instruction": "pick up the red cup",
  "license":         "gaitlabs-commercial-v1",
  "format":          "open_x_embodiment",
  "annotation_layers": {
    "depth":         "per_frame_metric",
    "hand_pose":     "mano_42kp",
    "segmentation":  "instance_mask",
    "contact":       "phase_label"
  }
}
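One common way to combine human and robot data is to draw each training example from a source chosen by a mixture weight. The sketch below shows that idea with the standard library only. The 3:1 weighting, the `robot_demos_placeholder` dataset name, and both helper functions are assumptions made for illustration; actual mixture ratios depend on your pretraining recipe.

```python
import random


def normalize_weights(weights):
    """Scale non-negative source weights so they sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one source needs a positive weight")
    return {name: w / total for name, w in weights.items()}


def sample_sources(weights, n, seed=0):
    """Pick the source dataset for each of `n` examples, proportional to weight."""
    probs = normalize_weights(weights)
    rng = random.Random(seed)  # seeded so a run can be reproduced
    names = list(probs)
    return rng.choices(names, weights=[probs[k] for k in names], k=n)


# Hypothetical mixture: human egocentric data weighted 3:1 against robot demos.
mixture = {"gaitlabs_kitchen_001": 3.0, "robot_demos_placeholder": 1.0}
probs = normalize_weights(mixture)
batch_sources = sample_sources(mixture, n=8)
print(probs["gaitlabs_kitchen_001"])  # 0.75
```

Each entry in `batch_sources` names the dataset the corresponding example would be read from.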

Additional formats

HDF5, zarr, WebDataset, and custom

HDF5

Hierarchical file format for large-scale numerical data. Supported for teams with existing HDF5 pipelines.

zarr

Cloud-native chunked array storage. Efficient for distributed access and large annotation arrays (masks, depth volumes).

WebDataset

Tar-based streaming format for high-throughput data loading. Suitable for very large datasets and distributed training.
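To make the tar-based layout concrete: in the WebDataset convention, files in a shard that share a basename form one sample, and the extension names the field. The sketch below reproduces that layout with Python's standard `tarfile` module rather than the `webdataset` library itself. The sample keys, field names, and payloads are placeholders.

```python
import io
import tarfile


def write_shard(samples):
    """Pack {key: {extension: bytes}} into an in-memory tar shard."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for key, fields in samples.items():
            for ext, payload in fields.items():
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_shard(data):
    """Group tar members back into samples by shared basename."""
    samples = {}
    with tarfile.open(fileobj=io.BytesIO(data), mode="r") as tar:
        for member in tar:
            # Everything before the first dot is the sample key.
            key, ext = member.name.split(".", 1)
            samples.setdefault(key, {})[ext] = tar.extractfile(member).read()
    return samples


# Placeholder payloads standing in for encoded frames and labels.
samples = {
    "ep0001_000000": {"rgb.bin": b"\x00\x01", "contact.txt": b"0"},
    "ep0001_000001": {"rgb.bin": b"\x02\x03", "contact.txt": b"3"},
}
shard = write_shard(samples)
restored = read_shard(shard)
print(sorted(restored))  # ['ep0001_000000', 'ep0001_000001']
```

Because members are read sequentially, a shard like this can be consumed as a stream without random access, which is what makes the format suit high-throughput loading.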

Custom

Need a bespoke schema, field naming, or packaging? We agree the format spec at scoping. If your pipeline can train on it, we can deliver it.

Don’t see your format? Custom schemas, field naming, and packaging are agreed at scoping. If your training pipeline can consume it, we can deliver it. Tell us what you need →

Format-native means the dataset arrives in the form your pipeline expects. No conversion step. No preprocessing overhead. Just load and train.

Ready to specify your dataset?

Format, volume, and annotation requirements are all agreed at scoping.

Request a sample