Data quality

Quality you can verify.

Six annotation layers, multi-stage QA, and full provenance on every dataset. Here’s exactly how the pipeline works.

Annotation layers

More than video

Every clip ships with six structured annotation layers — all inter-referenced, all QA-verified.

RGB Video

Multi-view + egocentric, calibrated intrinsics

Why it matters

Foundation layer — all other annotations reference the RGB frames. Calibrated camera models enable metric reconstruction.

How it’s produced

Captured with egocentric rigs at 24–60 fps. Camera intrinsics and extrinsics calibrated per device. Multi-view setups provide stereo depth reference.

Depth Maps

Per-frame metric depth, RGB-aligned

Why it matters

Spatial awareness for manipulation — grasp distance, object height, workspace geometry. Enables contact prediction and 3D trajectory learning.

How it’s produced

Depth estimation aligned to RGB frames using stereo or structured-light sensors. Metric scale. Holes and artifacts flagged in QA.
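
As an illustration of why RGB-aligned metric depth and calibrated intrinsics matter together, the sketch below back-projects a single pixel into a 3D point in the camera frame using the standard pinhole model. The `Intrinsics` container, the example focal lengths, and the convention that depth holes are stored as non-positive values are assumptions made for this example, not a description of the delivered file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole intrinsics in pixels (hypothetical container for this sketch)."""
    fx: float
    fy: float
    cx: float
    cy: float


def backproject(u: float, v: float, depth_m: float, k: Intrinsics):
    """Return (X, Y, Z) in metres in the camera frame, or None for a depth hole."""
    if depth_m <= 0:  # assumption: holes are encoded as non-positive depth
        return None
    x = (u - k.cx) * depth_m / k.fx
    y = (v - k.cy) * depth_m / k.fy
    return (x, y, depth_m)


# Example values are illustrative only.
k = Intrinsics(fx=600.0, fy=600.0, cx=320.0, cy=240.0)
point = backproject(380.0, 240.0, 0.5, k)  # pixel 60 px right of centre, 0.5 m away
hole = backproject(0.0, 0.0, 0.0, k)       # treated as a hole under the assumption above
```

Because the depth is metric, the result is in metres and can be used directly for quantities such as grasp distance or object height.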

Pose Estimation

Full-body + hand/wrist keypoints, SMPL-compatible

Why it matters

Human motion imitation requires accurate joint-level representation. Hand keypoints are critical for dexterous manipulation policies.

How it’s produced

Full-body pose via lifting from egocentric video. Hand pose via dedicated hand tracker (42 keypoints, MANO-compatible). Confidence scores per keypoint.

Segmentation Masks

Object + scene segmentation, temporally consistent

Why it matters

Object-centric training, affordance learning, and scene understanding. Temporal consistency enables tracking-based training objectives.

How it’s produced

Instance and semantic segmentation on each frame. Temporal propagation for consistency. Object IDs maintained across clips.

Contact Timing

Grasp initiation, release, and contact-state labels

Why it matters

Contact-rich manipulation requires knowing exactly when and how the hand contacts the object: the pre-contact, contact, grasp, and release phases.

How it’s produced

Annotated via a combination of hand-pose signals and reviewer marking. Each frame carries one contact-state label: none / pre-contact / contact / grasp / release.
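
Per-frame contact labels are often easier to use once collapsed into intervals. The following is a minimal sketch, assuming the labels arrive as a list of strings with one entry per frame; the five label values come from the list above, while the `ContactState` enum and `state_segments` helper are hypothetical names introduced for illustration.

```python
from enum import Enum
from itertools import groupby


class ContactState(Enum):
    NONE = "none"
    PRE_CONTACT = "pre-contact"
    CONTACT = "contact"
    GRASP = "grasp"
    RELEASE = "release"


def state_segments(labels):
    """Collapse per-frame labels into (state, first_frame, last_frame) runs."""
    segments = []
    frame = 0
    for state, run in groupby(labels):
        length = sum(1 for _ in run)
        segments.append((ContactState(state), frame, frame + length - 1))
        frame += length
    return segments


# Illustrative label sequence for a nine-frame clip.
labels = ["none", "none", "pre-contact", "contact",
          "grasp", "grasp", "grasp", "release", "none"]
segments = state_segments(labels)

# Frame ranges during which the object is held.
grasp_spans = [(start, end) for state, start, end in segments
               if state is ContactState.GRASP]
# grasp_spans == [(4, 6)]
```

Grasp initiation and release then fall out as the first and last frame of each grasp run.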

Structured Metadata

Environment, task, embodiment, QA score — per clip

Why it matters

Enables filtering, curriculum learning, and dataset management. Task-level labels connect clips to training objectives.

How it’s produced

Annotated at capture time and reviewed post-annotation. Labels: environment category, task type, embodiment type, QA score, capture date.
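
To show how per-clip metadata supports filtering, here is a small sketch that models the labels listed above as a record and selects clips by task type and minimum QA score. The field names, the `ClipMetadata` class, and all example values are assumptions for illustration; the actual delivery schema may differ.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ClipMetadata:
    """Hypothetical per-clip record mirroring the labels listed above."""
    clip_id: str
    environment: str
    task_type: str
    embodiment: str
    qa_score: float
    capture_date: date


def select_clips(clips, *, task_type: str, min_qa_score: float):
    """Keep clips of one task type whose QA score meets the minimum."""
    return [c for c in clips
            if c.task_type == task_type and c.qa_score >= min_qa_score]


# Illustrative records only.
clips = [
    ClipMetadata("clip_0001", "kitchen", "pouring", "human", 0.94, date(2024, 1, 15)),
    ClipMetadata("clip_0002", "kitchen", "pouring", "human", 0.81, date(2024, 1, 15)),
    ClipMetadata("clip_0003", "workshop", "assembly", "human", 0.97, date(2024, 1, 16)),
]
selected = select_clips(clips, task_type="pouring", min_qa_score=0.9)
selected_ids = [c.clip_id for c in selected]
```

The same pattern extends to curriculum learning, for example by sorting the selected clips by QA score or capture date.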

QA methodology

Four-stage review before delivery

No clip reaches delivery without passing automated checks, human review, and a scoring threshold.

Automated check

Frame completeness, blur detection, annotation file integrity, metadata schema validation. Clips failing automated checks are flagged before human review.
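
Two of these checks, frame completeness and metadata schema validation, can be sketched in a few lines. This is an illustrative example under stated assumptions, not the production checker: the required field names, their types, and the `automated_check` function are hypothetical, and blur detection and annotation file integrity are left out.

```python
# Assumed schema for this sketch: field name -> expected Python type.
REQUIRED_FIELDS = {
    "environment": str,
    "task_type": str,
    "embodiment": str,
    "capture_date": str,
}


def automated_check(metadata: dict, frame_indices: list, expected_frames: int) -> list:
    """Return a list of human-readable problems; an empty list means the clip passes."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in metadata:
            problems.append(f"missing metadata field: {name}")
        elif not isinstance(metadata[name], expected_type):
            problems.append(f"wrong type for metadata field: {name}")
    # Frame completeness: every index from 0 to expected_frames - 1 must be present.
    missing = sorted(set(range(expected_frames)) - set(frame_indices))
    if missing:
        problems.append(f"missing frames: {missing}")
    return problems


good = automated_check(
    {"environment": "kitchen", "task_type": "pouring",
     "embodiment": "human", "capture_date": "2024-01-15"},
    frame_indices=[0, 1, 2, 3],
    expected_frames=4,
)
bad = automated_check(
    {"environment": "kitchen", "task_type": "pouring", "capture_date": 20240115},
    frame_indices=[0, 1, 3],
    expected_frames=4,
)
```

Returning a list of problems rather than a single boolean lets a flagged clip carry its reasons into human review.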

Human review

Reviewer checks annotation accuracy, task completion, and content quality. Rejection criteria: incomplete task, misaligned annotations, out-of-spec content.

QA scoring

Each approved clip receives a per-layer accuracy score. Dataset-level QA score = weighted mean across layers and clips. Threshold enforced before delivery.
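
The weighted mean can be illustrated with a short sketch. It assumes that each clip's score is a weighted mean of its per-layer scores and that clips are weighted equally at the dataset level; the layer weights and scores below are made-up example values, and the real weighting is not specified here.

```python
def clip_score(layer_scores: dict, layer_weights: dict) -> float:
    """Weighted mean of one clip's per-layer accuracy scores."""
    total_weight = sum(layer_weights[layer] for layer in layer_scores)
    weighted_sum = sum(score * layer_weights[layer]
                       for layer, score in layer_scores.items())
    return weighted_sum / total_weight


def dataset_score(clips: list, layer_weights: dict) -> float:
    """Mean of clip scores (assumption: every clip carries equal weight)."""
    scores = [clip_score(c, layer_weights) for c in clips]
    return sum(scores) / len(scores)


# Hypothetical weights and scores for three of the six layers.
weights = {"depth": 1.0, "pose": 2.0, "segmentation": 1.0}
clip_a = {"depth": 0.90, "pose": 0.96, "segmentation": 0.94}
clip_b = {"depth": 1.00, "pose": 0.90, "segmentation": 0.80}

score_a = clip_score(clip_a, weights)               # (0.90 + 1.92 + 0.94) / 4
overall = dataset_score([clip_a, clip_b], weights)  # mean of the two clip scores
passes = overall >= 0.9                             # 0.9 is an example threshold only
```

Weighting a layer more heavily makes errors in that layer count more toward rejection, which is one way a scoping agreement could reflect the layers a training objective depends on most.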

Final report

Per-dataset QA report generated: clip count, rejection rate, per-layer scores, provenance summary. Delivered alongside the dataset.

Example QA score

0.94

Scale: 0.0 to 1.0, with the delivery threshold marked.

Clips scoring below the threshold are rejected before delivery. The exact threshold is agreed at scoping.

Provenance

Full audit trail on every dataset

Every dataset includes a provenance record — traceable, inspection-ready, no gaps.

Capture context

Capture date, environment category, and task type — recorded at time of capture.

Collector consent reference

Consent record ID for each collector. Verifiable without exposing personal identity.

Annotation pipeline version

Which annotation pipeline version processed each clip — reproducible and auditable.

QA record

Reviewer ID, review date, pass/fail outcome, and score for every reviewed clip.

Rights clearance status

Global rights clearance status per clip — confirmed before delivery.

Delivery manifest

Full manifest of files, checksums, and layer availability delivered with each dataset.
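
On the receiving side, a manifest with checksums allows a delivery to be verified before use. The sketch below assumes the manifest can be read as a mapping from relative file path to a SHA-256 hex digest; the hash algorithm, the manifest layout, and the file name are assumptions for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large video files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(root: Path, manifest: dict) -> list:
    """Return relative paths that are missing or whose checksum differs."""
    failures = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            failures.append(rel_path)
    return failures


# Self-contained demonstration with a temporary directory and a hypothetical file name.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    sample = root / "clip_0001_rgb.bin"
    sample.write_bytes(b"example bytes")
    manifest = {"clip_0001_rgb.bin": sha256_of(sample)}

    intact = verify_manifest(root, manifest)    # nothing changed yet
    sample.write_bytes(b"modified bytes")
    modified = verify_manifest(root, manifest)  # checksum no longer matches
```

An empty result means every listed file is present and unmodified; any returned path points to a file worth re-downloading or querying.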

See what we can capture or request a sample to review annotation quality first-hand.
