Data Product · 05

Synchronized Data for Embodied Intelligence and Real-World Autonomy

Timestamp-aligned multi-sensor episodes — demonstrations, teleoperation, RGB-D, LiDAR, force and kinematics — engineered for VLA models, world models, and safety validation.

Use cases

What teams use it for.

  • Humanoid robots
  • Household robotics
  • Industrial manipulation
  • Autonomous vehicles
  • Drones
  • Smart spaces
  • World models
  • VLA models
  • Safety validation

What we build

Data we produce.

  • Robot demonstrations
  • Teleoperation trajectories
  • Egocentric video
  • Multi-camera video
  • RGB-D
  • LiDAR
  • Radar
  • Force / torque
  • Hand pose
  • Gaze
  • Body kinematics
  • Sensor fusion
  • Object state annotations
  • Task phase annotations

Delivery

Delivery formats

  • MCAP
  • ROS bag
  • JSONL episodes
  • Parquet
  • MP4
  • Point cloud formats
  • Custom sensor logs

Delivery & integration

Built to drop into your pipeline.

Every dataset ships versioned, documented, and matched to your schema — with a QA report your research team can audit against acceptance criteria.

Workflow

How the program runs.

  1. Task Design
  2. Environment Design
  3. Sensor Setup
  4. Operator Protocol
  5. Calibration
  6. Collection
  7. Annotation
  8. Validation
  9. Delivery

Quality controls

How we keep it correct.

  • Timestamp alignment
  • Drop-frame monitoring
  • Coverage tracking
  • Episode completeness
  • Embodiment consistency
  • Sensor calibration checks
  • Safety review
  • Environment diversity tracking

FAQ

Common questions.

What sensor modalities can you synchronize?

Multi-camera RGB, RGB-D, LiDAR, radar, force/torque, joint states, hand pose, gaze, and full-body kinematics — aligned to a common clock with drop-frame monitoring and calibration checks across every episode.

What formats do you deliver robotics data in?

MCAP, ROS bag, JSONL episode format, Parquet, MP4, and standard point cloud formats — or a custom schema matched to your training pipeline.

Do you run collection in custom environments?

Yes. We design tasks, environments, and operator protocols around your embodiment and deployment scenario, including household, industrial, and outdoor settings.

Product deep dive

Physical AI and Robotics Data for VLA Models and Embodied Intelligence

The Data Layer Behind Reliable Embodied Intelligence and Real-World Autonomy

Vision-language-action models are driving a shift from task-specific policies toward generalist robotic systems trained across tasks, environments, and embodiments. Their limiting resource is not a generic video corpus. It is high-quality interaction data that connects what the robot perceives, what it is asked to do, what action it takes, how the world changes, and whether the task succeeds safely.

The current data frontier mixes real teleoperation and demonstrations with simulation, synthetic variation, human or internet video, language enrichment, and failure-driven post-training. For these sources to work together, episodes need synchronized timestamps, calibrated sensors, explicit embodiment metadata, consistent action semantics, and controls that detect dropped frames, control lag, incomplete trajectories, and silent environment changes.

Our role is not to sell a fixed, generic dataset. We design a program around the target model, deployment environment, failure profile, data rights, and acceptance criteria. Every engagement begins with a concrete definition of what a usable training or evaluation unit means for the customer—and how that unit will be verified before delivery.

Built for Teams That Need More Than Volume

This product supports manipulation, humanoids, mobile robotics, drones, autonomous systems, smart environments, and physical-world foundation models. Buyers need repeatable collection operations, multi-sensor engineering, operator protocols, language and subtask annotations, cross-embodiment normalization, and evaluation that measures completion, robustness, safety, and recovery.

Common engagement triggers

  • Demonstration volume is increasing, but inconsistent calibration, action spaces, or task metadata prevents effective training.
  • The model succeeds in familiar lab setups but fails under new objects, backgrounds, lighting, layouts, or embodiments.
  • Logs contain video and controls but lack task language, subtask boundaries, object state, failure labels, or outcome verification.
  • Simulation or synthetic trajectories exist, but realism and transfer value are not measured against real data.
  • Near-misses and recovery episodes are discarded even though they contain high-value supervision.
  • The team needs standard, queryable delivery across ROS bags, MCAP, episode datasets, video, depth, point clouds, and custom logs.

What This Product Can Support

Robot Demonstrations and Teleoperation

Operators complete defined tasks while synchronized sensors, robot state, controls, and environment events are recorded. Protocols can target task diversity, difficult states, corrections, and consistent outcomes.

  • Single-arm, bimanual, mobile-manipulation, and humanoid episodes.
  • Position, velocity, effort, gripper, and action-command streams.
  • Teleoperation, kinesthetic teaching, or wearable/handheld interfaces.
  • Success, partial success, failure, intervention, and recovery episodes.
  • Operator and session quality monitoring.

Multi-Sensor Capture and Calibration

Robotics learning depends on clocks and coordinate frames. We design capture around intrinsic/extrinsic calibration, hardware timing, dropped-data detection, and replayable sensor manifests.

  • RGB, stereo, RGB-D, fisheye, event camera, and egocentric video.
  • LiDAR, radar, IMU, joint state, force/torque, tactile, gaze, and hand pose.
  • Timestamp synchronization and time-offset estimation.
  • Camera intrinsics, extrinsics, coordinate frames, and calibration provenance.
  • Health telemetry and frame/packet completeness monitoring.
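
Two of the checks listed above, frame completeness and cross-stream alignment, can be sketched in a few lines. This is a minimal illustration, not a description of the production tooling: it assumes each stream exposes sorted integer timestamps in nanoseconds on a shared clock, and the function names `find_gaps` and `align_nearest`, the 1.5x gap tolerance, and the skew limit are hypothetical choices.

```python
from bisect import bisect_left
from statistics import median


def find_gaps(timestamps_ns, tolerance=1.5):
    """Flag likely dropped samples in one stream.

    Returns (index, gap_ns) pairs where the gap to the previous sample is
    larger than `tolerance` times the median inter-sample interval.
    """
    intervals = [b - a for a, b in zip(timestamps_ns, timestamps_ns[1:])]
    if not intervals:
        return []
    nominal = median(intervals)
    return [(i + 1, gap) for i, gap in enumerate(intervals)
            if gap > tolerance * nominal]


def align_nearest(reference_ns, stream_ns, max_skew_ns):
    """Match each reference timestamp to the nearest sample of another stream.

    `stream_ns` must be sorted. The result holds one index per reference
    timestamp, or None when the nearest sample is farther than max_skew_ns.
    """
    matches = []
    for t in reference_ns:
        pos = bisect_left(stream_ns, t)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [i for i in (pos - 1, pos) if 0 <= i < len(stream_ns)]
        best = min(candidates, key=lambda i: abs(stream_ns[i] - t), default=None)
        if best is not None and abs(stream_ns[best] - t) > max_skew_ns:
            best = None
        matches.append(best)
    return matches
```

A None entry from `align_nearest` marks a reference frame with no usable partner sample, which is usually worth surfacing in the episode's quality flags rather than silently interpolating.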

Language, Subtask, and State Annotation

Language connects high-level intent to low-level action. Episode enrichment makes data useful for VLA training, retrieval, planning, evaluation, and failure analysis.

  • Task instructions and paraphrases.
  • Plans, subtasks, phase boundaries, and event-level language.
  • Object identity, pose, affordance, state, and relation labels.
  • Contact, grasp, release, collision, and safety events.
  • Natural-language corrections, interjections, and recovery descriptions.
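
Subtask and phase annotations are only useful if their boundaries are consistent with the episode timeline. The sketch below shows one way such a check might look; the segment layout (`label`, `start_ns`, `end_ns`) and the function name `check_subtask_segments` are illustrative assumptions, not a delivered schema.

```python
def check_subtask_segments(segments, episode_start_ns, episode_end_ns):
    """Return a list of human-readable problems found in subtask segments.

    Each segment is a dict with 'label', 'start_ns', and 'end_ns' keys
    (a hypothetical layout). An empty list means the segments are ordered,
    non-overlapping, and inside the episode bounds.
    """
    problems = []
    previous_end = episode_start_ns
    for seg in sorted(segments, key=lambda s: s["start_ns"]):
        if seg["end_ns"] <= seg["start_ns"]:
            problems.append(f"{seg['label']}: empty or inverted interval")
        if seg["start_ns"] < previous_end:
            problems.append(
                f"{seg['label']}: overlaps the previous segment "
                "or starts before the episode"
            )
        if seg["end_ns"] > episode_end_ns:
            problems.append(f"{seg['label']}: ends after the episode")
        previous_end = max(previous_end, seg["end_ns"])
    return problems
```

Whether gaps between consecutive subtasks are allowed is a program-level decision; this sketch permits them and only rejects overlaps and out-of-bounds segments.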

Cross-Embodiment and Dataset Normalization

Pooling datasets requires explicit treatment of morphology, action representation, control rate, cameras, task definitions, and success criteria. We preserve source signals while providing normalized training views.

  • Embodiment and kinematic metadata.
  • Action-space adapters and canonical semantic actions.
  • Common episode, task, and feature schemas.
  • Source-aware sampling and leakage groups.
  • Normalization without discarding raw controls or calibration.
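
As a rough illustration of an action-space adapter that keeps the raw control alongside a normalized view, consider the sketch below. It assumes a source robot that logs joint-position commands in degrees and a canonical view scaled to [-1, 1] per joint; `JointSpec`, `normalize_joint_action`, and the output layout are hypothetical names introduced for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class JointSpec:
    """Per-joint embodiment metadata (hypothetical structure)."""
    name: str
    lower_rad: float
    upper_rad: float


def normalize_joint_action(raw_deg, joints):
    """Build a normalized training view without discarding the raw command.

    `raw_deg` holds joint-position commands in degrees, one per joint.
    Each value is converted to radians and scaled to [-1, 1] using the
    joint limits. Out-of-limit commands raise instead of being clipped,
    so unit or limit mismatches are caught rather than hidden.
    """
    if len(raw_deg) != len(joints):
        raise ValueError("command length does not match the joint list")
    normalized = []
    for value_deg, joint in zip(raw_deg, joints):
        value_rad = math.radians(value_deg)
        if not joint.lower_rad <= value_rad <= joint.upper_rad:
            raise ValueError(f"{joint.name}: {value_deg} deg is outside limits")
        span = joint.upper_rad - joint.lower_rad
        normalized.append(2.0 * (value_rad - joint.lower_rad) / span - 1.0)
    return {
        "joint_names": [j.name for j in joints],
        "raw": {"unit": "deg", "values": list(raw_deg)},
        "normalized": normalized,
    }
```

Raising on out-of-limit values is a deliberate choice in this sketch: a command that exceeds the declared limits more often signals wrong units or stale kinematic metadata than a genuine action.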

Simulation, Synthetic Data, and World-Model Workflows

Synthetic data can expand variation and target rare events, but it must be traceable and evaluated against real-world performance.

  • Simulation task and environment variation.
  • Synthetic trajectories, sensor streams, and motion data.
  • World-model rollouts and counterfactual scenarios.
  • Rare, unsafe, or expensive condition generation.
  • Real-to-sim and sim-to-real gap analysis.

Robotics Evaluation and Safety Validation

Evaluation must account for embodiment, task setup, trial protocol, reset quality, environmental variation, and safety. Success alone can hide brittle behavior or intervention dependence.

  • Task completion and subtask-level scoring.
  • Generalization across objects, scenes, users, and embodiments.
  • Collision, force, near-miss, timeout, and unsafe-state analysis.
  • Recovery from perturbation, perception error, or action failure.
  • Video- and telemetry-backed human adjudication.
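
The point that success alone can hide intervention dependence is easy to show numerically. The following sketch reports the plain success rate next to the share of trials that succeeded with no human intervention and no recorded safety event. The trial fields (`success`, `interventions`, `safety_events`) and the function name are assumptions made for illustration.

```python
def summarize_trials(trials):
    """Compare headline success with unassisted, safe success.

    Each trial is a dict with 'success' (bool), 'interventions' (int),
    and 'safety_events' (list of labels); this layout is hypothetical.
    """
    if not trials:
        raise ValueError("no trials to summarize")
    total = len(trials)
    successes = [t for t in trials if t["success"]]
    unassisted_safe = [
        t for t in successes
        if t["interventions"] == 0 and not t["safety_events"]
    ]
    return {
        "success_rate": len(successes) / total,
        "unassisted_safe_success_rate": len(unassisted_safe) / total,
    }
```

A large gap between the two numbers suggests the policy is leaning on operators or completing tasks through unsafe states, which a single success figure would not reveal.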

Data We Build

The delivery unit is defined at the level required by the model and the evaluation harness—not merely as a row of text or a media file. Depending on the program, one record may include source inputs, structured intermediate state, human judgments, provenance, quality evidence, and model- or environment-derived verification.

Deliverable | What it contains | Typical use
Robot episode dataset | Instructions, synchronized observations/actions, robot state, calibration, task metadata, outcome, and QA. | VLA pretraining, imitation learning, policy fine-tuning.
Teleoperation demonstration set | Operator controls, robot motion, sensor streams, interventions, and success/failure evidence. | Skill learning, bimanual manipulation, dexterous control.
Language-enriched trajectory set | Task paraphrases, plans, subtasks, event language, object states, corrections, and timestamps. | Language-conditioned policy learning and hierarchical control.
Sensor-fusion corpus | Time-aligned camera, depth, point cloud, radar, IMU, force, tactile, and state streams with calibration. | Perception, world models, sensor fusion, and autonomy.
Failure and recovery library | Near-misses, failures, perturbations, corrective actions, intervention, and root-cause labels. | Robustness, recovery policies, safety evaluation, failure-driven post-training.
Robotics benchmark suite | Trial protocol, environment configurations, hidden variations, scoring, telemetry, video, and adjudication. | Model comparison, release gating, deployment readiness.

Reference Record Design

A production schema is finalized during the calibration and dry-run stage of the program, but a typical record may include the following fields:

  • episode_id — Stable identifier for one attempt, including source dataset and version.
  • embodiment — Robot platform, kinematics, end effectors, control mode, action space, rates, and software versions.
  • task_and_language — Task ID, instruction, paraphrases, object set, initial state, success criteria, plan, and subtasks.
  • environment — Site, scene, layout, lighting, surfaces, distractors, safety zone, and reset configuration.
  • timebase — Master clock, per-stream timestamps, synchronization method, offsets, and drift.
  • observations — Camera, depth, point cloud, audio, IMU, force/torque, tactile, robot state, and other sensors.
  • actions — Commands and executed state, including joint, Cartesian, gripper, locomotion, and semantic action labels.
  • events_and_annotations — Subtasks, contacts, grasps, state changes, collisions, interventions, and language events.
  • outcome_and_safety — Success, partial/failure reason, completion time, human intervention, near-miss, and safety checks.
  • quality_and_provenance — Calibration, completeness, dropped-data flags, operator/session metadata, source class, and transformations.

The schema is versioned. Changes to label definitions, evidence requirements, reviewer policy, or normalization rules are recorded so training and evaluation results can be traced to the exact specification used.
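
A minimal sketch of how the top-level fields above might be checked at ingest time follows. The ten field names come from the list above; `schema_version` is an assumed extra field added to reflect the versioning requirement, and `missing_fields` is a hypothetical helper, not part of any delivered tooling.

```python
# Top-level fields from the reference record design, plus an assumed
# 'schema_version' field so each record can be traced to its specification.
REQUIRED_FIELDS = (
    "schema_version",
    "episode_id",
    "embodiment",
    "task_and_language",
    "environment",
    "timebase",
    "observations",
    "actions",
    "events_and_annotations",
    "outcome_and_safety",
    "quality_and_provenance",
)


def missing_fields(record):
    """Return the required fields that are absent or empty in a record dict."""
    empty = (None, "", [], {})
    return [name for name in REQUIRED_FIELDS if record.get(name) in empty]
```

A real validator would also check the nested structure of each field against the versioned schema; this sketch only confirms that nothing required is missing outright.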

Program Workflow

  1. Task and deployment scoping. Define capabilities, embodiment, environments, object distributions, safety constraints, and success at task and subtask levels.
  2. Data architecture and instrumentation. Specify sensors, clocks, calibration, control logging, operator interface, file formats, storage, and real-time health monitoring.
  3. Environment and protocol design. Create reset procedures, object/layout variation, operator instructions, safety zones, failure injection, and coverage quotas.
  4. Calibration and dry runs. Validate coordinate frames, time alignment, control semantics, sensor quality, and replay before production collection.
  5. Demonstration and rollout collection. Capture successful, diverse, difficult, failed, and recovery episodes with operator/session monitoring and immediate QC.
  6. Enrichment and annotation. Add language, subtasks, object/state events, contacts, interventions, safety events, and outcome evidence using synchronized playback.
  7. Validation and coverage analysis. Check completeness, offsets, calibration, action-observation consistency, episode boundaries, outcomes, and distribution targets.
  8. Model-in-the-loop iteration. Train or evaluate early releases, mine weak states and generalization gaps, collect targeted corrections, and version the next release.

A pilot is considered complete only when the customer and delivery team have aligned on the rubric, reviewed representative disagreements, validated the export, and confirmed that the data is useful in the intended training or evaluation loop.
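
The coverage quotas set in step 3 and checked in step 7 can be compared against collected episodes with a simple count. The sketch below assumes each episode is a dict carrying the attribute being tracked; the function name `coverage_shortfalls` and the quota layout are illustrative.

```python
from collections import Counter


def coverage_shortfalls(episodes, quotas, key):
    """Report how many more episodes each quota value still needs.

    `quotas` maps a value of `key` (for example an object category) to the
    minimum number of accepted episodes. Values that already meet their
    quota are omitted from the result.
    """
    counts = Counter(ep[key] for ep in episodes)
    return {
        value: target - counts[value]
        for value, target in quotas.items()
        if counts[value] < target
    }
```

For example, with three "mug" episodes, one "bowl" episode, and quotas of 2, 3, and 1 for "mug", "bowl", and "plate", the result lists a shortfall of 2 for "bowl" and 1 for "plate".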

Quality Controls

Quality is designed into the workflow rather than added as a final inspection step. The control plan depends on task ambiguity, domain risk, annotator expertise, and whether an item has an executable or external verifier.

  • Clock and synchronization audit: Record master timebase and per-stream offsets; detect drift and verify alignment through known events.
  • Calibration version control: Store intrinsics, extrinsics, frames, tooling, procedure, date, and validity checks for every session.
  • Episode completeness: Confirm required streams, action ranges, start/end conditions, reset state, and absence of silent corruption.
  • Embodiment consistency: Validate joint names, units, limits, control modes, end-effector state, frames, and software versions.
  • Outcome evidence: Pair success/failure labels with video, state, object placement, force, or a task-specific verifier.
  • Operator and session monitoring: Track learning effects, fatigue, style, intervention frequency, equipment changes, and anomalous sessions.
  • Failure preservation: Quarantine corrupt captures but retain valid failures and recoveries with root-cause labels.
  • Real/synthetic separation: Track source class, simulator, generation parameters, randomization, and validation so synthetic data remains auditable.
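
For the clock and synchronization audit, one common approach is to fit a line through events that both clocks observed, such as a sync flash or a contact spike. The sketch below is an ordinary least-squares fit under that assumption; the function name and the choice of a linear drift model are illustrative, and real clocks may need a piecewise or higher-order model.

```python
def fit_offset_and_drift(master_s, stream_s):
    """Fit stream_s ~ offset + (1 + drift) * master_s by least squares.

    Both inputs are timestamps, in seconds, of the same known events as
    seen by the master clock and by one stream's clock.
    Returns (offset_seconds, drift_ppm).
    """
    n = len(master_s)
    if n < 2 or n != len(stream_s):
        raise ValueError("need at least two paired events")
    mean_m = sum(master_s) / n
    mean_s = sum(stream_s) / n
    var_m = sum((m - mean_m) ** 2 for m in master_s)
    if var_m == 0:
        raise ValueError("master timestamps must not all be identical")
    cov = sum((m - mean_m) * (s - mean_s) for m, s in zip(master_s, stream_s))
    slope = cov / var_m
    offset = mean_s - slope * mean_m
    return offset, (slope - 1.0) * 1e6
```

The fitted offset and drift can be stored in the episode's timebase metadata, and residuals from the fit give a direct measure of how well the linear model holds over the session.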

Recommended acceptance metrics

  • Episode acceptance rate: Share meeting required stream, calibration, protocol, and outcome-evidence criteria.
  • Synchronization quality: Maximum and distribution of cross-stream offsets and drift.
  • Task and variation coverage: Distribution across tasks, objects, layouts, operators, embodiments, environments, and difficulty.
  • Outcome reliability: Agreement between declared outcome and verification or adjudication.
  • Safety event rate: Collisions, force violations, near-misses, zone entry, and intervention by slice.
  • Model utility: Success, robustness, recovery, and generalization change on held-out real-world trials.

No single aggregate score is sufficient. Agreement can diagnose ambiguity, but high agreement does not by itself prove correctness; disagreement can reveal plural preferences, unclear policy, underspecified context, or difficult edge cases. The QA report therefore pairs quantitative measures with sampled error analysis and adjudication notes.
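
Two of the metrics above, episode acceptance rate and outcome reliability, reduce to simple ratios once per-episode check results are recorded. The sketch below assumes hypothetical boolean check fields and outcome labels on each episode; actual criteria are set per program.

```python
# Hypothetical per-episode check names mirroring the acceptance criteria.
ACCEPTANCE_CHECKS = (
    "streams_complete",
    "calibration_valid",
    "protocol_followed",
    "outcome_evidence",
)


def episode_acceptance_rate(episodes):
    """Share of episodes that pass every acceptance check."""
    if not episodes:
        return 0.0
    accepted = sum(
        all(ep.get(check, False) for check in ACCEPTANCE_CHECKS)
        for ep in episodes
    )
    return accepted / len(episodes)


def outcome_agreement(episodes):
    """Share of adjudicated episodes whose declared outcome was confirmed.

    Episodes without an adjudicated outcome are excluded. Returns None
    when nothing has been adjudicated yet.
    """
    judged = [ep for ep in episodes if ep.get("adjudicated_outcome") is not None]
    if not judged:
        return None
    confirmed = sum(
        ep["declared_outcome"] == ep["adjudicated_outcome"] for ep in judged
    )
    return confirmed / len(judged)
```

Both figures are best reported per slice (task, site, operator, embodiment) as well as overall, since a healthy aggregate can mask a failing slice.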

Delivery and Integration

Supported delivery patterns

  • Versioned batch delivery for controlled model-training releases.
  • Incremental delivery for active learning, post-training, or continuous evaluation.
  • Secure customer-workspace delivery when source data cannot leave the customer environment.
  • API- or object-storage-based transfer for high-volume or multimodal programs.
  • Evaluation-ready task packs with rubrics, reference evidence, and scoring logic.

Common formats

MCAP, ROS 2 bag, ROS 1 bag, LeRobotDataset v3, RLDS-style episodes, JSONL, Parquet, MP4, PNG, PLY/PCD, HDF5, Zarr, custom sensor logs

Raw lossless sensor logs can be delivered alongside normalized episode tables and model-ready training views. MCAP suits heterogeneous timestamped streams, while LeRobotDataset-style layouts support scalable episode/video storage and language annotations. Source-specific controls and calibration remain available even when a canonical schema is added.

Each release can include a dataset card or delivery memo, schema and ontology version, quality summary, known limitations, rights and consent metadata where applicable, and a machine-readable manifest with checksums and file-level lineage.
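
A machine-readable manifest with checksums might be produced along the following lines. This is a sketch using only the Python standard library; the manifest layout (`dataset_version`, `files`, `path`, `bytes`, `sha256`) and the function name `build_manifest` are assumptions, and file-level lineage would need additional fields not shown here.

```python
import hashlib
from pathlib import Path


def build_manifest(root, dataset_version):
    """Walk a release directory and record size and SHA-256 for each file."""
    root = Path(root)
    files = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest = hashlib.sha256()
        with path.open("rb") as handle:
            # Read in 1 MiB chunks so large sensor logs are not loaded whole.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        files.append({
            "path": path.relative_to(root).as_posix(),
            "bytes": path.stat().st_size,
            "sha256": digest.hexdigest(),
        })
    return {"dataset_version": dataset_version, "files": files}
```

The returned dict can be serialized with `json.dumps` and shipped next to the data, letting the receiving team verify each file before ingesting it.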

Security, Rights, and Governance

Physical-world data may capture employees, homes, facilities, production lines, faces, voices, locations, proprietary objects, and safety incidents. Collection requires site authorization, notice and consent where applicable, bystander procedures, access boundaries, export/location controls, and a plan for redaction or restricted review. Safety protocols govern operators and equipment independently of data targets.

Program controls may include role-based access, workspace isolation, least-privilege review queues, de-identification, retention limits, geographic routing, approved-tool restrictions, audit logs, and customer-defined deletion procedures. These controls are scoped contractually; this page does not imply any certification or regulatory status that has not been independently verified.

Engagement Models

Engagement | Best for | Typical output
Instrumentation and pilot | Teams establishing a new collection stack. | Architecture, calibration, protocol, pilot episodes, format validation, and QA report.
Managed demonstration program | Teams scaling real-robot data. | Operators, environments, synchronized episodes, enrichment, and recurring releases.
Dataset harmonization | Teams combining internal and public robot data. | Source audit, canonical schema, adapters, quality filters, and sampling metadata.
Evaluation and failure flywheel | Teams improving policies. | Trial suite, failure library, corrective collection, and regression reporting.

Illustrative Program Shapes

The examples below are representative program patterns, not claims about named customers or guaranteed outcomes.

  1. Bimanual warehouse manipulation. Collect synchronized wrist and scene cameras, robot state, gripper and force streams, task language, object states, packing outcomes, and recovery from slips or occlusion.
  2. Humanoid household skills. Design diverse room layouts, objects, users, and instructions; record locomotion, manipulation, safety boundaries, interventions, and language-enriched subtasks.
  3. Autonomous sensor-fusion corpus. Capture calibrated camera, depth, LiDAR, radar, IMU, and robot state under environmental variations, with timestamp auditing and scenario annotations.
  4. Synthetic-to-real skill expansion. Generate controlled simulation variations from real demonstrations, track every synthetic parameter, and validate incremental value on held-out real trials.

Why a Custom Program

Off-the-shelf datasets are useful for baseline experimentation, but production systems usually fail at the boundaries: domain-specific policy, uncommon languages, tool or sensor state, difficult negative examples, ambiguous evidence, long-tail user behavior, and deployment-specific risk. A custom program makes those boundaries explicit and converts them into measurable data requirements.

The result is not simply “more labels.” It is a controlled data asset with a defined purpose, documented provenance, repeatable quality process, and a path from observed model failure to the next training or evaluation cycle.