The Data Infrastructure for Physical AI
The world's leading AI researchers are building world models and spatial intelligence. They need high-fidelity 3D training data from real environments. DreamVu's omnidirectional capture platform delivers it at scale.
It Starts with Real-World Data
Spatial Intelligence Is the Next Frontier
Fei-Fei Li — the Stanford professor who created ImageNet and catalyzed the deep learning revolution — has made spatial intelligence the focus of her latest company, World Labs. Her thesis: AI must learn to perceive, reason about, and act in three-dimensional space. Not from text. Not from flat images. From spatially rich, real-world data.
"Spatial intelligence is the next major capability AI needs to develop. It's how humans and animals make sense of the world — and it's what's missing from today's AI systems."
Yann LeCun — Meta's Chief AI Scientist and Turing Award winner — has been equally direct. He argues that the path to truly intelligent machines runs through world models: internal representations of how the physical world works, learned from observation, not text.
"A system trained on text will never understand the physical world. You need world models — learned from video and sensory data — that can predict what happens next."
Both visions share a common prerequisite: massive amounts of high-fidelity, spatially aware, real-world 3D data. And that's exactly what doesn't exist today — at least, not at the scale or quality these models demand.
World Models Need Real Worlds
VLA (Vision-Language-Action) models can't learn physics, spatial relationships, or manipulation skills from 2D images and text. They need dense 3D captures of real environments with real people performing real tasks.
The Data Bottleneck Is Critical
Billions have been poured into model architectures — GR00T, RT-2, Octo, π₀ — but the training data barely exists. Open-source robotics datasets are small, narrow-FOV, and lack the 3D spatial richness these models require.
DreamVu Fills the Gap
Our dual-stream capture system — Alia 360° exocentric + GoPro egocentric — produces exactly the data that world models and spatial AI systems need. Synchronized omnidirectional capture with depth + RGB at scale.
The Perspective Problem
Humanoids need egocentric views (what they see) and exocentric views (how they appear to others). Traditional capture misses half the picture. DreamVu's synchronized dual-stream capture — Alia 360° exocentric + GoPro egocentric — gives you both simultaneously.
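As a sketch of what "synchronized" means in practice, the snippet below pairs frames from two streams by nearest timestamp. The function, tolerance, and frame rates are illustrative assumptions, not DreamVu's actual sync logic.

```python
from bisect import bisect_left

def pair_frames(exo_ts, ego_ts, max_skew=0.010):
    """Pair each exocentric frame with the nearest egocentric frame.

    exo_ts, ego_ts: sorted capture timestamps in seconds.
    max_skew: maximum timestamp difference (10 ms here) for a pair
    to count as synchronized. Returns (exo_index, ego_index) pairs.
    """
    if not ego_ts:
        return []
    pairs = []
    for i, t in enumerate(exo_ts):
        j = bisect_left(ego_ts, t)
        # Candidate neighbors: the frame at/after t and the one before it.
        best = min(
            (j2 for j2 in (j - 1, j) if 0 <= j2 < len(ego_ts)),
            key=lambda j2: abs(ego_ts[j2] - t),
        )
        if abs(ego_ts[best] - t) <= max_skew:
            pairs.append((i, best))
    return pairs

# Example: a 30 fps exo stream vs. an ego stream offset by 2 ms.
exo = [k / 30 for k in range(5)]
ego = [k / 30 + 0.002 for k in range(5)]
print(pair_frames(exo, ego))  # → [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
```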
The Multimodal Gap
Physical AI models need vision + language + action data together. Most datasets provide vision only — leaving teams to stitch together incomplete signals. DreamVu delivers all three, synchronized.
The Sim-to-Real Gap
Humanoids trained in simulation fail when deployed in real environments. DreamVu captures the real world in formats that translate directly into Isaac Sim and back — closing the sim-to-real loop.
Dual-Stream Capture
Synchronized Alia 360° exocentric + GoPro egocentric cameras with full RGB + depth in real environments
Multimodal Annotation
AI-assisted (SAM2, Grounding DINO) + human QA delivers vision, language, and action labels — 10× faster than traditional 3D annotation
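To illustrate the AI-plus-human split, here is a minimal triage sketch: machine labels above a confidence threshold ship directly, mid-confidence labels queue for human QA, and the rest are rejected. The thresholds and label schema are assumptions for illustration, not DreamVu's production pipeline.

```python
def route_labels(auto_labels, accept_threshold=0.9, review_threshold=0.5):
    """Triage machine-generated labels into auto-accept / human-review / reject.

    auto_labels: list of (label, confidence) tuples, e.g. from detectors
    such as Grounding DINO or SAM2 mask scoring (illustrative schema).
    """
    accepted, review, rejected = [], [], []
    for label, conf in auto_labels:
        if conf >= accept_threshold:
            accepted.append(label)   # ships without human touch
        elif conf >= review_threshold:
            review.append(label)     # queued for human QA
        else:
            rejected.append(label)   # re-run or discard
    return accepted, review, rejected

labels = [("shelf", 0.97), ("shopping_cart", 0.72), ("blur", 0.21)]
print(route_labels(labels))
# → (['shelf'], ['shopping_cart'], ['blur'])
```

Routing only the uncertain middle band to humans is where the speedup over fully manual 3D annotation comes from.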
3D Reconstruction
3D Gaussian Splatting creates photorealistic scenes with all annotations preserved — ready for simulation conversion
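As a toy example of how a 2D annotation can propagate into 3D, the sketch below projects reconstructed 3D points through a pinhole camera and keeps those that land inside a labelled 2D mask. The intrinsics and mask are illustrative; a real Gaussian Splatting pipeline would operate on splats rather than bare points.

```python
def propagate_label(points, mask, fx, fy, cx, cy):
    """Assign a 2D segmentation label to 3D points via pinhole projection.

    points: list of (x, y, z) in the camera frame (z forward, metres).
    mask: set of (u, v) pixel coordinates covered by the 2D mask.
    fx, fy, cx, cy: pinhole intrinsics (illustrative values below).
    Returns indices of points whose projection lands inside the mask.
    """
    hits = []
    for i, (x, y, z) in enumerate(points):
        if z <= 0:                       # behind the camera
            continue
        u = round(fx * x / z + cx)
        v = round(fy * y / z + cy)
        if (u, v) in mask:
            hits.append(i)
    return hits

# Toy scene: two points, one projecting into the labelled mask region.
pts = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
mask = {(320, 240)}                      # mask covers the principal point
print(propagate_label(pts, mask, fx=500, fy=500, cx=320, cy=240))  # → [0]
```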
Simulation Conversion
Automated USD export with physics properties for NVIDIA Isaac Sim
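Schematically, a physics-annotated export might contain a prim like the following .usda fragment. The prim name and mass value are illustrative; `physics:mass` and `physics:rigidBodyEnabled` come from the standard UsdPhysics schemas that Isaac Sim consumes.

```usda
#usda 1.0
(
    defaultPrim = "GroceryShelf"
)

def Xform "GroceryShelf" (
    prepend apiSchemas = ["PhysicsRigidBodyAPI", "PhysicsMassAPI"]
)
{
    bool physics:rigidBodyEnabled = true
    float physics:mass = 12.5
}
```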
Synthetic Generation
1,000+ frames/hour with domain randomization — all modalities and skill transfer demos preserved
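As a sketch of domain randomization, the function below draws one randomized variant of a captured scene. Parameter names and ranges are illustrative stand-ins, not DreamVu's actual randomization space.

```python
import random

def sample_scene_params(rng):
    """Draw one domain-randomized variant of a captured scene.

    The factors below (lighting, camera jitter, textures, clutter)
    are the kinds of things a synthetic-generation pass might vary.
    """
    return {
        "light_intensity": rng.uniform(200.0, 1500.0),     # lux
        "light_temperature": rng.uniform(2700.0, 6500.0),  # kelvin
        "camera_jitter_deg": rng.uniform(-2.0, 2.0),
        "floor_texture": rng.choice(["tile", "vinyl", "concrete"]),
        "shelf_fill_ratio": rng.uniform(0.3, 1.0),
    }

rng = random.Random(0)   # seeded for reproducibility
variants = [sample_scene_params(rng) for _ in range(3)]
print(len(variants), sorted(variants[0]))
```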
Real-World Validation
Continuous verification: sim-to-real transfer rates, manipulation success, and skill transfer effectiveness
Vision Data
- Synchronized ego + 360° exo video with depth
- Object segmentation with instance IDs
- 6DOF object poses
- Manipulation affordances (grip types, approach vectors)
- Human and robot demonstrations in 360° view
Language Data
- QA pairs describing objects, actions, and scene elements
- Action summaries for every sequence
- Spatial relations between objects and actors
- Context descriptions for scene understanding
- Ready for VLA instruction following
Action Data
- Full trajectories for every actor in 360° scene
- Movement paths with timestamps
- Interaction sequences showing manipulation
- Demonstration labels for skill transfer
- Kinematic data where available
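The three layers above can be pictured as one synchronized record per frame. The dataclass below is a schematic with illustrative field names, not DreamVu's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class MultimodalFrame:
    """One synchronized sample spanning the three annotation layers."""
    timestamp: float                    # seconds since episode start
    # Vision layer
    ego_rgb: str = ""                   # path to egocentric frame
    exo_rgb: str = ""                   # path to 360° exocentric frame
    depth: str = ""                     # path to depth map
    object_poses: dict = field(default_factory=dict)  # id -> 6DOF pose
    # Language layer
    action_summary: str = ""            # e.g. "staff member restocks shelf"
    qa_pairs: list = field(default_factory=list)      # (question, answer)
    # Action layer
    trajectories: dict = field(default_factory=dict)  # actor id -> waypoints

frame = MultimodalFrame(
    timestamp=12.4,
    action_summary="customer places item in cart",
    trajectories={"customer_01": [(0.0, 1.2, 0.0), (0.1, 1.2, 0.3)]},
)
print(frame.timestamp, frame.action_summary)
```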
Alia Specs
Purpose-Built for Physical AI Data
Developed from breakthrough research at IIIT Hyderabad (published at CVPR 2016) and refined over 8 years of production deployment. The synchronized dual-stream system — Alia 360° omnidirectional plus GoPro egocentric — captures the complete spatial context that humanoid robots need. This proprietary technology creates a defensible moat: we capture data that no other company can replicate.
Complete Scene Coverage
Traditional cameras: 60–90° FOV. DreamVu: 360° — captures everything simultaneously, no blind spots, no repositioning.
Skill Transfer at Scale
When multiple humans and robots demonstrate tasks throughout an environment, one Alia captures all demonstrations happening anywhere in the space — no repositioning required.
3D Gaussian Splatting
The 360° coverage provides ideal input for photorealistic 3D reconstruction. All multimodal annotations propagate automatically from 2D frames to the 3D scene.
32+ Patents
Protected omnidirectional 3D vision technology with 8+ years of production deployment. A defensible competitive advantage that ensures unique data capture capabilities.
5 Papers Proving the Approach
Our publications demonstrate that DreamVu's dual-stream 360° datasets produce measurable improvements to leading robot foundation models — with results scaling predictably from 200 to 500+ hours.
Cosmos-Reason2 Improvements
GR00T Manipulation Improvements
Cosmos Extended Results
GR00T Extended Results
Real-Sim-Real Transfer
Grocery Retail
Why Grocery?
A grocery store contains more distinct manipulation tasks per square foot than almost any other environment — making it the ideal proving ground for Physical AI.
Unmatched Skill Density
Picking, placing, stacking, scanning, bagging, mopping, organizing — 500+ distinct skills captured across customer, staff, and logistics operations.
Massive Market Pull
Autonomous restocking and checkout are among the highest-demand use cases for humanoid robots, targeting the $22B retail machine-vision market.
Transferable Complexity
If a VLA model can handle a cluttered grocery aisle with customers, carts, and staff in motion, it transfers to warehouses, fulfillment centers, and retail at large.
Open Teaser on Hugging Face
A curated 20–30 hour subset available in LeRobot format — try before you buy, benchmark against your existing training data.
🤗 Hugging Face & AGIBOT Challenge
A curated sample dataset in LeRobot v3.0 format — compatible with the AGIBOT World Challenge 2026 pipeline. CC BY 4.0 licensed. Integrate into your training run in five lines of code.
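For a sense of what those few lines could look like: the commented snippet assumes the `lerobot` package is installed (the exact module path varies by version) and uses a placeholder repo id, while the runnable stand-in below mimics LeRobot's flat `observation.*` / `action` frame-key convention without downloading anything.

```python
# Real usage would be roughly (hypothetical repo id, lerobot installed):
#   from lerobot.datasets.lerobot_dataset import LeRobotDataset
#   dataset = LeRobotDataset("dreamvu/grocery-teaser")
#   for frame in dataset:
#       model.step(frame["observation.images.exo"], frame["action"])

# Stand-in frames mimicking LeRobot's flat key layout:
def fake_dataset(n):
    for k in range(n):
        yield {
            "observation.images.exo": f"exo_frame_{k}.png",  # 360° view
            "observation.images.ego": f"ego_frame_{k}.png",  # wearer view
            "action": [0.0, 0.1 * k],                        # placeholder
            "timestamp": k / 30,
        }

total = sum(1 for frame in fake_dataset(90) if frame["timestamp"] < 2.0)
print(total)  # frames in the first two seconds at 30 fps → 60
```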
🟢 NVIDIA Cosmos & GR00T
Sample datasets available in Isaac Sim-native USD format. Fine-tune Cosmos world models or GR00T manipulation models with DreamVu's enriched 360° spatial data — benchmark against your existing training pipeline.
Catalog License
- Full 1,000-hour datasets
- All annotation layers included
- Isaac Sim + LeRobot + Open X formats
- Annual license, non-exclusive
- Exclusive option available at premium
Custom Capture
- On-site dual-stream 360° capture
- Full annotation pipeline included
- Typical: 1,000 hrs across 3–10 sites
- Minimum engagement: 500 hours
- Travel & accommodation at cost
Developer Kit
- $10,000 Starter Kit (Alia camera + training)
- $200/hr processing service
- Full annotation pipeline output
- LeRobot + Isaac Sim + Open X delivery
- Ideal for research labs & universities
Isaac Sim-native USD scenes, GR00T training pipeline integration, Isaac Lab compatibility, and Omniverse support
Open teaser dataset in LeRobot format — discoverable by the global research community
Full Open X-Embodiment compatibility — seamless integration with existing VLA training pipelines
Physical AI Infrastructure
Sashi Reddi
Managing Partner at SRI Capital. Founder & former CEO of AppLabs (acquired by CSC). PhD Wharton, MS NYU, BTech IIT Delhi.
Rajat Aggarwal
BTech & Master's in CSE with a specialization in Computational Photography from IIIT Hyderabad. His CVPR'16 paper on computational cameras became the seed for DreamVu.
Dr. Anoop Namboodiri
Professor at IIIT Hyderabad. 75+ published papers. Built systems currently deployed at massive scale.
Parikshit Sakurikar
PhD in Computational Photography from IIIT Hyderabad. Eight years focused on machine learning and high-performance computing for computer vision.
Help Us Capture the Real World
Become a DreamVu Data Partner and earn while contributing to the largest Physical AI dataset ever created. Purchase an Alia Starter Kit, capture footage in your environment, and earn $50/hour for every accepted hour of video.
Purchase Kit
Acquire your Alia Starter Kit with full training and support materials
Complete Onboarding
Multi-session training on capture methodology and quality standards
Get Approved
Pass quality verification on your first 100 hours of capture
Capture & Earn
Scale to 500+ hours annually and earn recurring revenue
Frequently Asked Questions
What is DreamVu's omnidirectional capture platform?
DreamVu's platform uses a proprietary dual-stream capture system combining the Alia 360° omnidirectional camera for full scene context with a GoPro egocentric camera for hand and object interaction detail. This provides the rich, multi-perspective 3D data that world models and humanoid robots need for training.
How much training data does DreamVu provide?
DreamVu is building the largest dual-stream 3D manipulation dataset, targeting 16,000+ hours by December 2026 covering 500+ distinct skills across diverse real-world environments including kitchens, warehouses, retail spaces, and manufacturing floors.
What AI frameworks is DreamVu data compatible with?
DreamVu data is compatible with NVIDIA Cosmos and GR00T for world model and humanoid robot training, and is formatted for the AGIBOT World Challenge using LeRobot v3.0 format with CC BY 4.0 licensing. Sample datasets are available on Hugging Face and NVIDIA platforms.
How can I become a DreamVu Data Partner?
DreamVu's Data Partner Program provides a $10,000 Starter Kit including an Alia 360° camera, GoPro, and calibration tools. Partners earn $50 per accepted hour of captured data. Apply through the website to join the growing network targeting 25 partners by end of 2027.
What has DreamVu's research demonstrated?
DreamVu's first two research papers, slated for publication in March 2026, demonstrate approximately 20% improvements in both NVIDIA Cosmos world-model performance and NVIDIA GR00T humanoid-robot training when using DreamVu's dual-stream omnidirectional dataset compared to standard training data.