See Everything. Capture Everything. Train Everything.

The Data Infrastructure for Physical AI

The world's leading AI researchers are building world models and spatial intelligence. They need high-fidelity 3D training data from real environments. DreamVu's omnidirectional capture platform delivers it at scale.

16,000+
Hours by Dec 2026
500+
Distinct Skills
5
Research Papers
360° Omnidirectional Capture
The Race to Build World Models
Starts with Real-World Data
The most influential minds in AI have converged on one conclusion: the next frontier isn't larger language models — it's machines that understand and interact with the physical world.

Spatial Intelligence Is the Next Frontier

Fei-Fei Li — the Stanford professor who created ImageNet and catalyzed the deep learning revolution — has made spatial intelligence the focus of her latest company, World Labs. Her thesis: AI must learn to perceive, reason about, and act in three-dimensional space. Not from text. Not from flat images. From spatially rich, real-world data.

"Spatial intelligence is the next major capability AI needs to develop. It's how humans and animals make sense of the world — and it's what's missing from today's AI systems."

— Fei-Fei Li, Stanford HAI & World Labs

Yann LeCun — Meta's Chief AI Scientist and Turing Award winner — has been equally direct. He argues that the path to truly intelligent machines runs through world models: internal representations of how the physical world works, learned from observation, not text.

"A system trained on text will never understand the physical world. You need world models — learned from video and sensory data — that can predict what happens next."

— Yann LeCun, Meta AI & NYU

Both visions share a common prerequisite: massive amounts of high-fidelity, spatially aware, real-world 3D data. And that's exactly what doesn't exist today — at least, not at the scale or quality these models demand.

🌍

World Models Need Real Worlds

VLA (Vision-Language-Action) models can't learn physics, spatial relationships, or manipulation skills from 2D images and text. They need dense 3D captures of real environments with real people performing real tasks.

⚠️

The Data Bottleneck Is Critical

Billions have been poured into model architectures — GR00T, RT-2, Octo, π₀ — but the training data barely exists. Open-source robotics datasets are small, narrow-FOV, and lack the 3D spatial richness these models require.

🎯

DreamVu Fills the Gap

Our dual-stream capture system — Alia 360° exocentric + GoPro egocentric — produces exactly the data that world models and spatial AI systems need. Synchronized omnidirectional capture with depth + RGB at scale.

Why Training Physical AI Is So Hard
Humanoid robots don't just navigate — they manipulate objects, coordinate limbs, understand context, and learn from watching others. Current data falls short.
👁️

The Perspective Problem

Humanoids need egocentric views (what they see) and exocentric views (how they appear to others). Traditional capture misses half the picture. DreamVu's synchronized dual-stream capture — Alia 360° exocentric + GoPro egocentric — gives you both simultaneously.

🧩

The Multimodal Gap

Physical AI models need vision + language + action data together. Most datasets provide vision only — leaving teams to stitch together incomplete signals. DreamVu delivers all three, synchronized.

🔄

The Sim-to-Real Gap

Humanoids trained in simulation fail when deployed in real environments. DreamVu captures the real world in formats that translate directly into Isaac Sim and back — closing the sim-to-real loop.

From Real World to Training Pipeline
Our end-to-end platform transforms real-world captures into VLA-ready training datasets — delivered in Isaac Sim, LeRobot, and Open X-Embodiment formats.
1

Dual-Stream Capture

Synchronized Alia 360° exocentric + GoPro egocentric cameras with full RGB + depth in real environments

2

Multimodal Annotation

AI-assisted (SAM2, Grounding DINO) + human QA delivers vision, language, and action labels — 10× faster than traditional 3D annotation

3

3D Reconstruction

3D Gaussian Splatting creates photorealistic scenes with all annotations preserved — ready for simulation conversion

4

Simulation Conversion

Automated USD export with physics properties for NVIDIA Isaac Sim

5

Synthetic Generation

1,000+ frames/hour with domain randomization — all modalities and skill transfer demos preserved

6

Real-World Validation

Continuous verification: sim-to-real transfer rates, manipulation success, and skill transfer effectiveness
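Step 1 of the pipeline hinges on frame-level synchronization between the Alia and GoPro streams. As a minimal sketch of how nearest-timestamp pairing can work (the function, field names, and skew threshold are illustrative assumptions, not DreamVu's actual pipeline):

```python
from bisect import bisect_left

def pair_frames(exo_ts, ego_ts, max_skew=0.0167):
    """Pair each exocentric frame with the nearest egocentric frame.

    exo_ts, ego_ts: sorted lists of frame timestamps in seconds.
    max_skew: maximum allowed offset (default ~half a 30fps frame period).
    Returns a list of (exo_index, ego_index) pairs; frames with no
    close-enough partner are dropped.
    """
    if not ego_ts:
        return []
    pairs = []
    for i, t in enumerate(exo_ts):
        j = bisect_left(ego_ts, t)
        # Candidates: the egocentric frames on either side of the insertion point.
        best = min(
            (k for k in (j - 1, j) if 0 <= k < len(ego_ts)),
            key=lambda k: abs(ego_ts[k] - t),
        )
        if abs(ego_ts[best] - t) <= max_skew:
            pairs.append((i, best))
    return pairs

# Example: two 30fps streams with a small constant clock offset.
exo = [k / 30 for k in range(5)]
ego = [k / 30 + 0.005 for k in range(5)]
print(pair_frames(exo, ego))  # each exo frame pairs with the same-index ego frame
```

In practice a hardware or PTP clock sync would precede this step; the sketch only shows the pairing logic once both streams share a common timebase.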

Vision Data

  • Synchronized ego + 360° exo video with depth
  • Object segmentation with instance IDs
  • 6DOF object poses
  • Manipulation affordances (grip types, approach vectors)
  • Human and robot demonstrations in 360° view

Language Data

  • QA pairs describing objects, actions, and scene elements
  • Action summaries for every sequence
  • Spatial relations between objects and actors
  • Context descriptions for scene understanding
  • Ready for VLA instruction following

Action Data

  • Full trajectories for every actor in 360° scene
  • Movement paths with timestamps
  • Interaction sequences showing manipulation
  • Demonstration labels for skill transfer
  • Kinematic data where available
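The three modality lists above describe what ends up in a single synchronized training sample. A minimal sketch of such a record (all field names here are illustrative assumptions, not the delivered schema):

```python
from dataclasses import dataclass, field

@dataclass
class MultimodalSample:
    """One synchronized frame pairing vision, language, and action signals.

    Field names are illustrative only, not DreamVu's delivered schema.
    """
    timestamp: float                      # seconds since episode start
    ego_rgb: bytes                        # egocentric RGB frame (encoded)
    exo_rgb: bytes                        # 360° exocentric RGB frame
    exo_depth: bytes                      # 360° depth map
    instance_masks: dict = field(default_factory=dict)      # instance id -> mask
    object_poses: dict = field(default_factory=dict)        # instance id -> 6DOF pose
    instruction: str = ""                 # language instruction / action summary
    actor_trajectories: dict = field(default_factory=dict)  # actor id -> waypoints

sample = MultimodalSample(
    timestamp=0.0,
    ego_rgb=b"", exo_rgb=b"", exo_depth=b"",
    object_poses={"can_01": (0.4, 0.1, 0.9, 0.0, 0.0, 0.0)},  # x, y, z, roll, pitch, yaw
    instruction="pick up the can from the middle shelf",
)
print(sample.instruction)
```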
🟢 NVIDIA Isaac Sim Native USD
🤗 Hugging Face LeRobot
📦 Open X-Embodiment
Dual-Stream Capture: Our Defensible Moat
Alia 360° omnidirectional camera + GoPro egocentric camera — synchronized dual-stream capture with 32+ patents, 8 years of production deployment.

Alia Specs

Full 360° coverage with long-range, high-resolution 3D depth in a single compact unit. No stitching, no blind spots, no multi-sensor calibration.
360°
Horizontal FOV
6912×3072
Stereo Resolution
120m
Detection Range
30fps
Frame Rate
Image & Depth
Vertical FOV: 170°
Depth Range: 0cm – 20m (no blind spot)
Depth Accuracy: 7mm at 5 meters
Output: Real-time RGB + Depth
Edge AI & Durability
Processing: Embedded edge AI
Human Detection: 120m range
Facial Recognition: 40m range
Ingress Rating: IP67
Compatibility: Ubuntu, ROS, OpenCV
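As a back-of-envelope check on the spec sheet (my arithmetic, not a published spec), assuming the 6912-pixel stereo width spans the full 360°, the lateral arc covered by one pixel at 5 m is on the same order as the quoted 7 mm depth accuracy:

```python
import math

H_PIXELS = 6912    # horizontal stereo resolution (from the spec sheet)
FOV_DEG = 360.0    # horizontal field of view
RANGE_M = 5.0      # distance at which 7 mm depth accuracy is quoted

deg_per_px = FOV_DEG / H_PIXELS                   # ~0.052° per pixel
arc_mm = RANGE_M * math.radians(deg_per_px) * 1000  # lateral arc per pixel at 5 m
print(f"{deg_per_px:.3f} deg/px, ~{arc_mm:.1f} mm of arc at {RANGE_M:.0f} m")
```

Lateral pixel pitch is not the same thing as stereo depth accuracy, so this is only an order-of-magnitude sanity check, not a derivation of the 7 mm figure.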

Purpose-Built for Physical AI Data

Developed from breakthrough research at IIIT Hyderabad (published at CVPR 2016) and refined over 8 years of production deployment. The synchronized dual-stream system — Alia 360° omnidirectional plus GoPro egocentric — captures the complete spatial context that humanoid robots need. This proprietary technology creates a defensible moat: we capture data that no other company can replicate.

🔵

Complete Scene Coverage

Traditional cameras: 60-90° FOV. DreamVu: 360° — captures everything simultaneously, no blind spots, no repositioning.

👥

Skill Transfer at Scale

When multiple humans and robots demonstrate tasks throughout an environment, one Alia captures all demonstrations happening anywhere in the space — no repositioning required.

🔷

3D Gaussian Splatting

The 360° coverage provides ideal input for photorealistic 3D reconstruction. All multimodal annotations propagate automatically from 2D frames to the 3D scene.

🛡️

32+ Patents

Protected omnidirectional 3D vision technology with 8+ years of production deployment. A defensible competitive advantage that ensures unique data capture capabilities.

5 Papers Proving the Approach

Our forthcoming publications (March–May 2026) demonstrate that DreamVu's dual-stream 360° datasets produce measurable improvements in leading robot foundation models — with results scaling predictably from 200 to 500+ hours.

NVIDIA Cosmos

Cosmos-Reason2 Improvements

200hr training data · March 2026
Domain-specific fine-tuning of Cosmos-Reason2 from DreamVu dual-stream retail data demonstrates measurable VLM performance gains across spatial, temporal, and action knowledge dimensions.
Paper available soon →
NVIDIA GR00T

GR00T Manipulation Improvements

200hr training data · March 2026
DreamVu training data improves GR00T manipulation performance in structured deployment environments through complementary egocentric + exocentric training signal.
Paper available soon →
NVIDIA Cosmos

Cosmos Extended Results

500hr training data · April 2026
Extension showing that Cosmos-Reason2 improvements scale predictably from 200 to 500 hours, establishing a data scaling curve with practical curriculum design guidelines.
Coming April 2026 →
NVIDIA GR00T

GR00T Extended Results

500hr training data · April 2026
GR00T manipulation improvements at 500-hour scale, with ablation on the independent contribution of egocentric vs. exocentric vs. combined data.
Coming April 2026 →
Isaac Sim

Real-Sim-Real Transfer

500hr training data · May 2026
DreamVu's retail dataset processed through Isaac Sim and back to physical robot deployment, demonstrating zero-shot sim-to-real transfer.
Coming May 2026 →
16,000+ Hours by December 2026
Starting with 1,000 hours of grocery retail — scaling to 16 complete datasets across India and the US through 5 internal capture rigs.

Grocery Retail

1,000
Hours of Enriched 3D Video
Capture Locations: 5 Operational Stores
Distinct Skills: 500+
Capture Technology: Alia 360° Depth + RGB
Annotation Layers: Occupancy, Semantic, Skills, Interactions
Formats: Isaac Sim · LeRobot · Open X

Why Grocery?

A grocery store contains more distinct manipulation tasks per square foot than almost any other environment — making it the ideal proving ground for Physical AI.

Unmatched Skill Density

Picking, placing, stacking, scanning, bagging, mopping, organizing — 500+ distinct skills captured across customer, staff, and logistics operations.

Massive Market Pull

Autonomous restocking and checkout are among the highest-demand use cases for humanoid robots, targeting the $22B retail machine-vision market.

Transferable Complexity

If a VLA model can handle a cluttered grocery aisle with customers, carts, and staff in motion, it transfers to warehouses, fulfillment centers, and retail at large.

Open Teaser on Hugging Face

A curated 20–30 hour subset available in LeRobot format — try before you buy, benchmark against your existing training data.

🤗 Hugging Face & AGIBOT Challenge

A curated sample dataset in LeRobot v3.0 format — compatible with the AGIBOT World Challenge 2026 pipeline. CC BY 4.0 licensed. Integrate into your training run in five lines of code.

LeRobot v3.0 format · CC BY 4.0 license · AGIBOT task taxonomy aligned · Integration notebook included

🟢 NVIDIA Cosmos & GR00T

Sample datasets available in Isaac Sim-native USD format. Fine-tune Cosmos world models or GR00T manipulation models with DreamVu's enriched 360° spatial data — benchmark against your existing training pipeline.

Isaac Sim / USD native · Cosmos-Reason2 compatible · GR00T training-ready · Full annotation stack included
Simple Per-Hour Pricing
Every hour includes the full annotation stack: 360° 3D capture, occupancy maps, semantic labels, skill segmentation, and Isaac Sim-native delivery.

Custom Capture

$1,000/hr
Your environment, our methodology. DreamVu deploys rigs and operators to capture bespoke datasets.
  • On-site dual-stream 360° capture
  • Full annotation pipeline included
  • Typical: 1,000 hrs across 3–10 sites
  • Minimum engagement: 500 hours
  • Travel & accommodation at cost
Request a Proposal

Developer Kit

$10K kit + $200/hr
Capture your own footage with an Alia Starter Kit. Send it to DreamVu for processing at $200/hr (~91% margin).
  • $10,000 Starter Kit (Alia camera + training)
  • $200/hr processing service
  • Full annotation pipeline output
  • LeRobot + Isaac Sim + Open X delivery
  • Ideal for research labs & universities
Learn More
Built for the Platforms You Use
DreamVu data integrates natively with the leading robotics AI platforms and training pipelines.

Isaac Sim-native USD scenes, GR00T training pipeline integration, Isaac Lab compatibility, and Omniverse support

Open teaser dataset in LeRobot format — discoverable by the global research community

Full Open X-Embodiment compatibility — seamless integration with existing VLA training pipelines

From Breakthrough Research to
Physical AI Infrastructure
DreamVu began with breakthrough research in computational imaging at IIIT Hyderabad — a new optical design for capturing 360° stereoscopic video in a single shot, published at CVPR 2016. Eight years of production deployment later, we're now building the data infrastructure the humanoid robotics industry needs.
SR

Sashi Reddi

Co-Founder & Chairman

Managing Partner at SRI Capital. Founder & former CEO of AppLabs (acquired by CSC). PhD Wharton, MS NYU, BTech IIT Delhi.

RA

Rajat Aggarwal

Co-Founder & CEO

BTech & Masters in CSE with specialization in Computational Photography from IIIT Hyderabad. His CVPR'16 paper on computational cameras became the seed for DreamVu.

AN

Dr. Anoop Namboodiri

Co-Founder & Chief Science Officer

Professor at IIIT Hyderabad. 75+ published papers. Built systems currently deployed at massive scale.

PS

Parikshit Sakurikar

Co-Founder & VP Imaging & AI

PhD in Computational Photography from IIIT Hyderabad. Eight years focused on ML, high-performance computing for CV.

SRI Capital
Ben Franklin Technology Partners
Broad Street Angels
Philadelphia, PA
US Headquarters
Hyderabad, India
R&D Center

Help Us Capture the Real World

Become a DreamVu Data Partner and earn while contributing to the largest Physical AI dataset ever created. Purchase an Alia Starter Kit, capture footage in your environment, and earn $50/hour for every accepted hour of video.

$50
Per Accepted Hour
$10K
Starter Kit Cost
2–3 Months
Kit Payback Time
25+
Partners by 2027
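The kit economics above reduce to simple arithmetic. A sketch, in which the monthly capture rate is my assumption (the 2–3 month payback quoted above implies roughly 70–100 accepted hours per month):

```python
KIT_COST = 10_000       # Starter Kit price in USD (from the program terms)
RATE_PER_HOUR = 50      # payout per accepted hour

hours_to_payback = KIT_COST / RATE_PER_HOUR
print(hours_to_payback)  # 200.0 accepted hours to recoup the kit cost

# Assumed capture rate -- not stated on the page, shown for illustration only.
ASSUMED_HOURS_PER_MONTH = 80
months = hours_to_payback / ASSUMED_HOURS_PER_MONTH
print(round(months, 1))  # 2.5 months at 80 accepted hours/month
```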
1

Purchase Kit

Acquire your Alia Starter Kit with full training and support materials

2

Complete Onboarding

Multi-session training on capture methodology and quality standards

3

Get Approved

Pass quality verification on your first 100 hours of capture

4

Capture & Earn

Scale to 500+ hours annually and earn recurring revenue

Frequently Asked Questions

What is DreamVu's omnidirectional capture platform?

DreamVu's platform uses a proprietary dual-stream capture system combining the Alia 360° omnidirectional camera for full scene context with a GoPro egocentric camera for hand and object interaction detail. This provides the rich, multi-perspective 3D data that world models and humanoid robots need for training.

How much training data does DreamVu provide?

DreamVu is building the largest dual-stream 3D manipulation dataset, targeting 16,000+ hours by December 2026 covering 500+ distinct skills across diverse real-world environments including kitchens, warehouses, retail spaces, and manufacturing floors.

What AI frameworks is DreamVu data compatible with?

DreamVu data is compatible with NVIDIA Cosmos and GR00T for world model and humanoid robot training, and is formatted for the AGIBOT World Challenge using LeRobot v3.0 format with CC BY 4.0 licensing. Sample datasets are available on Hugging Face and NVIDIA platforms.

How can I become a DreamVu Data Partner?

DreamVu's Data Partner Program provides a $10,000 Starter Kit including an Alia 360° camera, GoPro, and calibration tools. Partners earn $50 per accepted hour of captured data. Apply through the website to join the growing network targeting 25 partners by end of 2027.

What has DreamVu's research demonstrated?

DreamVu's first two research papers, to be published in March 2026, demonstrate approximately 20% improvements in both NVIDIA Cosmos world model performance and NVIDIA GR00T humanoid robot training when using DreamVu's dual-stream omnidirectional dataset compared to standard training data.

Ready to Train Physical AI
That Actually Works?

See how DreamVu's omnidirectional 3D data can accelerate your world model and spatial AI programs.