The Data Infrastructure for Physical AI
The world's leading AI researchers are building world models and spatial intelligence. They need high-fidelity 3D training data from real environments. DreamVu's omnidirectional capture platform delivers it at scale.
It Starts with Real-World Data
Spatial Intelligence Is the Next Frontier
Fei-Fei Li — the Stanford professor who created ImageNet and catalyzed the deep learning revolution — has made spatial intelligence the focus of her latest company, World Labs. Her thesis: AI must learn to perceive, reason about, and act in three-dimensional space. Not from text. Not from flat images. From spatially rich, real-world data.
"Spatial intelligence is the next major capability AI needs to develop. It's how humans and animals make sense of the world — and it's what's missing from today's AI systems."
Yann LeCun — Meta's Chief AI Scientist and Turing Award winner — has been equally direct. He argues that the path to truly intelligent machines runs through world models: internal representations of how the physical world works, learned from observation, not text.
"A system trained on text will never understand the physical world. You need world models — learned from video and sensory data — that can predict what happens next."
Both visions share a common prerequisite: massive amounts of high-fidelity, spatially aware, real-world 3D data. And that's exactly what doesn't exist today — at least, not at the scale or quality these models demand.
World Models Need Real Worlds
VLA (Vision-Language-Action) models can't learn physics, spatial relationships, or manipulation skills from 2D images and text. They need dense 3D captures of real environments with real people performing real tasks.
The Data Bottleneck Is Critical
Billions have been poured into model architectures — GR00T, RT-2, Octo, π₀ — but the training data barely exists. Open-source robotics datasets are small, narrow-FOV, and lack the 3D spatial richness these models require.
DreamVu Fills the Gap
Our dual-stream capture system — Alia 360° exocentric + GoPro egocentric — produces exactly the data that world models and spatial AI systems need. Synchronized omnidirectional capture with depth + RGB at scale.
The Perspective Problem
Humanoids need egocentric views (what they see) and exocentric views (how they appear to others). Traditional capture misses half the picture. DreamVu's synchronized dual-stream capture — Alia 360° exocentric + GoPro egocentric — gives you both simultaneously.
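As a sketch of what "synchronized" means in practice, the snippet below pairs frames from two streams by nearest timestamp. The function, tolerance, and frame rates are illustrative assumptions, not DreamVu's actual sync logic.

```python
from bisect import bisect_left

def pair_frames(exo_ts, ego_ts, max_skew=0.010):
    """Pair each exocentric frame with the nearest egocentric frame.

    exo_ts, ego_ts: sorted capture timestamps in seconds.
    max_skew: maximum timestamp difference (10 ms here) for a pair
    to count as synchronized. Returns (exo_index, ego_index) pairs.
    """
    if not ego_ts:
        return []
    pairs = []
    for i, t in enumerate(exo_ts):
        j = bisect_left(ego_ts, t)
        # Candidate neighbors: the frame at/after t and the one before it.
        best = min(
            (j2 for j2 in (j - 1, j) if 0 <= j2 < len(ego_ts)),
            key=lambda j2: abs(ego_ts[j2] - t),
        )
        if abs(ego_ts[best] - t) <= max_skew:
            pairs.append((i, best))
    return pairs

# Example: a 30 fps exo stream vs. an ego stream offset by 2 ms.
exo = [k / 30 for k in range(5)]
ego = [k / 30 + 0.002 for k in range(5)]
print(pair_frames(exo, ego))  # → [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
```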
The Multimodal Gap
Physical AI models need vision + language + action data together. Most datasets provide vision only — leaving teams to stitch together incomplete signals. DreamVu delivers all three, synchronized.
The Sim-to-Real Gap
Humanoids trained in simulation fail when deployed in real environments. DreamVu captures the real world in formats that translate directly into Isaac Sim and back — closing the sim-to-real loop.
Dual-Stream Capture
Synchronized Alia 360° exocentric + GoPro egocentric cameras with full RGB + depth in real environments
Multimodal Annotation
AI-assisted (SAM2, Grounding DINO) + human QA delivers vision, language, and action labels — 10× faster than traditional 3D annotation
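To illustrate the AI-plus-human split, here is a minimal triage sketch: machine labels above a confidence threshold ship directly, mid-confidence labels queue for human QA, and the rest are rejected. The thresholds and label schema are assumptions for illustration, not DreamVu's production pipeline.

```python
def route_labels(auto_labels, accept_threshold=0.9, review_threshold=0.5):
    """Triage machine-generated labels into auto-accept / human-review / reject.

    auto_labels: list of (label, confidence) tuples, e.g. from detectors
    such as Grounding DINO or SAM2 mask scoring (illustrative schema).
    """
    accepted, review, rejected = [], [], []
    for label, conf in auto_labels:
        if conf >= accept_threshold:
            accepted.append(label)   # ships without human touch
        elif conf >= review_threshold:
            review.append(label)     # queued for human QA
        else:
            rejected.append(label)   # re-run or discard
    return accepted, review, rejected

labels = [("shelf", 0.97), ("shopping_cart", 0.72), ("blur", 0.21)]
print(route_labels(labels))
# → (['shelf'], ['shopping_cart'], ['blur'])
```

Routing only the uncertain middle band to humans is where the speedup over fully manual 3D annotation comes from.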
3D Reconstruction
3D Gaussian Splatting creates photorealistic scenes with all annotations preserved — ready for simulation conversion
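As a toy example of how a 2D annotation can propagate into 3D, the sketch below projects reconstructed 3D points through a pinhole camera and keeps those that land inside a labelled 2D mask. The intrinsics and mask are illustrative; a real Gaussian Splatting pipeline would operate on splats rather than bare points.

```python
def propagate_label(points, mask, fx, fy, cx, cy):
    """Assign a 2D segmentation label to 3D points via pinhole projection.

    points: list of (x, y, z) in the camera frame (z forward, metres).
    mask: set of (u, v) pixel coordinates covered by the 2D mask.
    fx, fy, cx, cy: pinhole intrinsics (illustrative values below).
    Returns indices of points whose projection lands inside the mask.
    """
    hits = []
    for i, (x, y, z) in enumerate(points):
        if z <= 0:                       # behind the camera
            continue
        u = round(fx * x / z + cx)
        v = round(fy * y / z + cy)
        if (u, v) in mask:
            hits.append(i)
    return hits

# Toy scene: two points, one projecting into the labelled mask region.
pts = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
mask = {(320, 240)}                      # mask covers the principal point
print(propagate_label(pts, mask, fx=500, fy=500, cx=320, cy=240))  # → [0]
```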
Simulation Conversion
Automated USD export with physics properties for NVIDIA Isaac Sim
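Schematically, a physics-annotated export might contain a prim like the following .usda fragment. The prim name and mass value are illustrative; `physics:mass` and `physics:rigidBodyEnabled` come from the standard UsdPhysics schemas that Isaac Sim consumes.

```usda
#usda 1.0
(
    defaultPrim = "GroceryShelf"
)

def Xform "GroceryShelf" (
    prepend apiSchemas = ["PhysicsRigidBodyAPI", "PhysicsMassAPI"]
)
{
    bool physics:rigidBodyEnabled = true
    float physics:mass = 12.5
}
```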
Synthetic Generation
1,000+ frames/hour with domain randomization — all modalities and skill transfer demos preserved
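As a sketch of domain randomization, the function below draws one randomized variant of a captured scene. Parameter names and ranges are illustrative stand-ins, not DreamVu's actual randomization space.

```python
import random

def sample_scene_params(rng):
    """Draw one domain-randomized variant of a captured scene.

    The factors below (lighting, camera jitter, textures, clutter)
    are the kinds of things a synthetic-generation pass might vary.
    """
    return {
        "light_intensity": rng.uniform(200.0, 1500.0),     # lux
        "light_temperature": rng.uniform(2700.0, 6500.0),  # kelvin
        "camera_jitter_deg": rng.uniform(-2.0, 2.0),
        "floor_texture": rng.choice(["tile", "vinyl", "concrete"]),
        "shelf_fill_ratio": rng.uniform(0.3, 1.0),
    }

rng = random.Random(0)   # seeded for reproducibility
variants = [sample_scene_params(rng) for _ in range(3)]
print(len(variants), sorted(variants[0]))
```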
Real-World Validation
Continuous verification: sim-to-real transfer rates, manipulation success, and skill transfer effectiveness
Vision Data
- Synchronized ego + 360° exo video with depth
- Object segmentation with instance IDs
- 6DOF object poses
- Manipulation affordances (grip types, approach vectors)
- Human and robot demonstrations in 360° view
Language Data
- QA pairs describing objects, actions, and scene elements
- Action summaries for every sequence
- Spatial relations between objects and actors
- Context descriptions for scene understanding
- Ready for VLA instruction following
Action Data
- Full trajectories for every actor in 360° scene
- Movement paths with timestamps
- Interaction sequences showing manipulation
- Demonstration labels for skill transfer
- Kinematic data where available
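The three layers above can be pictured as one synchronized record per frame. The dataclass below is a schematic with illustrative field names, not DreamVu's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class MultimodalFrame:
    """One synchronized sample spanning the three annotation layers."""
    timestamp: float                    # seconds since episode start
    # Vision layer
    ego_rgb: str = ""                   # path to egocentric frame
    exo_rgb: str = ""                   # path to 360° exocentric frame
    depth: str = ""                     # path to depth map
    object_poses: dict = field(default_factory=dict)  # id -> 6DOF pose
    # Language layer
    action_summary: str = ""            # e.g. "staff member restocks shelf"
    qa_pairs: list = field(default_factory=list)      # (question, answer)
    # Action layer
    trajectories: dict = field(default_factory=dict)  # actor id -> waypoints

frame = MultimodalFrame(
    timestamp=12.4,
    action_summary="customer places item in cart",
    trajectories={"customer_01": [(0.0, 1.2, 0.0), (0.1, 1.2, 0.3)]},
)
print(frame.timestamp, frame.action_summary)
```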
Alia Specs
Purpose-Built for Physical AI Data
Developed from breakthrough research at IIIT Hyderabad (published at CVPR 2016) and refined over 8 years of production deployment. The synchronized dual-stream system — Alia 360° omnidirectional plus GoPro egocentric — captures the complete spatial context that humanoid robots need. This proprietary technology creates a defensible moat: we capture data that no other company can replicate.
Complete Scene Coverage
Traditional cameras: 60–90° FOV. DreamVu: 360° — captures everything simultaneously, no blind spots, no repositioning.
Skill Transfer at Scale
When multiple humans and robots demonstrate tasks throughout an environment, one Alia captures all demonstrations happening anywhere in the space — no repositioning required.
3D Gaussian Splatting
The 360° coverage provides ideal input for photorealistic 3D reconstruction. All multimodal annotations propagate automatically from 2D frames to the 3D scene.
32+ Patents
Protected omnidirectional 3D vision technology with 8+ years of production deployment. A defensible competitive advantage that ensures unique data capture capabilities.
5 Papers Proving the Approach
Our publications demonstrate that DreamVu's dual-stream 360° datasets produce measurable improvements to leading robot foundation models — with results scaling predictably from 200 to 500+ hours.
Cosmos-Reason2 Improvements
GR00T Manipulation Improvements
Cosmos Extended Results
GR00T Extended Results
Real-Sim-Real Transfer
Grocery Retail
Why Grocery?
A grocery store contains more distinct manipulation tasks per square foot than almost any other environment — making it the ideal proving ground for Physical AI.
Unmatched Skill Density
Picking, placing, stacking, scanning, bagging, mopping, organizing — 500+ distinct skills captured across customer, staff, and logistics operations.
Massive Market Pull
Autonomous restocking and checkout are among the highest-demand use cases for humanoid robots, targeting the $22B retail machine-vision market.
Transferable Complexity
If a VLA model can handle a cluttered grocery aisle with customers, carts, and staff in motion, it transfers to warehouses, fulfillment centers, and retail at large.
Open Teaser on Hugging Face
A curated 20–30 hour subset available in LeRobot format — try before you buy, benchmark against your existing training data.
🤗 Hugging Face & AGIBOT Challenge
A curated sample dataset in LeRobot v3.0 format — compatible with the AGIBOT World Challenge 2026 pipeline. CC BY 4.0 licensed. Integrate into your training run in five lines of code.
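For a sense of what those few lines could look like: the commented snippet assumes the `lerobot` package is installed (the exact module path varies by version) and uses a placeholder repo id, while the runnable stand-in below mimics LeRobot's flat `observation.*` / `action` frame-key convention without downloading anything.

```python
# Real usage would be roughly (hypothetical repo id, lerobot installed):
#   from lerobot.datasets.lerobot_dataset import LeRobotDataset
#   dataset = LeRobotDataset("dreamvu/grocery-teaser")
#   for frame in dataset:
#       model.step(frame["observation.images.exo"], frame["action"])

# Stand-in frames mimicking LeRobot's flat key layout:
def fake_dataset(n):
    for k in range(n):
        yield {
            "observation.images.exo": f"exo_frame_{k}.png",  # 360° view
            "observation.images.ego": f"ego_frame_{k}.png",  # wearer view
            "action": [0.0, 0.1 * k],                        # placeholder
            "timestamp": k / 30,
        }

total = sum(1 for frame in fake_dataset(90) if frame["timestamp"] < 2.0)
print(total)  # frames in the first two seconds at 30 fps → 60
```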
🟢 NVIDIA Cosmos & GR00T
Sample datasets available in Isaac Sim-native USD format. Fine-tune Cosmos world models or GR00T manipulation models with DreamVu's enriched 360° spatial data — benchmark against your existing training pipeline.
Catalog License
- Full 1,000-hour datasets
- All annotation layers included
- Isaac Sim + LeRobot + Open X formats
- Annual license, non-exclusive
- Exclusive option available at premium
Custom Capture
- On-site dual-stream 360° capture
- Full annotation pipeline included
- Typical: 1,000 hrs across 3–10 sites
- Minimum engagement: 500 hours
- Travel & accommodation at cost
Developer Kit
- $10,000 Starter Kit (Alia camera + training)
- $200/hr processing service
- Full annotation pipeline output
- LeRobot + Isaac Sim + Open X delivery
- Ideal for research labs & universities
Isaac Sim-native USD scenes, GR00T training pipeline integration, Isaac Lab compatibility, and Omniverse support
Open teaser dataset in LeRobot format — discoverable by the global research community
Full Open X-Embodiment compatibility — seamless integration with existing VLA training pipelines
Physical AI Infrastructure
Sashi Reddi
Managing Partner at SRI Capital. Founder & former CEO of AppLabs (acquired by CSC). PhD Wharton, MS NYU, BTech IIT Delhi.
Rajat Aggarwal
BTech & Master's in CSE with a specialization in Computational Photography from IIIT Hyderabad. His CVPR'16 paper on computational cameras became the seed for DreamVu.
Dr. Anoop Namboodiri
Professor at IIIT Hyderabad. 75+ published papers. Built systems currently deployed at massive scale.
Parikshit Sakurikar
PhD in Computational Photography from IIIT Hyderabad. Eight years focused on machine learning and high-performance computing for computer vision.
Help Us Capture the Real World
Become a DreamVu Data Partner and earn while contributing to the largest Physical AI dataset ever created. Purchase an Alia Starter Kit, capture footage in your environment, and earn $50/hour for every accepted hour of video.
Purchase Kit
Acquire your Alia Starter Kit with full training and support materials
Complete Onboarding
Multi-session training on capture methodology and quality standards
Get Approved
Pass quality verification on your first 100 hours of capture
Capture & Earn
Scale to 500+ hours annually and earn recurring revenue
Frequently Asked Questions
What is DreamVu's omnidirectional capture platform?
DreamVu's platform uses a proprietary dual-stream capture system combining the Alia 360° omnidirectional camera for full scene context with a GoPro egocentric camera for hand and object interaction detail. This provides the rich, multi-perspective 3D data that world models and humanoid robots need for training.
How much training data does DreamVu provide?
DreamVu is building the largest dual-stream 3D manipulation dataset, targeting 16,000+ hours by December 2026 covering 500+ distinct skills across diverse real-world environments including kitchens, warehouses, retail spaces, and manufacturing floors.
What AI frameworks is DreamVu data compatible with?
DreamVu data is compatible with NVIDIA Cosmos and GR00T for world model and humanoid robot training, and is formatted for the AGIBOT World Challenge using LeRobot v3.0 format with CC BY 4.0 licensing. Sample datasets are available on Hugging Face and NVIDIA platforms.
How can I become a DreamVu Data Partner?
DreamVu's Data Partner Program provides a $10,000 Starter Kit including an Alia 360° camera, GoPro, and calibration tools. Partners earn $50 per accepted hour of captured data. Apply through the website to join the growing network targeting 25 partners by end of 2027.
What has DreamVu's research demonstrated?
DreamVu's first two research papers, slated for publication in March 2026, demonstrate approximately 20% improvements in both NVIDIA Cosmos world-model performance and NVIDIA GR00T humanoid-robot training when using DreamVu's dual-stream omnidirectional dataset compared to standard training data.