Human Archive Is Capturing Human Motion to Power the Robots of Tomorrow

Before a robot can clean a kitchen, fold laundry, or assemble a product on a factory floor, it needs to watch a human do it first. Millions of times.

This is the fundamental data problem at the heart of physical AI. Language models can be trained on text scraped from the internet. Vision models can learn from billions of public images. But robotic manipulation data, the kind that teaches a machine how human hands grip, lift, pour, and assemble, cannot be scraped from anywhere. It has to be captured in the real world, from real people doing real work.

That is the business Human Archive is building. And it is building it in India.

What Human Archive Actually Does

Founded by four twenty-year-olds, Rushil Agarwal, Raj Patel, Samay Maini, and Shloke Patel, all UC Berkeley and Stanford dropouts, Human Archive is a Y Combinator-backed startup with offices in San Francisco and Bengaluru. Its mission is to build the largest human sensorimotor dataset ever assembled, and sell it to the frontier labs and robotics companies racing to build physical AI systems.

The company recently raised $8.2 million in a round led by Wing Venture Capital, an early Snowflake backer, and NVP Capital. The cap table includes angel investors from OpenAI, NVIDIA, Google, Meta, Anduril, DoorDash AI Research, and several other frontier technology organisations.

The data collection process works like this. Workers are supplied with hardware rigs that include downward-facing cameras recording 4K video at 30 frames per second, depth-sensing cameras, a wide-angle lens, tactile gloves, wrist-mounted cameras, and arm and chest-mounted inertial measurement units. The rig is designed to capture precisely how human hands perform specific tasks: the grip, the pressure, the motion, the sequence.

Once recorded, the footage is combined with motion-capture and tactile force-feedback streams, then run through proprietary quality assurance, hand-tracking, and reconstruction models before being packaged as training data for robotics companies.

Human Archive has so far collected tens of thousands of hours of data. It wants millions.
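
For a rough sense of that gap, the sketch below counts frames from the 4K camera alone at the stated 30 frames per second. The hour figures are illustrative assumptions, not numbers reported by the company: 10,000 hours stands in for the low end of "tens of thousands" and 1,000,000 hours for the low end of "millions". The function name is hypothetical.

```python
FPS = 30  # frame rate stated for the 4K camera; other sensor rates are not given


def frames_recorded(hours: float, fps: int = FPS) -> int:
    """Number of video frames in `hours` of continuous recording at `fps`."""
    return int(hours * 3600 * fps)


# Assumed lower bounds, chosen only for illustration.
current = frames_recorded(10_000)
target = frames_recorded(1_000_000)

print(f"{current:,} frames")  # 1,080,000,000 frames
print(f"{target:,} frames")   # 108,000,000,000 frames
```

Even at these conservative assumptions, the single video stream grows from about a billion frames to over a hundred billion, before counting depth, tactile, wrist-camera, or inertial data.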

India as the Data Engine

Of the more than 125 companies Human Archive has partnered with, a significant portion are based in India. The company has signed agreements across hotels, restaurants, quick commerce platforms, construction sites, and factories, though many of these partnerships have not yet been activated.

The India focus is not accidental. India's large, organised blue-collar workforce, working across home services, logistics, manufacturing, and hospitality, represents an enormous and largely untapped source of the kind of task-specific physical data that physical AI systems need to learn from.

The company's work came into public focus this week following a controversy around home services startups using worker tracking data to train AI models. Gurugram-based home cleaning service Pronto was found to have run a pilot using such data. Rival Snabbit was found to have conducted a similar test in a controlled environment in partnership with Human Archive, though Snabbit has stated it has not deployed this in real customer scenarios.

"Understanding something and deploying it in our customers' homes are two very different things," Snabbit said in its statement.

The Ministry of Electronics and Information Technology is reportedly taking notice of the controversy, which could prompt greater scrutiny of how startups collect and use data from workers in customer environments.

The Ethical Fault Line

The controversy around Human Archive sits at the intersection of two uncomfortable realities.

The first is consent. Human Archive's founders maintain that anyone engaged directly by the company is fully informed about what is being captured and why. But when data collection runs through a partner business, the responsibility for informing workers shifts to the partner.

When asked whether workers understand that they could potentially be training themselves out of a job, Agarwal said the company is transparent with partners about the intent and purpose of data usage, but that it is the partner companies' responsibility to educate their employees.

The second reality is structural. Blue-collar workers in India earn average monthly salaries of Rs 15,000 to Rs 35,000. They form the backbone of the industries that physical AI systems are being designed to automate. The data being collected from their hands and movements is being used to build systems that could, over time, reduce demand for their labour.

This dynamic has drawn comparisons to the content-moderation sweatshops of the late 2010s: Global South labour feeding a model development ecosystem that is almost entirely Western in ownership and benefit.

Physical AI data collection has begun attracting similar criticism, and it is unlikely to fade as the industry scales.

The Race to Own Physical AI Data

Human Archive is not operating in a vacuum. Data collection has quietly become its own category inside the physical AI race.

Scale AI, in which Meta owns a 49% stake, now runs a dedicated Data Engine for Physical AI and has completed over 100,000 production hours at its San Francisco prototyping facility. Build AI sells an egocentric dataset of first-person video aimed specifically at industrial robots and embodied AI systems.

This week, entrepreneur Abhinav Kukreja launched Neocambrian AI, describing it as the data foundation of physical AI, built around high-fidelity, pre-training-scale data of human action from India. The pitch is nearly identical to Human Archive's.

Given the depth of India's blue-collar workforce and the scale of data that physical AI systems will require, this is almost certainly not the last such company to emerge from the Indian market.

Vyapaar वाणी Takeaway: India's Next Export May Not Be Software. It May Be Human Motion.

For decades, India's technology economy has been built on exporting software talent and services. The physical AI era is opening a new and more complicated chapter.

India's blue-collar workforce is large, skilled in physical tasks, and increasingly seen as the training ground for the robots and autonomous systems that will define the next wave of automation. The data being collected from these workers' hands today could shape global robotics development for the next decade.

The opportunity is real. So is the responsibility.

For founders, investors, and policymakers watching this space, Human Archive's story raises questions that go well beyond business models and funding rounds. Who owns the data collected from a worker's hands? Who benefits from it? And what obligations do the companies capturing it have toward the people whose labour makes it possible?

These questions do not have easy answers. But as physical AI scales, they will become impossible to avoid.

Stay tuned for more stories on India's most ambitious builders in Vyapaar वाणी!
