For physical AI to match the accomplishments of LLMs, a data problem needs to be solved.
AI labs are increasingly outsourcing the laborious task of collecting real-world data for training physical AI systems to specialized companies like XDOF. This addresses a critical bottleneck in robotics development, where the sheer volume and variety of physical interactions required for robust AI far exceed what can be simulated or collected by internal teams.
This shift matters because it affects the pace and scale at which embodied AI can advance beyond current laboratory demonstrations. Companies like Boston Dynamics and Figure AI, striving to create more adaptable robots, face significant data acquisition hurdles. By externalizing this "dirty work," robotics developers could accelerate the development of robots capable of performing complex tasks in unstructured environments, moving closer to the widespread deployment seen with LLMs.
It remains to be seen whether this outsourcing model can maintain data quality and preserve each lab's proprietary advantage. The key question is whether specialized data collection firms can consistently provide the nuanced, context-rich datasets needed to bridge the gap between simulated and real-world robotic performance, or whether in-house data generation will remain the gold standard for leading robotics labs.