Last year, Hugging Face, the AI dev platform, launched LeRobot, a collection of open AI models, data sets, and tools to help build real-world robotics systems. On Tuesday, Hugging Face teamed up with AI startup Yaak to expand LeRobot with a training set for robots and cars that can autonomously navigate environments such as city streets.
The new set, called Learning to Drive (L2D), is over a petabyte in size and contains data from sensors installed on cars in German driving schools. L2D captures camera, GPS, and “vehicle dynamics” data from driving instructors and students navigating streets with construction zones, intersections, highways, and more.
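For a feel of what working with a set that size involves, here is a rough sketch of streaming a few records with Hugging Face’s datasets library rather than downloading the whole thing. The repo ID and field layout below are assumptions for illustration, not the published L2D schema.

```python
# Hypothetical sketch: sample a petabyte-scale dataset without downloading it,
# using the Hugging Face `datasets` streaming mode.
from datasets import load_dataset

# NOTE: the repo id "yaak-ai/L2D" is an assumption for illustration.
stream = load_dataset("yaak-ai/L2D", split="train", streaming=True)

for episode in stream.take(3):
    # Each record is expected to pair camera frames with GPS and
    # vehicle-dynamics signals recorded during driving-school sessions.
    print(episode.keys())
```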
There are a number of open self-driving training sets out there from companies including Alphabet’s Waymo and Comma AI. But according to L2D’s creators, many of these focus on perception tasks like object detection and tracking, which require high-quality annotations, making them difficult to scale.

In contrast, L2D is designed to support the development of “end-to-end” learning, which its creators say predicts actions (e.g. braking when a pedestrian might cross the street) directly from sensor inputs (e.g. camera footage).
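To make that distinction concrete, here is a minimal, illustrative PyTorch sketch of an end-to-end policy: a single network that maps a raw camera frame straight to steering and acceleration commands, with no hand-labeled intermediate stage. Every module and variable name is hypothetical; this is not drawn from LeRobot or L2D.

```python
# Illustrative sketch of end-to-end learning (not the L2D/LeRobot API):
# raw sensor input in, control commands out.
import torch
import torch.nn as nn

class TinyDrivingPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        # Convolutional encoder over a single front-camera frame.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Head regresses two continuous controls: steering and acceleration.
        self.head = nn.Linear(32, 2)

    def forward(self, frame: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(frame))

# One hypothetical training step: camera frames in, recorded controls out.
policy = TinyDrivingPolicy()
frames = torch.rand(8, 3, 128, 128)   # batch of RGB camera frames
expert_controls = torch.rand(8, 2)    # steering/acceleration from the instructor
loss = nn.functional.mse_loss(policy(frames), expert_controls)
loss.backward()
```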
“The AI community can now build end-to-end self-driving models,” Yaak co-founder Harsimrat Sandhawalia and Remi Cadene, a member of the AI for robotics team at Hugging Face, wrote in a blog post. “L2D aims to be the largest open-source self-driving data set that empowers the AI community with unique and diverse ‘episodes’ for training end-to-end spatial intelligence.”
Hugging Face and Yaak plan to conduct real-world “closed-loop” testing of models trained using L2D and LeRobot this summer, deployed on a vehicle with a safety driver. The companies are calling on the AI community to submit models and tasks they’d like the models to be evaluated on, like navigating roundabouts and parking spaces.