Sim-and-Real Co-Training: A Simple Recipe for Vision-Based Robotic Manipulation

  • 2025-03-31 17:39:38
  • Abhiram Maddukuri, Zhenyu Jiang, Lawrence Yunliang Chen, Soroush Nasiriany, Yuqi Xie, Yu Fang, Wenqi Huang, Zu Wang, Zhenjia Xu, Nikita Chernyadev, Scott Reed, Ken Goldberg, Ajay Mandlekar, Linxi Fan, Yuke Zhu

Abstract

Large real-world robot datasets hold great potential to train generalist robot models, but scaling real-world human data collection is time-consuming and resource-intensive. Simulation has great potential in supplementing large-scale data, especially with recent advances in generative AI and automated data generation tools that enable scalable creation of robot behavior datasets. However, training a policy solely in simulation and transferring it to the real world often demands substantial human effort to bridge the reality gap. A compelling alternative is to co-train the policy on a mixture of simulation and real-world datasets. Preliminary studies have recently shown this strategy to substantially improve the performance of a policy over one trained on a limited amount of real-world data. Nonetheless, the community lacks a systematic understanding of sim-and-real co-training and what it takes to reap the benefits of simulation data for real-robot learning. This work presents a simple yet effective recipe for utilizing simulation data to solve vision-based robotic manipulation tasks. We derive this recipe from comprehensive experiments that validate the co-training strategy on various simulation and real-world datasets. Using two domains--a robot arm and a humanoid--across diverse tasks, we demonstrate that simulation data can enhance real-world task performance by an average of 38%, even with notable differences between the simulation and real-world data. Videos and additional results can be found at https://co-training.github.io/