How to make a good VLA? Multiple organizations propose a new model, RoboVLMs, to unlock VLAs

RoboVLMs perform strongly in both simulation and real-robot experiments. The approach adds an action prediction module to a vision-language model (VLM) to turn it into a vision-language-action (VLA) model. It uses a continuous action space, multi-step history information, and a dedicated module for organizing that history to improve performance and generalization. It also introduces cross-embodiment data in the pre-training phase, which significantly enhances the model's robustness and its performance on few-shot tasks.
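The design described above (a prediction head on top of a vision-language model, continuous outputs, and a window of multi-step history) can be illustrated with a toy sketch. This is not the RoboVLMs implementation: the class name `HistoryActionHead`, the 7-dimensional action, the window of 4 steps, mean pooling as the history module, and the random weights are all assumptions made for illustration, and the feature vectors stand in for the output of a real vision-language backbone.

```python
import math
import random
from collections import deque


class HistoryActionHead:
    """Toy head (hypothetical): pools recent features and regresses a continuous action."""

    def __init__(self, feature_dim, action_dim=7, window=4, seed=0):
        rng = random.Random(seed)
        # Keeps only the most recent `window` steps of features.
        self.history = deque(maxlen=window)
        # Random linear weights stand in for trained parameters.
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(feature_dim)]
            for _ in range(action_dim)
        ]
        self.bias = [0.0] * action_dim

    def step(self, features):
        """Record one step of features and return a continuous action in [-1, 1]."""
        self.history.append(list(features))
        n = len(self.history)
        # Mean pooling over the window: a placeholder for a learned history module.
        pooled = [sum(column) / n for column in zip(*self.history)]
        # Linear projection followed by tanh gives a bounded continuous action,
        # as opposed to picking one of a fixed set of discrete action tokens.
        return [
            math.tanh(sum(w * x for w, x in zip(row, pooled)) + b)
            for row, b in zip(self.weights, self.bias)
        ]


head = HistoryActionHead(feature_dim=8)
for t in range(6):
    # Stand-in for the features a vision-language model would emit at step t.
    fake_features = [math.sin(t + i) for i in range(8)]
    action = head.step(fake_features)

print(len(action), len(head.history))  # prints: 7 4
```

After six steps the head still holds only the four most recent feature vectors, so each predicted action depends on a bounded window of history rather than on the current observation alone.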
