Google DeepMind has unveiled RT-2, a vision-language-action (VLA) model trained on Internet-scale data that can be integrated into end-to-end robotic control. The model enables robots to draw on knowledge learned from web data, generalise it to new tasks, carry those tasks out, and even engage in rudimentary reasoning. This is one of the first times that large language models (LLMs) have been directly linked to the physical world with minimal human intervention, opening the possibility of robots becoming more adaptable and independent.
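The key idea behind a VLA model like RT-2 is that robot actions are represented as discrete tokens, just like words, so the same transformer that processes web-scale image and text data can also emit motor commands. Below is a minimal, hypothetical Python sketch of that de-tokenisation step. RT-2 is described as discretising each action dimension into 256 bins over an 8-dimensional action (a termination flag, end-effector translation and rotation deltas, and gripper extension); the `RobotAction` class, the `detokenize` function, and the specific value ranges here are illustrative assumptions, not DeepMind's actual API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RobotAction:
    terminate: bool          # episode-termination flag
    delta_xyz: List[float]   # end-effector translation deltas (illustrative units)
    delta_rpy: List[float]   # end-effector rotation deltas (illustrative units)
    gripper: float           # gripper extension in [0, 1]


def detokenize(tokens: List[int]) -> RobotAction:
    """Map 8 discrete action tokens (each in 0..255) back to a
    continuous robot command, assuming uniform binning over a
    fixed, illustrative range per dimension."""
    def to_range(tok: int, lo: float, hi: float) -> float:
        return lo + (tok / 255.0) * (hi - lo)

    return RobotAction(
        terminate=bool(tokens[0]),
        delta_xyz=[to_range(t, -0.1, 0.1) for t in tokens[1:4]],
        delta_rpy=[to_range(t, -0.5, 0.5) for t in tokens[4:7]],
        gripper=to_range(tokens[7], 0.0, 1.0),
    )


# Example: 8 action tokens as a VLA model might emit them for one
# control step, decoded into a continuous command.
action = detokenize([0, 128, 200, 100, 128, 128, 128, 255])
print(action)
```

Because actions share the model's output vocabulary, no separate control head is needed: the same generation machinery that answers a visual question can, in this scheme, produce the next motor command.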