The new Gemini Robotics 1.5 models will allow robots to "google" information on the Internet
Google DeepMind has unveiled updated artificial intelligence models Gemini Robotics 1.5 and Gemini Robotics‑ER 1.5, which allow robots to perform complex multi-step tasks and even access the Internet to search for information. According to DeepMind's head of robotics, Carolina Parada, the new system allows machines to "think several steps ahead" before acting.
While previous versions could only follow simple single-step instructions, such as folding a sheet of paper or unzipping a bag, the new models let robots sort laundry by color, pack a suitcase based on the weather forecast for a destination city, or separate trash into recycling, compost, and waste according to local rules found online.
Gemini Robotics‑ER 1.5 acts as the high-level planner: it analyzes the robot's surroundings and uses digital tools, including Google Search, to formulate a plan of action. It then passes those instructions to the Gemini Robotics 1.5 model, which uses vision and language understanding to carry out each step.
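In practice, this split looks like a plan-then-act loop. The sketch below is a minimal illustration of that pattern, assuming the google-genai Python SDK and the publicly announced gemini-robotics-er-1.5-preview model ID; the execute_on_robot stub is purely hypothetical, since the Gemini Robotics 1.5 action model itself is not publicly accessible.

```python
# Minimal plan-then-act sketch. Assumes the google-genai SDK
# (`pip install google-genai`) and the announced preview model ID.
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment


def plan_task(task: str) -> list[str]:
    """Ask the ER (embodied reasoning) model for an ordered plan,
    letting it ground steps with Google Search where needed."""
    response = client.models.generate_content(
        model="gemini-robotics-er-1.5-preview",
        contents=f"Break this task into short, ordered robot instructions: {task}",
        config=types.GenerateContentConfig(
            tools=[types.Tool(google_search=types.GoogleSearch())],
        ),
    )
    # One instruction per line; a real application would request structured output.
    return [line.strip("- ") for line in response.text.splitlines() if line.strip()]


def execute_on_robot(step: str) -> None:
    """Hypothetical stand-in for the Gemini Robotics 1.5 action model,
    which turns each instruction into motor commands on real hardware."""
    print(f"[robot] executing: {step}")


for step in plan_task(
    "Sort this trash into recycling, compost, and waste "
    "according to the local rules in Mountain View, California"
):
    execute_on_robot(step)
```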
Another innovation is cross-robot learning: DeepMind has shown that skills learned by the two-armed ALOHA 2 robot also work on the bi-arm Franka robot and Apptronik's Apollo humanoid without additional setup. In other words, the same set of models can control different types of robots and transfer experience between them.
The update is already available to developers: Gemini Robotics‑ER 1.5 can be tested via the Gemini API in Google AI Studio, while Gemini Robotics 1.5 is currently only available to select partners.
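To give a sense of what testing it looks like, here is a hedged sketch of querying the ER model with an image through the Gemini API. It again assumes the google-genai SDK and the preview model ID; the point-list prompt reflects the spatial-reasoning output the ER model is documented to support, but the exact JSON schema and the filename (workbench.jpg) are illustrative assumptions.

```python
# Sketch of a spatial-reasoning query to Gemini Robotics-ER 1.5
# via the Gemini API; the output schema here is an assumption.
from google import genai
from google.genai import types

client = genai.Client()  # uses the GEMINI_API_KEY environment variable

with open("workbench.jpg", "rb") as f:  # any local photo of a scene
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
        "Point to every graspable object. Answer as JSON: "
        '[{"point": [y, x], "label": "<name>"}], '
        "with coordinates normalized to 0-1000.",
    ],
)
print(response.text)  # e.g. [{"point": [388, 512], "label": "screwdriver"}, ...]
```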