Google is training its robots with Gemini AI so they can get better at navigation and completing tasks. The DeepMind robotics team explained in a new research paper how using Gemini 1.5 Pro’s long context window, which dictates how much information an AI model can process, allows users to more easily interact with its RT-2 robots using natural language instructions.
This works by filming a video tour of a designated area, such as a home or office space; researchers then use Gemini 1.5 Pro to make the robot “watch” the video and learn about the environment. The robot can then undertake commands based on what it has observed using verbal and/or image outputs, such as guiding users to a power outlet after being shown a phone and asked…