
Google DeepMind releases Gemini Robotics On-Device Model

Google DeepMind has released Gemini Robotics On-Device, a new variant of its vision-language-action model that runs entirely on local robotic hardware. The launch builds on the original Gemini Robotics announcement from March 2025 and focuses on making robots more responsive and adaptable by removing their dependence on network connectivity. Designed to run on-device with high efficiency, the new model offers strong general-purpose manipulation skills and can adapt to new instructions and environments with minimal additional training.


Efficiency and Real-World Autonomy

Unlike many robotics systems that rely heavily on cloud-based inference, Gemini Robotics On-Device is optimized to run directly on the robot itself. This shift significantly reduces latency and makes the model better suited to environments where internet access is limited or unavailable. Operating independently of a central server also allows for greater robustness in real-world deployments, including homes, warehouses, and remote industrial sites, and opens new possibilities for robotics in edge-computing settings where real-time decision-making is critical.
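To make the latency point concrete, here is a minimal sketch of a closed-loop control cycle in which every inference step runs locally on the robot. The policy and robot objects and all of their methods are hypothetical stand-ins for illustration, not the actual Gemini Robotics interface.

import time

CONTROL_HZ = 30  # hypothetical control rate; real rates depend on the robot

def control_loop(policy, robot):
    """Run closed-loop control with all inference on-device.

    policy and robot are hypothetical stand-ins: policy wraps a locally
    loaded vision-language-action model, robot exposes sensors and
    actuators. Note there is no network round-trip inside the loop.
    """
    period = 1.0 / CONTROL_HZ
    while not robot.task_done():
        start = time.monotonic()
        observation = robot.get_camera_frames()  # read local sensors
        action = policy.act(observation)         # on-device inference
        robot.apply_action(action)               # actuate immediately
        # Sleep off the rest of the control period to hold a steady rate.
        time.sleep(max(0.0, period - (time.monotonic() - start)))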

Capabilities and Performance in Dexterous Tasks

The model is designed for bi-arm robots and has been evaluated on a broad set of challenging tasks involving visual reasoning, language instruction, and physical manipulation. It successfully performs complex, high-precision actions such as folding garments, unzipping bags, and assembling components, all tasks that require nuanced, coordinated control. Gemini Robotics On-Device not only generalizes across tasks but also maintains performance on previously unseen scenarios and multi-step instructions. Benchmarks show it outperforms previous best-in-class on-device models and, in some cases, approaches the capabilities of the larger, cloud-based Gemini Robotics system.

Fine-Tuning and Adaptation to New Robots

One of the model's key innovations is its adaptability. Developers can fine-tune Gemini Robotics On-Device with as few as 50 to 100 task demonstrations, accelerating customization for specific environments or new robot embodiments. While the base model is trained for ALOHA robots, it has been adapted effectively to both the bi-arm Franka FR3 platform and Apptronik's Apollo humanoid robot. This flexibility demonstrates strong generalization across both tasks and hardware embodiments, a step forward in modular AI deployment.
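As a rough illustration of the few-demonstration idea, the sketch below runs a generic behavior-cloning-style fine-tuning loop over a small demonstration set in PyTorch. The data shapes, the adapter head, and the training setup are illustrative assumptions, not the Gemini Robotics SDK's actual fine-tuning interface.

import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-ins: 60 demonstrations, each flattened to a single
# (observation, action) pair. Real demonstrations would be trajectories
# of camera frames, language instructions, and joint-space actions.
observations = torch.randn(60, 512)  # placeholder observation embeddings
actions = torch.randn(60, 14)        # e.g. 7 degrees of freedom per arm

loader = DataLoader(TensorDataset(observations, actions),
                    batch_size=8, shuffle=True)

# A small adapter head standing in for the fine-tunable part of the model.
adapter = torch.nn.Linear(512, 14)
optimizer = torch.optim.Adam(adapter.parameters(), lr=1e-4)

for epoch in range(10):
    for obs, act in loader:
        loss = torch.nn.functional.mse_loss(adapter(obs), act)  # imitation loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

With only tens of examples, practitioners would typically freeze most of a large model and train a small adapter like this one, which is part of why so few demonstrations can be enough.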

Broader Impact and Developer Tools

With the release of the Gemini Robotics SDK, developers can now test and adapt the model using the MuJoCo physics simulator or on real hardware. By supporting rapid prototyping and fine-tuning, the SDK aims to make advanced robotics more accessible to researchers, startups, and developers across domains. The trusted tester program provides early access for evaluation and feedback, signaling a staged release that prioritizes stability and real-world utility.
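For orientation, here is a minimal example of stepping a MuJoCo simulation from Python with a simple feedback policy in the loop. The MJCF scene is a trivial single-joint placeholder rather than a bi-arm robot, and the policy function is a hypothetical stand-in for the adapted model.

import mujoco

# Trivial placeholder scene: one hinge-actuated body. A real evaluation
# would load a bi-arm robot model instead.
XML = """
<mujoco>
  <worldbody>
    <body>
      <joint name="hinge" type="hinge"/>
      <geom type="capsule" size="0.02" fromto="0 0 0 0 0 0.2"/>
    </body>
  </worldbody>
  <actuator>
    <motor joint="hinge"/>
  </actuator>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(XML)
data = mujoco.MjData(model)

def policy(qpos, qvel):
    # Hypothetical stand-in for the fine-tuned model's action output.
    return -1.0 * qpos - 0.1 * qvel  # simple PD-style feedback

for _ in range(1000):
    data.ctrl[:] = policy(data.qpos, data.qvel)  # write actuator commands
    mujoco.mj_step(model, data)                  # advance physics one step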
Gemini Robotics On-Device reflects DeepMind's ongoing push to bring generalist AI systems into physical contexts, bridging digital intelligence and embodied control. The release strengthens the trend toward localized AI models that can operate independently, respond in real time, and adapt flexibly, without relying on a constant connection to the cloud.