Google’s new Gemini Robotics On-Device AI model runs directly on robots: Watch it in action

Google DeepMind has introduced a significant advance in robotics: Gemini Robotics On-Device. This AI model runs entirely on the robot itself, without relying on cloud servers or an internet connection. It brings faster responses, better security and privacy, and more flexibility to the world of robotics.
What Is Gemini Robotics On-Device?
Gemini Robotics On-Device is a lighter version of Google’s original Gemini Robotics AI model. Unlike its cloud-based predecessor, this version works completely offline. It processes language, visual data, and motor commands on the robot in real time.
Despite the smaller footprint, it performs almost as well as the full-scale version. Robots using this model can understand voice commands, interpret surroundings, and carry out complex tasks—on the spot.
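To picture how that works, here is a minimal sketch of an on-device vision-language-action control loop, written in Python. Every name in it (OnDeviceController, predict, apply, and so on) is an illustrative stand-in rather than the actual Gemini Robotics interface; the point is the shape of the pipeline, with perception, reasoning, and motor output all happening locally.

```python
import time

class OnDeviceController:
    """Illustrative on-device control loop: camera frames plus a language
    instruction go into a locally loaded model, motor commands come out.
    All components are hypothetical stand-ins, not the Gemini Robotics API."""

    def __init__(self, model, camera, arm, hz=10):
        self.model = model      # locally loaded VLA checkpoint; no network calls
        self.camera = camera    # source of visual observations
        self.arm = arm          # motor interface
        self.period = 1.0 / hz  # seconds per control cycle

    def run(self, instruction: str):
        while not self.model.task_done():
            start = time.monotonic()
            frame = self.camera.read()                       # observe
            action = self.model.predict(frame, instruction)  # decide, on-device
            self.arm.apply(action)                           # act
            # Sleep out the rest of the cycle to hold a steady control rate.
            time.sleep(max(0.0, self.period - (time.monotonic() - start)))
```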
Advanced Skills With Fewer Instructions
This AI model lets robots carry out dexterous tasks with precision. It can unzip a bag, fold a shirt, or assemble items on a conveyor belt. These aren't pre-programmed movements; the robot works out what needs to be done from language and visual cues.
Even more impressive is how little training it needs. Gemini Robotics On-Device can learn new tasks from just 50 to 100 demonstrations. That's a fraction of the data older systems required, which often took hours or days of instruction to pick up a single task.
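For readers wondering what learning from demonstrations usually involves, the sketch below shows behavioral cloning, the standard recipe for this kind of few-shot adaptation: the policy is trained to reproduce the expert's action at each recorded step. This is a generic PyTorch illustration under assumed names, not DeepMind's published training setup.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader

def fine_tune(policy, demos, epochs=20, lr=1e-4):
    """policy: a model mapping (observation, instruction) to an action.
    demos: dataset of (observation, instruction, expert_action) tuples,
    i.e. the 50 to 100 recorded demonstrations."""
    loader = DataLoader(demos, batch_size=8, shuffle=True)
    opt = torch.optim.AdamW(policy.parameters(), lr=lr)
    for _ in range(epochs):
        for obs, text, expert_action in loader:
            pred = policy(obs, text)
            # Imitation loss: match the demonstrator's action on each step.
            loss = F.mse_loss(pred, expert_action)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return policy
```

In this framing, "50 to 100 demonstrations" simply means 50 to 100 recorded trajectories in the training dataset.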
Originally trained on the ALOHA robotic platform, the model has proven its flexibility. It now runs on different machines, such as Apptronik’s Apollo humanoid and the Franka Emika dual-arm FR3 system. This cross-compatibility shows how easily the model adapts to new hardware.
Why On-Device AI Is a Big Deal
Running AI locally brings several clear benefits. First, it reduces latency. There’s no need to wait for the cloud to process commands. The robot responds instantly, which is essential in time-sensitive situations like surgery or manufacturing.
Second, it improves security and privacy. All data stays on the robot. Sensitive environments—like hospitals, homes, or government facilities—can benefit from this kind of local processing. There’s less risk of data leaks or unauthorized access.
Third, the model works without internet access. This makes it perfect for use in remote areas, disaster zones, or even space missions.
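The latency benefit in particular lends itself to a quick back-of-the-envelope calculation. The numbers below are illustrative assumptions, not measured figures for Gemini Robotics, but they capture the structure of the problem: a cloud round trip adds network time that an on-device model never pays.

```python
# Illustrative latency budget (assumed numbers, not benchmarks).
cloud_network_ms = 80   # assumed round trip to a remote server
inference_ms = 30       # assumed model inference time, same in both cases

cloud_loop_ms = cloud_network_ms + inference_ms  # 110 ms per decision
local_loop_ms = inference_ms                     # 30 ms per decision

print(f"cloud: ~{1000 / cloud_loop_ms:.0f} decisions/sec, "
      f"on-device: ~{1000 / local_loop_ms:.0f} decisions/sec")
```

Under these assumptions the local loop runs more than three times faster, and it keeps running even if the network drops.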
How It Works in the Real World
In a recent demo, Google showed how well the model works. One robot opened a drawer, found a snack packet, and picked it up after hearing a voice command. Another folded laundry using both arms while reacting to real-time visual input.
These robots don’t just follow scripts. They observe, decide, and act—just like humans do. They make decisions based on what they see and hear, not just what they’re told.
Toward General-Purpose Robotics
The release of this model could reshape how we use robots. In the past, robots were often built for a single task. A factory arm might weld metal. A cleaning bot might vacuum. Each system had one job.
Gemini Robotics On-Device changes that. Robots can now perform many different tasks based on simple language instructions. You could ask one to fold clothes, pick up objects, or help assemble parts—all without reprogramming it for each job.
This could lead to the rise of general-purpose robots. These machines might assist in homes, offices, hospitals, or even construction sites. They could handle a wide range of tasks depending on the situation.
Tools for Developers
Google DeepMind isn’t keeping this model to itself. The company has launched a Gemini Robotics SDK (software development kit). Trusted developers can now fine-tune the model for their own robots and tasks.
This is the first time DeepMind has offered a Vision-Language-Action (VLA) model for outside use. It marks the start of a more open ecosystem where hardware makers, researchers, and startups can build smarter robots using Gemini as a base.
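The SDK's actual interface isn't documented in this article, so the following is a purely hypothetical sketch of what a fine-tuning workflow with such a kit could look like. Every import and function name below is invented for illustration and may not match the real Gemini Robotics SDK.

```python
# Hypothetical sketch only: module, functions, and arguments are invented.
from gemini_robotics_sdk import load_model, Demonstrations  # hypothetical import

def adapt_to_new_task(demo_dir: str, task_name: str):
    model = load_model("gemini-robotics-on-device")  # base VLA checkpoint
    demos = Demonstrations.from_directory(demo_dir)  # ~50-100 teleoperated demos
    tuned = model.fine_tune(demos, task=task_name)   # task-specific adaptation
    tuned.export(f"./checkpoints/{task_name}")       # artifact that runs on-robot
    return tuned
```

Whatever the real API turns out to look like, the workflow it enables is the significant part: supply demonstrations, get back a model that runs on the robot itself.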
Challenges to Consider
While the model is powerful, it isn't perfect. Onboard hardware offers far less computing power and memory than a data center, so complex environments may still cause errors or delays.
There are also ethical concerns. As robots get smarter, they could replace human jobs. Decision-making by machines also raises questions. Who is responsible if a robot makes a mistake? These are issues that developers and policymakers must address.
Looking Ahead
The Gemini Robotics On-Device model brings us closer to robots that think and act more like humans. It runs fast, learns quickly, and doesn’t depend on the cloud. That makes it more practical for real-world use.
With this step, Google is pushing robotics toward a future where machines help with daily life in real, meaningful ways. Robots may soon fold your laundry, carry groceries, or assist in emergencies—all powered by local AI.