Robots - The Next Big Thing?
This technology is becoming much more accessible than one might have expected. At least partially.
Disney Research engineers make no secret of their excitement as they build droids straight out of the Star Wars universe. Now, thanks to Antoine Pirrone, anyone can build a mini-droid for about $400. And while it looks like a fun toy, robots like these stand a real chance of stepping beyond theme parks and into everyday life.
It seems robotics has made the very leap from quantity to quality that was drilled into us in philosophy class. Machines now effortlessly pick up skills that previously had to be hard-coded.
Researchers from Columbia Engineering recently demonstrated this clearly: their robots learn the structure of their own bodies and how they move simply by watching themselves through a camera. They literally look in a mirror and come to know themselves!
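To get a feel for the idea in its simplest form, here is a toy sketch (my own simplification, not Columbia's actual pipeline): a simulated two-link arm "watches" its own fingertip, the way a camera would, and fits a small neural network that predicts where any pair of joint angles will put it, which is a learned self-model in miniature.

```python
# Toy self-modeling sketch (hypothetical, not the Columbia pipeline):
# a simulated 2-link arm observes its own fingertip and learns to predict it.
import numpy as np
import torch
import torch.nn as nn

LINK1, LINK2 = 1.0, 0.7  # link lengths of the simulated arm

def observed_fingertip(angles: np.ndarray) -> np.ndarray:
    """Ground-truth kinematics, playing the role of the camera observation."""
    x = LINK1 * np.cos(angles[:, 0]) + LINK2 * np.cos(angles[:, 0] + angles[:, 1])
    y = LINK1 * np.sin(angles[:, 0]) + LINK2 * np.sin(angles[:, 0] + angles[:, 1])
    return np.stack([x, y], axis=1)

# "Motor babbling": random joint angles plus what the camera saw.
angles = np.random.uniform(-np.pi, np.pi, size=(5000, 2)).astype(np.float32)
targets = observed_fingertip(angles).astype(np.float32)

# A small network becomes the robot's model of its own body.
self_model = nn.Sequential(
    nn.Linear(2, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 2),
)
optimizer = torch.optim.Adam(self_model.parameters(), lr=1e-3)
inputs, labels = torch.from_numpy(angles), torch.from_numpy(targets)

for step in range(2000):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(self_model(inputs), labels)
    loss.backward()
    optimizer.step()

print(f"self-model prediction error (MSE): {loss.item():.4f}")
```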
The company Figure unveiled Helix, a general-purpose Vision-Language-Action (VLA) model for humanoid robots. The system processes camera images and commands in plain human language at the same time, then controls the robot in real time. Thanks to Helix, robots can recognize and work with thousands of common household items. Figure also announced BotQ, a manufacturing plant for its robots that will initially produce up to 12,000 humanoids a year.
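To make the VLA idea more concrete, here is a minimal sketch of what such a control loop looks like in code. Everything in it is a stand-in: Helix itself isn't publicly available, so the policy, camera, and robot interfaces below are placeholders for whatever the real stack provides.

```python
# Minimal sketch of a Vision-Language-Action control loop (all interfaces are
# placeholders; Helix is not publicly available).
import time
import numpy as np

class StubVLAPolicy:
    """Stand-in for a VLA model: (camera frame, language command) -> action."""
    def __init__(self, action_dim: int = 7):
        self.action_dim = action_dim

    def act(self, frame: np.ndarray, instruction: str) -> np.ndarray:
        # A real model would fuse the image and the text here; we return a no-op.
        return np.zeros(self.action_dim)

def read_camera() -> np.ndarray:
    # Placeholder camera driver: a 224x224 RGB frame.
    return np.random.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)

def send_to_robot(action: np.ndarray) -> None:
    # Placeholder hardware interface, e.g. joint-velocity commands.
    pass

policy = StubVLAPolicy()
instruction = "pick up the mug and put it on the shelf"

for _ in range(100):          # roughly 10 seconds of control at 10 Hz
    frame = read_camera()
    action = policy.act(frame, instruction)
    send_to_robot(action)
    time.sleep(0.1)
```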
Google DeepMind recently announced two new models: Gemini Robotics, built on Vision-Language-Action (VLA) technology, and Gemini Robotics-ER, built on Embodied Reasoning. Both run on Google's Gemini platform, a multimodal foundation model that understands text, voice, and images, answers questions, and offers recommendations.
DeepMind described Gemini Robotics as an "advanced vision-language-action system." It takes in the same kinds of input as the base Gemini model but can turn them into commands for a robot's physical actions, and, importantly, it isn't tied to any particular hardware. The project website features impressive demonstrations of the system in action. Interested in learning more? Check out the article “Implementing AI in the Physical World”.
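The Gemini Robotics models themselves are only shown in demos, but the general-purpose Gemini API gives a feel for the vision-language half of the pipeline: hand it a photo of the workspace plus an instruction, and ask for a structured plan that a robot controller could consume. The model name and image file below are just examples, not anything from the Gemini Robotics release.

```python
# Sketch of the "image + instruction -> action plan" step using the public
# Gemini API (not the Gemini Robotics models; model name and image are examples).
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

scene = Image.open("workbench.jpg")  # a photo of the robot's workspace
prompt = (
    "You are planning actions for a robot arm. "
    "Looking at this scene, list numbered steps to put the red cup into the bin. "
    "Keep each step short and physically concrete."
)

response = model.generate_content([prompt, scene])
print(response.text)  # a downstream controller would parse these steps into motions
```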
By the way, Google isn't the only one developing robotics. Apple is showing interest too: the company revealed an adorable and helpful robotic lamp, and rumors suggest it is working on androids of its own.
Now that NVIDIA has unveiled a foundation model for controlling robots (demonstrated on that same Disney droid) and immediately published it on GitHub and Hugging Face, this technology is becoming much more accessible than one might have expected. At least partially.
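If you want to poke at such released weights yourself, pulling a checkpoint from Hugging Face takes only a few lines with the huggingface_hub library; the repository id below is a placeholder, since the exact repo name isn't given here.

```python
# Downloading a published robotics checkpoint from Hugging Face
# (huggingface_hub is the real library; the repo_id below is a placeholder).
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="nvidia/robot-foundation-model")
print("checkpoint files downloaded to:", local_dir)
```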
Of course, building a humanoid robot in your garage is still unrealistic, but robotic manipulators are another matter. There are already plenty of open projects: PAROL6, toolboxrobotics, arctosrobotics, Thor, and that's just the tip of the iceberg. With the software side advancing this quickly, the idea of assembling a robo-arm of your own is getting more and more tempting. Personally, I plan to teach mine to cook as well as the robo-chef Zippy.

