Tokyo’s GPT-4 Robot Promises Humanlike Gestures

The University of Tokyo's humanoid robot Alter3 is now driven by GPT-4, a large language model.

Researchers at the University of Tokyo have connected their humanoid robot, Alter3, to GPT-4, the large language model developed by OpenAI.

The combination changes how a robot interprets and reproduces human actions, with the aim of making its movements more lifelike.

Imagine a robot that can capture the perfect selfie, toss a ball with precision, munch on popcorn, or even rock out an air guitar solo. These activities, which once required intricate coding for each movement, are now performed by Alter3 thanks to its integration with GPT-4. Unlike its predecessors, this AI-driven robot acts on simple natural-language instructions, eliminating the need for complex, hardware-specific controls.

Alter3, a marvel of engineering with its 43 axes mimicking human musculoskeletal movements, can execute intricate upper body maneuvers and even display detailed facial expressions. Although it’s stationary, resting on a base, its upper body can simulate walking and other dynamic motions.

The researchers, led by Takahide Yoshida, Atsushi Masumori, and Takashi Ikegami, have changed how the robot is programmed. By giving verbal instructions to Alter3, they can now prompt GPT-4 to generate Python code that runs the robot's android engine, the software that drives its joints. This process has transformed the tedious task of coding each joint movement into a simple, language-driven operation.
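
To make the idea concrete, here is a minimal Python sketch of a language-to-motion pipeline of this kind. It is an illustration only, not the researchers' code: the `fake_llm` stand-in, the `set_axis` command format, the axis numbers, and the 0 to 255 value range are all assumptions made for this example, and the language model is replaced by a canned reply so the sketch runs offline.

```python
import re

NUM_AXES = 43  # Alter3's axis count, as reported in the article


def fake_llm(instruction: str) -> str:
    """Hypothetical stand-in for a GPT-4 call; returns canned motion code."""
    return (
        "set_axis(1, 255)   # assumed axis: raise mouth corners for a big smile\n"
        "set_axis(16, 200)  # assumed axis: lift an arm as if holding a phone\n"
    )


def parse_commands(generated: str) -> list[tuple[int, int]]:
    """Extract (axis, value) pairs from set_axis(...) lines and validate them."""
    commands = []
    for axis_text, value_text in re.findall(r"set_axis\((\d+),\s*(\d+)\)", generated):
        axis, value = int(axis_text), int(value_text)
        if not 1 <= axis <= NUM_AXES:
            raise ValueError(f"axis {axis} is outside 1..{NUM_AXES}")
        if not 0 <= value <= 255:  # assumed actuator range
            raise ValueError(f"value {value} is outside 0..255")
        commands.append((axis, value))
    return commands


def run(instruction: str) -> dict[int, int]:
    """Turn an instruction into a pose: a mapping from axis number to target value."""
    pose = {}
    for axis, value in parse_commands(fake_llm(instruction)):
        pose[axis] = value
    return pose


print(run("take a selfie"))
```

Validating the generated commands before they reach the hardware is a design choice of this sketch: text produced by a language model can name an axis or value that does not exist, so a range check sits between the model and the motors.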

Alter3 not only performs these movements but also retains them in memory, allowing for continuous refinement. Over time, this leads to smoother, faster, and more accurate actions. The team even demonstrated this with specific instructions for taking a selfie – creating a big smile, adopting a dynamic pose, and simulating holding a phone.

But the innovation doesn’t stop there. Alter3, equipped with a camera, can observe and learn from human reactions, much like a newborn mimicking its surroundings. This capability, combined with the “zero-shot” learning potential of GPT-4, is pushing the boundaries of human-robot collaboration, making robots more intelligent, adaptable, and personable.

The team even injected a dose of humor into Alter3’s repertoire. In one scenario, the robot humorously reacts to mistakenly taking someone else’s popcorn, showcasing exaggerated expressions of surprise and embarrassment.

