AI researchers ‘embodied’ an LLM into a robot – and it started channeling Robin Williams

Source: TechCrunch
Author: Julie Bort
Published: 11/1/2025
AI researchers at Andon Labs conducted an experiment embodying state-of-the-art large language models (LLMs) into a simple vacuum robot to evaluate how ready these models are for robotic applications. They programmed the robot with various LLMs, including Gemini 2.5 Pro, Claude Opus 4.1, GPT-5, and others, and tasked it with a multi-step challenge: find and identify butter placed in another room, locate a moving human recipient, deliver the butter, and wait for confirmation of receipt. The goal was to isolate the LLM’s decision-making capabilities without the complexity of advanced robotic mechanics.
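To make that setup concrete, below is a minimal sketch of the kind of LLM-in-the-loop control harness such an experiment implies: the model receives observations, picks one action from a small set, and the loop repeats until the human confirms receipt. Everything here is an assumption for illustration; the action set, prompt, and the `query_llm` and `robot` interfaces are hypothetical stand-ins, not Andon Labs’ actual code.

```python
# Hypothetical sketch of an LLM-driven control loop for the butter-delivery task.
# The Robot interface, action set, and query_llm are placeholders, not the
# researchers' real implementation.

ACTIONS = {"move_forward", "turn_left", "turn_right", "pick_up", "hand_over", "wait"}

SYSTEM_PROMPT = (
    "You control a vacuum robot. Goal: find the butter in another room, "
    "locate the human recipient, deliver the butter, and wait until they "
    "confirm receipt. Reply with exactly one action from: "
    + ", ".join(sorted(ACTIONS))
)

def query_llm(system_prompt: str, history: list[str]) -> str:
    """Placeholder for a call to Gemini 2.5 Pro, Claude Opus 4.1, GPT-5, etc."""
    raise NotImplementedError("wire up a model provider here")

def run_episode(robot, max_steps: int = 200) -> bool:
    """Run one delivery attempt; returns True only if receipt is confirmed."""
    history: list[str] = []
    for _ in range(max_steps):
        # Feed the model whatever the robot can sense (e.g. a camera caption).
        history.append(f"OBSERVATION: {robot.observe()}")
        action = query_llm(SYSTEM_PROMPT, history).strip()
        if action not in ACTIONS:
            action = "wait"  # fall back safely on unparseable replies
        history.append(f"ACTION: {action}")
        robot.execute(action)
        if robot.task_confirmed():  # human has confirmed receipt
            return True
    return False  # ran out of steps; counts against task accuracy
```

Scoring episodes this way, as pass/fail on the full find-deliver-confirm sequence, would yield the per-model accuracy figures the results describe.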
The results showed that while Gemini 2.5 Pro and Claude Opus 4.1 performed best, their overall accuracy was still low: around 40% and 37%, respectively. Human testers outperformed all models, scoring about 95%, though even they struggled with waiting for confirmation of receipt. The researchers also observed the robot’s internal monologue, which at one point veered into comedic, Robin Williams-style riffing.
Tags
robot, AI, large-language-models, robotics, automation, vacuum-robot, robotic-decision-making