How Robots Turn Language into Motion: The AI Stack Behind Physical AI

May 28
43 mins

Episode Description

How do robots go from human instruction to real movement?

Telling a robot to “pick up a box” sounds simple. But behind that command is a complex chain of decisions: understanding language, interpreting the environment, choosing the right action and turning it into physical movement.

In this episode, Clemens (Principal Engineer) and Robert (Robotics Engineer & Researcher) explain how RobCo approaches this challenge with Alfie, combining classical robotics, AI models, sensors, safety systems and real-world industrial requirements.

You'll gain insights into:

  • the three-layer hierarchy (System 2 / System 1 / System 0) that turns language into motor currents
  • why physical grounding is the hardest unsolved problem in robotics today
  • how 100-200 demonstrations are enough to fine-tune Alfie on a new use case
  • why methods that brought man to the moon are now central to physical AI

More about RobCo:

  • Website: https://www.rob.co
  • LinkedIn: https://www.linkedin.com/company/robco-therobotcompany/
  • Instagram: https://www.instagram.com/robco_therobotcompany/

Chapter markers

  • 00:00 Controlling robots with language
  • 00:32 Meet Clemens and Robert
  • 02:22 System 2, 1, 0: How robots think
  • 04:35 The driving analogy explained
  • 06:28 What's the hardest part of the chain?
  • 07:15 Translating language into robot action
  • 08:43 What really happens when you say "pick up the glass"
  • 11:04 Why neural nets find their own language
  • 15:21 Introducing Alfie
  • 21:09 Pre-training + fine-tuning a robot
  • 24:49 How commands become motor currents
  • 28:31 Top 3 questions from Hannover Messe
  • 35:04 The funniest moment at the trade fair
  • 38:02 What makes Alfie different
  • 40:28 World models: The next big unlock?
