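Training robots in the real world is expensive and hard on the hardware. Miles and Nia explain how DayDreamer lets a robot do its practice in a "lab inside its head" instead. With a world model, a robot can go from clumsy falls to steady walking in just one hour, which points to a far more sample-efficient way of learning.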

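Instead of making the robot grind it out in the real world, DayDreamer lets the robot build a "dream lab" in its own head. A small number of real attempts is enough for it to rehearse tens of thousands of times in imagination.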
Walk me through the paper "DayDreamer: World Models for Physical Robot Learning".


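A world model is the robot's "dream lab": it lets the robot simulate the real environment internally. Camera images and joint sensor readings are compressed into compact "discrete codes", and on top of them the robot builds a recurrent state-space model (RSSM). This model predicts the consequences of different actions, so the robot can play out thousands of scenarios in imagination without acting in the real world, which sharply reduces the cost of real-world trial and error.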
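The imagined rollout described above can be sketched in a few lines. This is a minimal toy, not the paper's network: the class `ToyRSSM`, its random fixed weights, the scalar action, and the function `imagine` are all hypothetical names chosen for illustration. It only shows the shape of the idea, namely a deterministic recurrent state plus a sampled discrete code, stepped forward under a policy with no real environment involved.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ToyRSSM:
    """Toy stand-in for a recurrent state-space model (assumption, not the paper's model)."""

    def __init__(self, hidden_size=4, num_codes=8, seed=0):
        self.rng = random.Random(seed)
        self.num_codes = num_codes

        def rand_row(n, scale):
            return [self.rng.uniform(-scale, scale) for _ in range(n)]

        # Random fixed weights stand in for parameters a real model would learn.
        self.w_h = [rand_row(hidden_size, 0.5) for _ in range(hidden_size)]
        self.w_z = rand_row(hidden_size, 0.5)
        self.w_a = rand_row(hidden_size, 0.5)
        self.w_prior = [rand_row(hidden_size, 1.0) for _ in range(num_codes)]

    def step(self, h, z, action):
        """Update the deterministic state, then sample the next discrete code."""
        z_scaled = z / (self.num_codes - 1)
        new_h = [
            math.tanh(
                sum(w * x for w, x in zip(row, h))
                + self.w_z[i] * z_scaled
                + self.w_a[i] * action
            )
            for i, row in enumerate(self.w_h)
        ]
        logits = [sum(w * x for w, x in zip(row, new_h)) for row in self.w_prior]
        new_z = self.rng.choices(range(self.num_codes), weights=softmax(logits))[0]
        return new_h, new_z


def imagine(model, h, z, policy, horizon):
    """Roll the model forward under a policy without touching a real environment."""
    trajectory = []
    for _ in range(horizon):
        action = policy(h, z)
        h, z = model.step(h, z, action)
        trajectory.append((h, z, action))
    return trajectory


if __name__ == "__main__":
    model = ToyRSSM()
    rollout = imagine(model, [0.0] * 4, 0, policy=lambda h, z: 1.0, horizon=15)
    print(len(rollout), "imagined steps")
```

In a real system the weights would be trained so that the predicted codes match what the encoder produces from actual sensor data; here they are random, so the rollout only demonstrates the control flow.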
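The fast learning comes from its "decoupled" design and its Actor-Critic mechanism. The robot runs two parallel threads: an actor thread makes a small number of attempts in the real world and collects data, while a learner thread uses that data to keep repairing and refining the "dream". Because the actor (policy) network rehearses and improves intensively on GPU-generated imagined trajectories, it can pick up complex behaviors such as rolling over, standing up, and walking in a very short time.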
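The two-thread arrangement described above can be sketched with Python's standard `threading` module. This is an illustrative sketch under stated assumptions, not the project's actual code: `ReplayBuffer`, `actor_loop`, `learner_loop`, and `run` are hypothetical names, the "environment" is a random number generator, and a learner "update" is just a counter increment standing in for a gradient step on the world model and the Actor-Critic networks.

```python
import random
import threading
import time


class ReplayBuffer:
    """Thread-safe list of transitions shared by the actor and the learner."""

    def __init__(self):
        self._items = []
        self._lock = threading.Lock()

    def add(self, transition):
        with self._lock:
            self._items.append(transition)

    def sample(self, rng, batch_size):
        with self._lock:
            if not self._items:
                return []
            return [rng.choice(self._items) for _ in range(batch_size)]

    def __len__(self):
        with self._lock:
            return len(self._items)


def actor_loop(buffer, num_steps):
    """Collect a small amount of 'real' experience (a fake environment here)."""
    rng = random.Random(0)
    for step in range(num_steps):
        action = rng.uniform(-1.0, 1.0)
        reward = -abs(action)  # hypothetical reward, for illustration only
        buffer.add((step, action, reward))


def learner_loop(buffer, stop, stats):
    """Keep training on replayed data until asked to stop."""
    rng = random.Random(1)
    while not stop.is_set():
        batch = buffer.sample(rng, batch_size=4)
        if batch:
            stats["updates"] += 1  # placeholder for a real training step
        else:
            time.sleep(0.001)  # nothing to learn from yet


def run(num_env_steps=50):
    buffer = ReplayBuffer()
    stop = threading.Event()
    stats = {"updates": 0}
    learner = threading.Thread(target=learner_loop, args=(buffer, stop, stats))
    actor = threading.Thread(target=actor_loop, args=(buffer, num_env_steps))
    learner.start()
    actor.start()
    actor.join()
    while stats["updates"] == 0:  # make sure the learner ran at least once
        time.sleep(0.001)
    stop.set()
    learner.join()
    return len(buffer), stats["updates"]


if __name__ == "__main__":
    steps, updates = run()
    print(f"{steps} environment steps, {updates} learner updates")
```

The point of the split is that the learner is never blocked waiting for the robot: it can perform many updates per real environment step, which is where the sample efficiency in the text comes from.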
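The framework adapts well and keeps learning online. In the robot-arm experiment, a sharp change in lighting at sunrise caused performance to drop at first, but by continually updating its world model the robot adapted to the new conditions in about 5 hours and even exceeded its earlier performance. This ability to keep improving lets the robot cope with unforeseen disturbances in the real world.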
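It marks a shift in how robots learn: away from rigid hand-written code or error-prone computer simulators, and toward letting robots "learn by playing" in the real world. DayDreamer shows that a single algorithmic framework can be applied to several robot types, including a quadruped, robot arms, and a small wheeled robot. With the infrastructure open-sourced, the barrier to entry for robot learning drops considerably: developers can focus on providing the learning framework and let the robot use its own "imagination" to understand and adapt to a complex real world.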
Created by Columbia University alumni in San Francisco
