Warp-Speed Wednesdays

Your Must-Read Tech Updates: Tesla Optimus getting useful, AI reasoning in medicine, Robot costs dropping and investors are hungry for autonomy

Clinton Williams

May 08, 2024

Robotics: Building the Data Lego Blocks

Tesla Optimus Closer to Being Useful: I pick things up and put them down


Why is this important: The world of robotics is at a critical juncture. If humanoid robots are to avoid becoming another high-tech crash and burn (think Honda ASIMO and its "cool, expensive toy" reputation), they need to prove their practical value $$$. Tesla is acutely aware of this and is proactively deploying these robots to take over mundane tasks, targeting what it sees as the low-hanging fruit in its factories.

Despite the advances in AI, robotics remains a challenging field. Tesla is applying a strategy similar to its Full Self-Driving (FSD) technology, using an end-to-end neural network that takes 2D visual and proprioceptive data and outputs robot controls. Starting from the ground up, Tesla is painstakingly building this dataset with basic tasks like simple pick-and-place operations. So even though this doesn’t seem impressive, you should expect dramatic improvements within a year.
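To make "end-to-end" concrete, here is a minimal sketch of the idea: one network takes raw pixels plus joint readings and directly outputs joint commands, with no hand-written perception or planning stage in between. This is not Tesla's architecture. The class name, layer sizes, and the untrained random weights are all assumptions for illustration only.

```python
import random


def make_layer(n_in, n_out, rng):
    """Create one dense layer with small random weights (an untrained placeholder)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(layer, x, relu=True):
    """Apply a dense layer to vector x, optionally followed by a ReLU."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out


class EndToEndPolicy:
    """Hypothetical policy: (2D image, joint angles) -> joint commands."""

    def __init__(self, image_pixels, num_joints, hidden=16, seed=0):
        rng = random.Random(seed)
        self.image_pixels = image_pixels
        self.num_joints = num_joints
        self.hidden_layer = make_layer(image_pixels + num_joints, hidden, rng)
        self.output_layer = make_layer(hidden, num_joints, rng)

    def act(self, image, joint_angles):
        # Flatten the 2D image and scale pixel values to [0, 1].
        pixels = [p / 255.0 for row in image for p in row]
        if len(pixels) != self.image_pixels or len(joint_angles) != self.num_joints:
            raise ValueError("input sizes do not match the policy")
        # Vision and proprioception go into the same network as one input vector.
        features = dense(self.hidden_layer, pixels + list(joint_angles))
        return dense(self.output_layer, features, relu=False)


image = [[0, 128, 255, 64] for _ in range(4)]  # toy 4x4 grayscale frame
joints = [0.1, -0.3, 0.5]                      # toy joint angles in radians
policy = EndToEndPolicy(image_pixels=16, num_joints=3)
commands = policy.act(image, joints)
print(len(commands), "joint commands")
```

In a real system the weights would be learned from recorded demonstrations, which is why the pick-and-place dataset matters more than any single demo.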

Here is Milan Kovac, the Manager & Lead Engineer of the Optimus team, to summarize the updates.

DrEureka: NVIDIA and Dr Jim Fan continue to expand their philosophy of using LLMs to create autonomous digital and robotic agents

Why is this important: DrEureka is a Large Language Model (LLM) agent that not only codes robot skills in simulation but also bridges the challenging simulation-to-reality gap, fully automating the transfer of newly learned skills to the real world.

DrEureka has enabled a robot dog to balance and walk on a yoga ball—across various terrains and even sideways—without any real-world fine-tuning. This task, particularly challenging due to the difficulty in accurately simulating a yoga ball's bouncy surface, was mastered through DrEureka's ability to adeptly navigate a vast array of sim-to-real configurations.

Sim-to-real transfer is traditionally laborious. It often involves domain randomization, where physical parameters such as friction and gravity are varied during training, and the ranges for those parameters are meticulously tuned by hand. DrEureka instead leverages the LLM's built-in physical intuition to tune these parameters while explaining its rationale.
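For readers new to domain randomization, a minimal sketch of the basic mechanism: every training episode draws its physics parameters from a range, so the learned skill cannot overfit to one exact simulator setting. The parameter names come from the text above; the numeric ranges and function names are hypothetical placeholders, not values from DrEureka.

```python
import random

# Hypothetical ranges, hand-written for illustration.
# Choosing ranges like these is the manual step an LLM would propose instead.
PARAM_RANGES = {
    "friction": (0.3, 1.2),
    "gravity": (-10.5, -9.0),  # m/s^2, around Earth's -9.81
}


def sample_sim_params(ranges, rng):
    """Draw one value per physical parameter, uniformly within its range."""
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}


def randomized_episodes(ranges, num_episodes, seed=0):
    """Return one randomized physics configuration per training episode."""
    rng = random.Random(seed)
    return [sample_sim_params(ranges, rng) for _ in range(num_episodes)]


configs = randomized_episodes(PARAM_RANGES, num_episodes=5)
for episode, params in enumerate(configs):
    # A real pipeline would reset the simulator with these values here.
    print(episode, {name: round(value, 3) for name, value in params.items()})
```

Ranges that are too narrow leave the robot brittle in the real world, and ranges that are too wide make the task unlearnable, which is why tuning them is the laborious part.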

Google DeepMind ALOHA Unleashed & Open Source Bipedal Hector: Capable robots are rapidly dropping in cost with AI

Why is this important: The rapid drop in the cost of robotics hardware, combined with open-sourced robot-specific data and deep learning algorithms, is accelerating robotics research for practical applications.

ALOHA is a purchasable $30-40k dual-arm robotic platform (mobile and stationary) with integrated cameras, soft grippers, and teleoperating joysticks. An industry-level platform would cost $200k, so this 80-85% reduction eases the financial burden for researchers and hobbyists who want to work on new algorithms and, more importantly, gather robot-specific data.

Meet Hector from Laser Robotics, which is opening up the world of humanoid robotics research.

Hector combines open-source software with available-for-purchase hardware, complete with its own control algorithms. Like the dramatic cost reductions seen in quadruped robots over the last five years, research-grade bipedal robots are becoming widely accessible and affordable. This shift is allowing more innovators and researchers to continue to push the open-source robotics field forward.

But how do these platforms help with advancing the robot industry? I will explain in a more technical post coming soon.

Hugging Face releases LeRobot: Démocratiser Les Robots (Democratizing Robots)

Provides models, datasets, and tools for real-world robotics in PyTorch. The goal is to lower the barrier to entry to robotics.
Contains state-of-the-art approaches that have been shown to transfer to the real world, with a focus on imitation learning and reinforcement learning.
Includes pretrained models, datasets with human-collected demonstrations, and simulation environments to get started without assembling a robot.
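Since imitation learning is the focus here, a minimal sketch of its simplest form, behavior cloning: fit a policy so that its actions match the demonstrated actions for the same observations. This does not use LeRobot's API. The one-dimensional task, the `expert_action` stand-in for a human demonstrator, and the linear policy are all assumptions chosen to keep the example tiny.

```python
def expert_action(observation):
    """Hypothetical demonstrator: move at half the remaining distance to the target."""
    return 0.5 * observation


# "Human-collected demonstrations": (observation, action) pairs.
demonstrations = [(obs, expert_action(obs)) for obs in [-2.0, -1.0, 0.5, 1.0, 3.0]]


def fit_linear_policy(demos):
    """Least-squares fit of action = weight * observation (closed form, one parameter)."""
    numerator = sum(obs * act for obs, act in demos)
    denominator = sum(obs * obs for obs, _ in demos)
    return numerator / denominator


weight = fit_linear_policy(demonstrations)


def cloned_policy(observation):
    """The learned policy imitates the demonstrator on observations it never saw."""
    return weight * observation


print("learned weight:", weight)
print("action for unseen observation 4.0:", cloned_policy(4.0))
```

Real robot policies replace the single weight with a deep network and the number pairs with camera frames and joint commands, but the training signal is the same: copy what the demonstrator did.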

Industry Finding Initial Use Cases for Humanoid Robots: Big Tech partnering with Robotics companies

Why is this important: Autonomy is coming! Silicon Valley investors have seen enough to feel confident to invest billions of dollars into autonomy. They are betting that it is coming soon and with a hefty ROI.

Stay tuned for my next Hitchhiker’s Handbook chapter for a deep dive.

AI outsiders in Big Tech are taking larger positions to catch up to other leaders in AI (Tesla, Microsoft, NVIDIA, and Google).

Microsoft Research working with Sanctuary AI on humanoid robotics.
Agility Robotics launches its humanoid with Zion Solutions Group, a supply chain company.
NVIDIA, Microsoft, and SoftBank contribute to autonomous driving software startup Wayve's Series C funding.
Apple is scurrying for a way into car autonomy with a potential Rivian partnership after canceling its internal car division. Apple made a focused bet on the metaverse with Vision Pro, which has had abysmal sales, and now needs to pivot to AI quickly, as Meta and Zuckerberg have.

AI: Reasoning is What You Need

Med-Gemini: Google pushes the reliability of AI in the medical industry using reasoning

Med-Gemini demonstrates exceptional performance on a variety of medical benchmarks, outperforming previous standards and even GPT-4 in some cases.

Key Highlights:

Understanding and generating medical texts
Handling long-context data
Operating in multimodal environments involving text, visuals, and real-world medical data.

Med-Gemini excels in real-world medical applications by integrating deep learning models with large-scale medical data sets to support complex decision-making processes. For instance, it has significantly improved the accuracy of medical diagnostics and patient care by synthesizing vast amounts of medical history and imaging data. Moreover, it provides actionable insights by utilizing its state-of-the-art reasoning and search capabilities, which help in navigating through extensive electronic health records (EHR) and complex medical literature.

Agent Hospital: Simulated Evolving Medical Agents


Research is finding creative ways to push the reasoning capabilities of LLMs in specific domains.

These researchers introduce a simulacrum called Agent Hospital that simulates the entire process of treating illness. All patients, nurses, and doctors are autonomous agents powered by large language models (LLMs). This is similar to a previous study on simulating human interaction to study AI agents.

The overview of the MedAgent-Zero method. This diagram illustrates the method by which AI “doctor” agents achieve self-evolution.

Results: “After treating around ten thousand patients (real-world doctors may take over two years), the evolved doctor agent achieves a state-of-the-art accuracy of 93.06% on a subset of the MedQA dataset that covers major respiratory diseases.”

Simulation does not replace reality, so this research must be taken with caution, but I expect these specialized agents will be in every field collaborating to make us better at our jobs.

OpenCRISPR: World’s first successful editing of the human genome where every component is fully designed by AI

OpenCRISPR trained large language models (LLMs) on an extensive dataset of diverse CRISPR-based gene editing systems.
The proteins generated by these models expanded the diversity of virtually all naturally occurring CRISPR-Cas families 4.8-fold.
They can rapidly continue to increase this diversity at will.
The focus was placed on CRISPR-Cas9 systems due to their widespread adoption, Nobel Prize recognition, and recent FDA approval as a novel therapeutic modality.
In human cells, OpenCRISPR's computationally designed gene editors demonstrated activity and specificity comparable to or better than SpCas9, a standard gene editor.
The new gene editors were more than 400 mutations distant from SpCas9, highlighting their novel design.
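To give a feel for what "mutations distant" means, here is a minimal sketch that counts position-by-position differences between two equal-length sequences. It is a simplified stand-in: real protein comparisons first align sequences of different lengths, and the two ten-letter strings below are toy examples, not real protein sequences or data from the OpenCRISPR work.

```python
def count_substitutions(seq_a, seq_b):
    """Count positions where two equal-length sequences differ (Hamming distance)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length first")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def percent_identity(seq_a, seq_b):
    """Share of positions that are identical, as a percentage."""
    differences = count_substitutions(seq_a, seq_b)
    return 100.0 * (len(seq_a) - differences) / len(seq_a)


# Toy strings for illustration only.
reference = "MDKKYSIGLD"
designed = "MDRKYSLGLE"

print("substitutions:", count_substitutions(reference, designed))
print("identity (%):", percent_identity(reference, designed))
```

At the scale of a full-length Cas9 protein, which is over a thousand amino acids long, several hundred substitutions means the designed editor shares only part of its sequence with SpCas9 while still performing the same job.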

Engineer's Guide to the Tech Galaxy
