Robotic Brains

Abstract

In silico control systems for edge case active learning

Description

The fascination with robotics has been a fixture of human imagination since the dawn of science fiction, and it carries equal weight in utopian and dystopian narratives. Now, thanks to breakthroughs in generative AI and in language models with cognitive and spatial awareness, a renewed optimism has taken hold, one that suggests we might be on the cusp of realizing robotics’ full potential. This optimism has led many, myself included, to re-examine the segment with cautious excitement.

One could also argue that the market needs a sufficient level of infrastructure abstraction to enable the long tail of applications. Some use cases may not have the economics to support a full-stack build, but could thrive on an underlying foundational model that can be fine-tuned for the last mile. Much of the most groundbreaking research in recent years has been at this cerebral level, enhancing capabilities across perception, navigation, and manipulation.

Intuitively, this is the more attractive bet to make as a VC, given the distribution and cost advantages of software. It’s also the most obvious play, with VC giants and corporate players already putting massive cash to work. As with LLMs, if the entry capital required for this play is in the tens or hundreds of millions, there is little room for smaller, pure-play (pre)seed funds.

Several questions about this strategy remain open in my mind:

  • Novel data acquisition strategies

  • Compute cluster requirements and existing hardware limitations

  • Pricing structures across inference

  • Local compute and batch-based transfer learning

  • Perception and navigation calibration across hardware

Working Thoughts

  • An ecosystem / community / marketplace of models could usurp any standalone general-purpose model. I personally struggle with this one, at least in the near term. Despite all the hype in robotics, there is drastically less talent in the world right now that could contribute to such a community.

  • Then again, Hugging Face was founded in 2016, when machine learning talent was a fraction of what it is today. I would think the most viable wedge here is either a reimagination of ROS or a modular architecture for robotics-specific model chaining. At the pre-seed stage, this approach is uniquely dependent on a founding team with an exceptional network or strong community cultivation skills.
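
To make the idea of modular, robotics-specific model chaining a bit more concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `ModelChain` class, the shared context dictionary, and the three placeholder stages (`perceive`, `navigate`, `manipulate`) are illustrative stand-ins, not an existing library or anyone's actual architecture. The only point is that stages sharing a common interface can be swapped or reordered independently.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# A stage reads the shared context and returns the keys it wants to add.
Stage = Callable[[Dict[str, Any]], Dict[str, Any]]


@dataclass
class ModelChain:
    """Hypothetical container that runs model stages in order."""
    stages: List[Stage] = field(default_factory=list)

    def add(self, stage: Stage) -> "ModelChain":
        self.stages.append(stage)
        return self

    def run(self, context: Dict[str, Any]) -> Dict[str, Any]:
        for stage in self.stages:
            # Merge each stage's output into the shared context.
            context = {**context, **stage(context)}
        return context


# Placeholder stages; real perception/navigation/manipulation models
# would sit behind the same interface.
def perceive(ctx: Dict[str, Any]) -> Dict[str, Any]:
    # Assumption: the "scene" maps object names to (x, y) positions.
    return {"target_xy": ctx["scene"][ctx["instruction"]]}


def navigate(ctx: Dict[str, Any]) -> Dict[str, Any]:
    # Trivial straight-line plan from the origin to the target.
    return {"path": [(0, 0), ctx["target_xy"]]}


def manipulate(ctx: Dict[str, Any]) -> Dict[str, Any]:
    return {"action": f"grasp at {ctx['path'][-1]}"}


chain = ModelChain().add(perceive).add(navigate).add(manipulate)
result = chain.run({"instruction": "mug", "scene": {"mug": (2, 3)}})
print(result["action"])  # grasp at (2, 3)
```

In a community or marketplace setting, each stage could in principle be contributed by a different party, as long as it honors the shared context contract.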

Relevant Companies

Related Reading

  • No, to the Right - adapting to natural language corrections

  • GOAT - universal navigation system

  • SpatialVLM - vision-language models with spatial reasoning capabilities

  • PaLM-E - embodied multimodal language model

  • Genie - generative interactive environments

  • SIMA - AI agent for 3D virtual environments
