Robotic Brains
Abstract
In silico control systems for edge-case active learning
Description
The fascination with robotics has been a fixture of human imagination since the dawn of science fiction, carrying equal weight in utopian and dystopian narratives. Now, thanks to breakthroughs in generative AI and cognitive/spatial-awareness language models, a renewed optimism has taken hold, one that suggests we might be on the cusp of realizing robotics’ full potential. This optimism has led many, myself included, to re-examine the segment with cautious excitement.
One could argue that the market needs a sufficient level of infrastructure abstraction to properly enable the long tail of applications. Some use cases may not have the economics to support a full-stack build, but could thrive on an underlying foundation model that can be fine-tuned for the last mile. Much of the most groundbreaking research in recent years has been at this cerebral level, enhancing capabilities across perception, navigation, and manipulation.
Intuitively, this is the more attractive bet to make as a VC, given the distribution and cost advantages of software. It’s also the most obvious play, with VC giants and corporate players already putting massive cash to work. As with LLMs, if the entry capital required for this play is in the tens or hundreds of millions, there is not really an option for smaller, pure-play (pre)seed funds.
Several open questions remain live in my mind for this strategy:
Novel data acquisition strategies
Compute cluster requirements and existing hardware limitations
Pricing structures across inference
Local compute and batch-based transfer learning
Perception and navigation calibration across hardware
Working Thoughts
There could be an ecosystem, community, or marketplace-of-models approach that usurps any standalone general-purpose model. I personally struggle with this one, at least in the near term. Despite all the hype in robotics, there’s just drastically less talent in the world right now that could contribute to such a community.
Then again, Hugging Face was founded in 2016, when machine learning talent was a fraction of what it is today. I would think the most viable wedge for this could be either a reimagining of ROS or a modular architecture for robotics-specific model chaining. At the pre-seed stage, this approach is uniquely dependent on a founding team with an exceptional network or exceptional community-cultivation skills.
Relevant Companies
Operating systems - Viam, Opteran, Formant, Automatika, Reimagine Robotics (portfolio)
Related Reading
No, to the Right - adapting to natural language corrections
GOAT - universal navigation system
SpatialVLM - vision-language models with spatial reasoning capabilities
PaLM-E - embodied multimodal language model
Genie - generative interactive environments
SIMA - AI agent for 3D virtual environments