Imagining the "Monty moment"

Note: I am not referring to the “Monty moment” as having anything to do with language, the iPhone, or GPT-3. Those references are only meant to give an idea of what I mean by the “moment” part: the version that will be a milestone making general acceptance plausible (it doesn’t even have to be guaranteed).

I do not expect that anyone considers a successful 1SM+1LM experiment to be a “Monty moment”, even though it is an important milestone. There are obviously other milestones along the way.

That now being stated, the following is extracted from the general chat:

Timothy wrote:
I am trying to imagine the “Monty moment”, which I think of as the first Monty configuration that crosses from research prototype to a widely recognized system (analogous to GPT-3 for LLMs).

Has anyone suggested an idea of what that configuration looks like, in terms of sensory and learning modules? (How many? Which types?)

Clarifying edit, which was not in the chat: For example, when I imagine it, I do so anthropomorphically (two visual SMs as “eyes”, one olfactory SM as a “nose”, two auditory SMs as “ears”, etc.), which is obviously not the best approach here.

And does someone have a vision of when (what year) that could happen?
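For intuition, the anthropomorphic configuration imagined above can be counted up in a short sketch. This is purely illustrative and is not the actual tbp.monty API; the `SensorModule` and `MontyConfig` names and the one-LM-per-SM pairing are assumptions made only for this example.

```python
# Hypothetical sketch of counting sensor modules (SMs) and learning
# modules (LMs) in an imagined anthropomorphic Monty configuration.
# NOT the real tbp.monty API; names and pairing are assumptions.
from dataclasses import dataclass, field


@dataclass
class SensorModule:
    modality: str  # e.g. "vision", "audition", "olfaction"
    name: str


@dataclass
class MontyConfig:
    sensor_modules: list = field(default_factory=list)
    learning_modules_per_sm: int = 1  # assumed pairing, purely illustrative

    def total_learning_modules(self) -> int:
        return len(self.sensor_modules) * self.learning_modules_per_sm


# The anthropomorphic guess from the question above:
anthropomorphic = MontyConfig(
    sensor_modules=[
        SensorModule("vision", "left_eye"),
        SensorModule("vision", "right_eye"),
        SensorModule("olfaction", "nose"),
        SensorModule("audition", "left_ear"),
        SensorModule("audition", "right_ear"),
    ],
    learning_modules_per_sm=1,
)

print(anthropomorphic.total_learning_modules())  # 5 SMs x 1 LM each -> 5
```

Under this (assumed) one-to-one pairing, the configuration size is just the SM count; real configurations could pair multiple LMs per SM or arrange them in a heterarchy, as discussed later in the thread.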

AgentRev wrote:
I myself wrote a broader, more speculative thread on the overall module heterarchy. It needs refining, as pointed out by Dr. Clay’s reply, and I will revisit it in the medium term, as I am quite interested in the matter too: The road to a Generally-Intelligent Monty?

tslominski wrote:
This is all wild speculation on my part. One constraint I put on the “Monty moment” is that it won’t be a communicating-with-humans moment. Because of the sensorimotor nature of the intelligence, and my belief that language will come last, my (wild) guess is that it’ll be something more like a domestication-of-an-animal moment. I wasn’t around for any of those, so I’m not sure how they play out :slightly_smiling_face:.

To be clear, in this analogy, the “animal” will later learn to talk, which will lead to other “Monty moments”.

AgentRev wrote:
I’m not sure if Monty will have an “iPhone moment” like ChatGPT did. People already have PhD-level, voice-capable LLMs at their fingertips, and everyone has seen Boston Dynamics robots and the like doing parkour and backflips. Even if a Monty robot can eventually perform some intricate sensorimotor task, it’s not that likely to wow the crowds. Nowadays, when people see robots dance or do karate, they simply think it’s pre-recorded movements (which is often partially true, mind you).

The majority of people probably won’t notice if and when Thousand Brains starts surpassing deep-learned AIs. After all, a lot of people think that ChatGPT is just a database of pre-written answers. At this point, a wow moment would require something truly remarkable, like figure skating, crushing AI benchmarks, racking up gold medals at the Humanoid Robot Games, piloting an aerobatic aircraft, or maybe Monty going above and beyond to rescue a person in critical danger without prior instruction.

But is a wow moment even desirable? Making the headlines would bring a tsunami of attention, potentially strong enough to disrupt the team and flood the community with noise. Worse, one can imagine Big Tech trying to poach team members to lead private Thousand Brains efforts… This isn’t the way. Organic, diligent growth is a more sustainable path, with greater resilience to potential industry downturns that may loom ahead. Let the deep AIs bask in the spotlight while the community steadily crafts the successor that will gradually replace them. :mechanical_arm:

[Image: survey on how LLMs work]

vclay wrote:

Hi @TimothyAlexisVass, that’s a really good question, and we have been thinking about this for quite a while now too. I wouldn’t say we have a great answer to what our “ChatGPT moment” would be, or whether there will even be one in the same sense. ChatGPT was able to get adopted by so many people so rapidly because you just need a browser and a Wi-Fi connection to use it. A sensorimotor application of Monty might require more accessible robotics hardware, or digital use cases that don’t exist today. Here is an excerpt from some of Jeff’s thoughts on this a few weeks ago: https://youtu.be/x-r84LkT8L0?si=cf7GwzObuJ9pJJuf&t=3149

Ultimately, I think it is almost impossible to predict (as far as I know, ChatGPT was also more of a research demo, released without expecting such a fast takeoff).
We have been thinking through various possible demos over the past year, and it is hard to know which one would be “the one”. Some seem exciting to us from a research perspective (like solving Omniglot, the ARC-AGI challenge, or cracking the newest CAPTCHAs), but they may not have broad appeal outside a research community. Others might have big commercial implications but aren’t problems most people even know exist. Others might have broader appeal (like a robot doing your dishes and quickly adapting to any kitchen and cutlery) but may rely on novel hardware being developed that is outside our project’s realm.

For now, our focus is on developing the core algorithm that will underlie all these potential applications. If you watch the earlier part of the video I linked (starting around the 31-minute mark), you can see that internally we don’t focus on building demos or applications, but we want to encourage others to do so. Experts in various domains are probably best placed to use Monty for solutions, and we hope that the collective creativity of people in the community here will lead to many ideas we ourselves would never have come up with.

I don’t think we can predict what will eventually make Monty take off, or which applications it will have the biggest impact on in the future (or which completely new ones will appear). For now, we do our best at building a general-purpose sensorimotor AI platform, and I am excited to watch where people might take it in the coming years :slight_smile:
