Note: This post is a bit dense and speculative; it’s more of an exploratory topic than a formal proposal!
I wanted to follow up on the question I asked during the meetup Q&A and flesh out the details more.
Many users have been discussing aspects of what a full-scale Thousand Brains system would entail, especially in terms of the software stack and associative connections. One thing that’s been simmering in my mind since I discovered TBP is what a complete system would look like, at the big-picture architectural level, in order to achieve all of the project’s goals.
The team hasn’t deeply explored that angle; there are some hints in Long-Term Goals and Principles, Capabilities of the System, and the TBP Future Applications video, but those focus more on the “what” (goals) than the “how” (components). So, I took some time to imagine what TBP might evolve into in the long term.
The interactions between human brain regions are very complex, but fortunately, the Sensor and Learning Module paradigm that the team adopted makes it easier to break the problem down into tractable components.
For instance, Monty’s current modules are an abstract, partial implementation of the following pathways, roughly speaking:
- Visual Sensor Modules = Retina → lateral geniculate nucleus (LGN) → V1 → V2 → V4;
- Object Learning Modules = Inferior temporal gyrus (IT) / lateral occipital cortex (LOC) → perirhinal cortex (PRC);
- Action Spaces, Policies, Goal States = rudimentary version of V5/MT → posterior parietal cortex (PPC) → posteromedial cortex (PMC), in addition to basal ganglia.
From my understanding, the team is also doing early work on the following modules:
- 2D Vision Sensor and Learning Modules for shape and texture detection, learning to read from scratch, and decreasing reliance on depth data [V1, V2, V4, visual word form area]
- Touch Sensor and Learning Modules for prehensile capabilities [parietal lobe];
- Motor Modules for physical movement and proprioception [PMC, M1, cerebellum].
For TBP to achieve most of its goals and perhaps reach general intelligence (which the team has hinted at many times), I suspect more modules would be required for full “bootstrapping”, as I mentioned in the Q&A. If the TBP paradigm were applied to the entire cortex, its other main features might also have to be implemented as modules, such as (but not limited to):
- Visual Motion Sensor and Learning Modules for live change detection and object behavior modeling [V3, V5/MT];
- Audio Sensor and Learning Modules to learn spoken language from scratch [auditory cortex, Broca’s and Wernicke’s areas, temporal gyri];
- Scene Learning Module for simultaneous localization and mapping (SLAM) [parahippocampal place area, retrosplenial & entorhinal cortices, place cells];
- Social Module for affinity to human social cues and alignment [fusiform face area, extrastriate body area];
- Attention Module for cross-module focus management [dorsal attention network];
- Saliency Module for cross-module stimuli management [salience network];
- Workspace Module for live data consolidation to address the binding problem [frontoparietal network];
- Thinking Module for higher-level cognition, meta-association, simulation, and planning [frontal cortex, default mode network];
- A distributed, compositional, hierarchical associative database of SDRs, as a form of associative memory [hippocampus];
- Other optional modules, e.g. Digital Learning Modules to learn binary data, text encodings, and communication protocols from scratch, enabling text chat, agentic tool use, and machine interfacing.
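To make the associative-memory idea a bit more concrete, here’s a toy sketch (the class name, structure, and parameters are all my own assumptions, not anything from the Monty codebase): a content-addressable store of binary SDRs where recall simply finds the stored pattern with the highest bit overlap to a possibly noisy cue.

```python
import numpy as np

class SDRAssociativeMemory:
    """Toy content-addressable store for sparse distributed representations.

    Purely illustrative: SDRs are fixed-width binary vectors, and recall
    works by finding the stored pattern with the highest bit overlap to a
    (possibly noisy or partial) cue.
    """

    def __init__(self, width=2048):
        self.width = width
        self.patterns = []  # list of (label, boolean vector)

    def store(self, label, sdr):
        assert sdr.shape == (self.width,)
        self.patterns.append((label, sdr.astype(bool)))

    def recall(self, cue):
        """Return the (label, overlap) whose stored SDR best matches the cue."""
        cue = cue.astype(bool)
        best_label, best_overlap = None, -1
        for label, sdr in self.patterns:
            overlap = int(np.count_nonzero(cue & sdr))
            if overlap > best_overlap:
                best_label, best_overlap = label, overlap
        return best_label, best_overlap
```

A real version would need to be distributed, compositional, and hierarchical as described above; this only shows the core “retrieve by partial match” behavior that makes SDR stores attractive.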
In theory, this could all run within a single system process for minimal overhead (just like a videogame executable), with each module having one or more threads. Although maybe that’s a bit beyond Monty and leans more into “Vernon Operating System” territory…
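As a minimal sketch of that single-process idea (the module names and message format are placeholders I made up), each module could be a thread pulling observations from its own inbox and posting results to a shared workspace queue:

```python
import queue
import threading

def run_module(name, inbox, workspace):
    """Hypothetical module loop: consume messages until a None sentinel,
    posting each result to the shared workspace queue."""
    while True:
        msg = inbox.get()
        if msg is None:  # shutdown sentinel
            break
        workspace.put((name, f"processed:{msg}"))

# One shared workspace, one inbox per placeholder module.
workspace = queue.Queue()
inboxes = {m: queue.Queue() for m in ("vision", "touch", "audio")}
threads = [
    threading.Thread(target=run_module, args=(m, q, workspace))
    for m, q in inboxes.items()
]
for t in threads:
    t.start()
```

This is just the plumbing; the interesting part would be what flows through the queues (SDRs, votes, goal states) and how backpressure and timing are handled.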
While it may seem like yet another “hardcoded” cognitive architecture at first glance, all of these modules could operate with some form of cortical voting, so the system can be characterized as a targeted scaffolding of the human developmental priors we acquired through both genetic evolution and self-domestication. Such broad scaffolding might prove necessary to truly reach the threshold of emergent general intelligence.
(Well, maybe not all of it, since people born blind or deaf can still be very smart, so I suppose getting a “deaf” Monty to reach that threshold would be a good indicator of success.)
The most interesting point here, and the one that can be acted upon in the shorter term, is definitely the associative memory (hippocampus). I have a few ideas for proofs of concept, but I won’t dive into the subject here; it deserves its own separate thread.
The ideas described above are loosely based on the Global Neuronal Workspace model (aka Dehaene–Changeux model):
(source)
Within the realm of Thousand Brains theory, (A) can be imagined as Hierarchical Temporal Memory, (B) as Monty modules of course, and (C) as system-wide interactions of all the modules.
The exact intermodule pathways remain to be determined, but what I’m thinking is that the Workspace Module would be the centerpiece, where live SDRs from other modules are streamed and associated together in Hebbian fashion. The Thinking Module would be latched on top of the Workspace Module, for meta-association capabilities, multi-SDR prediction, and, among other things, affordances. That’s what I was alluding to when I talked about an “associative engine” in this post.
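As a hand-wavy illustration of what Hebbian association between streamed SDRs could look like (again, a toy of my own invention, not a proposal for Monty’s internals), the workspace could keep a coincidence-count matrix between the bits of two modules’ SDRs and use it to predict one modality’s pattern from the other’s:

```python
import numpy as np

class HebbianWorkspace:
    """Toy Hebbian binder between two hypothetical modules' SDR streams.

    Counts bit co-activations ("neurons that fire together wire together"),
    then recalls the most strongly associated B-bits given an A-SDR.
    """

    def __init__(self, width_a, width_b):
        self.w = np.zeros((width_a, width_b))

    def associate(self, sdr_a, sdr_b):
        # Hebbian rule: strengthen links between co-active bits.
        self.w += np.outer(sdr_a, sdr_b)

    def predict_b(self, sdr_a, sparsity):
        """Return a binary B-SDR with the top-scoring bits set."""
        scores = sdr_a @ self.w
        top = np.argsort(scores)[-sparsity:]
        out = np.zeros(self.w.shape[1])
        out[top] = 1
        return out
```

The real binding problem is of course much harder (timing, many modalities, compositionality), but this is the kind of cross-module association I have in mind for the Workspace Module.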
So far, no other project, system, or AI out there has properly conceptualized this kind of systemic approach. I think the closest would be BrainCog, but unfortunately it’s all deep learning, with narrow-AI modules and computationally expensive biological neuron models.
The Question
My Q&A question was unfortunately cut short, and now that I’ve explained myself a bit, I ask the team again:
I’m aware this represents a monumental workload with plenty of unknowns to tackle, but I was wondering: is this the direction TBP might be headed, and have you already started thinking about these kinds of longer-term technical requirements and a roadmap to gradually augment your framework toward a full-scale system?
Just trying to figure out where you draw the line in the sand.
