Exploring the fundamental principles of the cerebral cortex

I am glad to see such an inspiring and open initiative as the TBP, and I hope it will drive a shift in interest from current passive AI to bio-inspired sensorimotor AI, ultimately helping us better understand the brain. During my exciting journey in understanding the brain, I’ve developed a personal perspective that sometimes diverges from the TBP viewpoint, even though I have been largely inspired by Numenta’s ideas over the years. I believe it would be helpful to share my current understanding of the cerebral cortex to explore how we can mutually learn from our respective perspectives. This text is a work in progress, and I plan to continue refining it. It includes illustrations from the ebook I published in 2020, which contains the early seeds of these ideas. I would warmly welcome any critiques that help me refine or change my ideas.

Making sense of the cerebral cortex

Biology is inherently complex, with no set of simple rules that is perfectly adhered to. Exceptions are common, making counterexamples to some of the ideas presented here inevitable. The real challenge is to discern the fundamental principles from the irrelevant variations. Here are some speculations about such fundamental principles:

Unified mechanisms in isocortical and mesocortical areas

The same core mechanisms are at play in mammalian isocortical and mesocortical areas (allocortex like the olfactory cortex, subiculum & hippocampus should be considered separately). In particular, the entorhinal cortex (where grid cells are) follows the same generic mechanisms, which allows its well-researched path integration mechanism to be mapped directly onto other cortical areas.

See details

Even within the neocortical 6-layer structure, there are significant variations along a granular-agranular axis. Compared to agranular cortices, granular cortices are thinner, have a greater neuron density and count, and feature a prominent granular L4 that gives them their name.

The composition of some layers also differs. Granular cortices tend to have a greater proportion of stellate cells than pyramidal cells in L2/3. Moreover, the nature of L4 cells is not the same in the primary visual cortex and the somatosensory cortex, two granular cortices.

The variation from granular to agranular forms a continuum across the cortex:

  • Granular cortices for primary sensory areas (in red in the figure)
  • Less granular cortices for higher sensory areas (in yellow)
  • Even less granular cortices for associative and high-order areas (green & blue)
  • Agranular cortices for motor areas (purple)

Those differences could be seen as variations of the same unified mechanisms with different hyperparameters (ratio of feature vs contextual inputs, spread of local inhibition, …).

The canonical cortical microcircuit could be roughly depicted as follows:

Main Processing Flows Between Cortical Areas

Cortical areas receive their primary input mainly in layer 4, a granular layer located between the superficial and deep layers. These inputs come from two primary sources: the thalamus and other cortical areas. Notable exceptions are primary sensory cortices such as V1 (visual), S1 (somatosensory), and A1 (auditory), which receive these inputs exclusively from the thalamus. Processing between cortical areas follows two parallel pathways: a cortico-cortical pathway to communicate stable representations and a cortico-thalamo-cortical pathway to communicate motor commands.

See details

Cortico-Cortical Pathway to Communicate Stable Representations

Superficial layers of one cortical area project to layer 4 of another cortical area and receive reciprocal projections in return. This reciprocal projection pattern forms a hierarchical communication network between cortical regions, but it doesn’t strictly follow levels of abstraction. For example, the prefrontal cortex, which is often associated with abstract reasoning, sends projections to the premotor cortex, involved in motor planning, as if the premotor cortex were higher in the hierarchy.

Cortico-Thalamo-Cortical Pathway to Communicate Motor Commands

Layer 5 pyramidal tract (L5 PT) neurons in deep layers provide the cortical output targeting motor-related structures. Interestingly, these neurons also send “efference copies” of those motor commands to the thalamus, which in turn projects to a higher-level cortical area, following a hierarchy similar to that of the cortico-cortical pathway.

Note: The term “motor command” is used broadly here to refer to all outputs from the cortex to subcortical structures (excluding the thalamus and basal ganglia, which receive other cortical outputs; see later). In some instances, this could be a direct motor command sent to motor neurons in the spinal cord (e.g., L5 PT cells of the motor cortex). It could also represent a desired goal sent to intermediary motor structures, such as the superior colliculus, which manages many raw eye movements. Beyond motor actions, it could also involve commands for internal actions, such as altering hormonal states by targeting the hypothalamus.

This dual-pathway architecture communicates both stable representations and efference copies of motor commands from lower to higher cortical areas.

Local lateral coupling in superficial layers vs long-distance lateral coupling in deep layers

Lateral neuron-level reciprocal excitatory connections are very informative about the kind of processing that occurs across the cortical sheet. Local lateral coupling in superficial layers highlights a regular lattice with a continuous elementary motif of about 0.5 mm. This is what I will refer to as a cortical column. Deep layers also exhibit lateral coupling, but over longer distances and in a less organized manner.

Lateral coupling supports persistent recurrent activity and facilitates reaching a common agreement: on stable representations between nearby cortical columns in superficial layers, and on motor-related information between distant locations within and across cortical areas in deep layers.

See details

Local Lateral Coupling in Superficial Layers

In superficial layers, neurons form local clusters that are strongly interconnected, typically spaced 0.5 mm apart across the cortical surface. Within these clusters, reciprocal excitatory connections are predominant. At a larger scale, these clusters form a hexagonal lattice pattern, with each hexagon representing a fundamental unit roughly 0.5 mm in size.

Interestingly, excitatory pyramidal neurons also exert localized inhibitory effects (via local inhibitory interneurons), resulting in a “locally winner-takes-all” (WTA) computation within the ~0.5 mm range. This structure implies that the overall hexagonal lattice could be modeled as a set of discrete, self-organized maps (SOMs), which interact to maintain a globally continuous representation.
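To make this concrete, here is a minimal toy sketch of such a locally-WTA lattice (all names and parameters are illustrative, and a square grid stands in for the hexagonal lattice for simplicity): each ~0.5 mm cluster stays active only if it beats every neighbor within its local inhibition range.

```python
import numpy as np

def local_wta(activations, inhibition_radius=1):
    """Toy local winner-takes-all over a 2D lattice of clusters.

    `activations` is a (rows, cols) array holding one scalar drive per
    ~0.5 mm cluster. A cluster remains active only if it beats every
    neighbor within `inhibition_radius` (its local inhibition range),
    yielding a sparse, evenly spaced activity pattern.
    """
    rows, cols = activations.shape
    winners = np.zeros_like(activations, dtype=bool)
    for r in range(rows):
        for c in range(cols):
            r0, r1 = max(0, r - inhibition_radius), min(rows, r + inhibition_radius + 1)
            c0, c1 = max(0, c - inhibition_radius), min(cols, c + inhibition_radius + 1)
            winners[r, c] = activations[r, c] >= activations[r0:r1, c0:c1].max()
    return winners

rng = np.random.default_rng(0)
print(local_wta(rng.random((6, 6))).astype(int))
```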

Long-Distance Lateral Coupling in Deep Layers

Neurons in deep cortical layers often form long-distance reciprocal connections. This pattern may be attributed to the inside-out development of the cortex, where early-born neurons in the deep layers create a scaffold that guides subsequent formation of superficial layers. These early deep-layer neurons help establish long-range connectivity that supports further cortical organization.

If deep layers are involved in processing motor-related signals, their long-distance connections may help align functional activity across regions. For example, different regions within the primary visual cortex (V1), or regions in V1 and V2, need to coordinate their activities—such as synchronizing visual inputs from multiple areas or aligning with shared processes like eye movements.

Note: The term “lateral” here is used abstractly to indicate connectivity between neurons in the same cortical layer. While the local lateral connections are physically horizontal, long-distance lateral connections involve axons leaving the cortex and re-entering through fiber tracts in the white matter.

A modular functional organization in 2D maps

In superficial layers, cortical activity organizes itself both structurally and functionally into a tangential continuous regular lattice whose fundamental units could be seen as elementary 2D maps of ~0.5mm diameter. Deep layers do not have such a structural organization, but may inherit such a functional organization from their radial interconnections with superficial layers. Extending this functional organization from superficial layers to the whole column leads to the concept of functional cortical columns.

Those elementary 2D maps encode “stable representations” of incoming features on a continuous 2D map: nearby neurons encode representations that share the same principal components but can still differ in their other components, such as their temporal context. Pinwheels in the primate primary visual cortex (V1) are examples of such elementary 2D maps.

See details

2D maps in V1

Many neurons in the superficial layers of V1 are orientation-selective, meaning they increase their firing rate for specific angles of visual line stimuli. In cats and primates, such neurons are organized continuously across a 2D anatomical map, forming pinwheel motifs.

In cats and primates, 2D maps seem to be strongly driven by orientation, but other variables may also be at play (spatial frequency, color, ocular dominance, direction, and motion speed). In mice, which do not have orientation columns, 2D maps could instead be driven by a combination of motion speed, direction, and spatial frequency following continuous local gradients.

2D maps in V4

Curvature domains in V4 of primate: see “Curvature domains in V4 of macaque monkey” https://elifesciences.org/articles/57261.pdf

Large-scale calcium imaging reveals a systematic V4 map for encoding natural scenes https://www.nature.com/articles/s41467-024-50821-z.pdf

2D maps in IT

Face patches in the inferior temporal (IT) cortex, where nearby neurons encode similar viewpoints of faces (frontal, profile or oblique views).

2D maps in EC / grid cells

“Grid cells in module 1 are represented in nine colors based on their spatial tuning phases and distributed in an anatomical phase lattice with six repeating units”

Source: “A Map-like Micro-Organization of Grid Cells in the Medial Entorhinal Cortex” (ScienceDirect)

2D maps elsewhere

Experimentalists haven’t found such continuous 2D maps everywhere in the cortex. That does not mean they do not exist. There are solid reasons why they may be hard to detect:

  • While such continuous features are stimulus-related in primary sensory cortices, they become more behavior-related in other cortices. Behavior is not that easy to test in experimental conditions.
  • When the feature space is multidimensional, the principal components shown in the 2D map are harder to explain (latent space is not a direct mapping to easy-to-understand features).

Grid cells organized in 2D maps representing their phase provide a remarkable example where the latent space aligns perfectly with a behavior that is both easy to measure and easy to understand: the animal’s position.

Feature recognition and path integration in 2D maps

Elementary 2D maps in superficial layers are driven by input features (e.g., raw visual stimuli) and contextual inputs (e.g., upcoming motor commands like head tilts) to learn both feature recognition and contextual feature binding. When the stream of input features stops or becomes unreliable, the learned binding drives the update of the 2D map via path integration. The path integration mechanism leverages the continuity both within 2D maps and at the borders between neighboring maps (explaining why we need several 2D maps in a single grid cell module, though this could be bypassed in a model by enforcing a toroidal topology).

See details

In the input features regime, the 2D map acts as a self-organizing map (SOM) that learns to cluster similar input features together on the map (not necessarily a classic Kohonen SOM).
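As a minimal sketch of this regime (a classic Kohonen-style update is used here purely for illustration; the biological map need not work this way, and all parameters are arbitrary):

```python
import numpy as np

def som_step(prototypes, x, lr=0.1, sigma=1.0):
    """One Kohonen-style update on a (rows, cols, dim) prototype grid.

    The best-matching unit (BMU) and its lattice neighbors move toward
    the input `x`, so similar features end up clustered at nearby map
    positions, as in the feature clustering described above.
    """
    dists = np.linalg.norm(prototypes - x, axis=2)
    bmu = np.unravel_index(dists.argmin(), dists.shape)
    rows, cols, _ = prototypes.shape
    for r in range(rows):
        for c in range(cols):
            lattice_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
            h = np.exp(-lattice_d2 / (2 * sigma ** 2))  # neighborhood kernel
            prototypes[r, c] += lr * h * (x - prototypes[r, c])
    return bmu

# Train an 8x8 map of 16-dimensional feature prototypes on random inputs.
rng = np.random.default_rng(0)
grid = rng.random((8, 8, 16))
for _ in range(100):
    som_step(grid, rng.random(16))
```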

However, we need to add other dimensions to this map in order to handle the contextual inputs regime. I hypothesize that each label on the map is modeled by a minicolumn of several neurons and follows an algorithm similar to Numenta’s Temporal Memory. If contextual inputs are reset or fuzzy, all neurons of the selected label activate. If only one neuron fires, it means that the map anticipated the next label update. Path integration is realized via those contextual updates. The main difference from the classic Temporal Memory algorithm is that nearby minicolumns represent features with similar principal components.

https://www.numenta.com/assets/pdf/temporal-memory-algorithm/Temporal-Memory-Algorithm-Details.pdf
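Schematically, the minicolumn rule I have in mind could be sketched as follows (a deliberate simplification for illustration, not Numenta’s actual implementation, which is detailed in the link above):

```python
def minicolumn_activation(winning_column, predicted_cells, cells_per_column=8):
    """Which cells fire in the winning minicolumn, given contextual predictions.

    `predicted_cells` holds (column, cell) pairs depolarized by context.
    A unique predicted cell means the map anticipated this update
    (path integration); no prediction means the whole minicolumn bursts.
    """
    predicted_here = [cell for (col, cell) in predicted_cells if col == winning_column]
    if predicted_here:
        return predicted_here             # context disambiguates: sparse firing
    return list(range(cells_per_column))  # reset/fuzzy context: all cells burst

# Example: column 3 wins and context predicted cell 5 there.
print(minicolumn_activation(3, {(3, 5), (7, 2)}))  # -> [5]
print(minicolumn_activation(4, {(3, 5)}))          # -> [0..7] burst
```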

Lateral voting between neighboring 2D maps

Neighboring elementary 2D maps are coupled to mutually influence and enrich each other’s feature recognition process (the ability to recognize more than what they could have sensed in their own receptive fields). Such mutual lateral voting via horizontal connections in superficial layers is locally restricted to a few millimeters (a few dozen direct neighbors at most).

This is materialized by the local lateral coupling interactions described before.
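A toy sketch of what such local voting could look like computationally (the belief vectors, neighbor lists, and coupling strength are all illustrative assumptions):

```python
import numpy as np

def lateral_vote(beliefs, neighbors, coupling=0.3):
    """One round of local voting between neighboring 2D maps.

    `beliefs[i]` is column i's probability vector over candidate
    features; `neighbors[i]` lists its (few dozen) laterally coupled
    columns. Each column mixes in the average of its neighbors' votes,
    then renormalizes, so ambiguous columns get pulled toward the
    local consensus.
    """
    new_beliefs = {}
    for i, b in beliefs.items():
        if neighbors[i]:
            neigh = np.mean([beliefs[j] for j in neighbors[i]], axis=0)
            mixed = (1 - coupling) * b + coupling * neigh
        else:
            mixed = b
        new_beliefs[i] = mixed / mixed.sum()
    return new_beliefs

# Two coupled columns: the uncertain one is pulled toward its neighbor.
beliefs = {0: np.array([0.5, 0.5]), 1: np.array([0.9, 0.1])}
print(lateral_vote(beliefs, {0: [1], 1: [0]}))
```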

Reverse path integration

Top-down control can force an activation on the 2D map. In this case, the path integration mechanism works in the other direction: it determines what motor command should be sent in order to obtain input features corresponding to this activation. For instance, top-down control can lock the current 2D map activations in V1 while watching a flying bird, enabling the generation of the head movements needed to track the bird. This “reverse path integration”, realized in deep layers, can be decomposed into two steps (a toy code sketch follows the note below):

  • First, find the contextual inputs that would lead to the forced activation on the 2D map.
  • Then, find the corresponding motor command to send to subcortical structures.

Those processes in deep layers are not confined to a cortical column: they also rely on cortico-thalamo-cortical loops and the long-distance lateral coupling in deep layers.
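Here is a toy sketch of those two steps. The `transition_model` and `motor_map` lookup tables are hypothetical stand-ins for associations learned during forward path integration; a real implementation would be distributed across deep layers and cortico-thalamo-cortical loops rather than confined to one column.

```python
def reverse_path_integration(current_state, target_state,
                             transition_model, motor_map):
    """Toy two-step reverse path integration.

    `transition_model[(state, context)] -> next_state` is what forward
    path integration learned; `motor_map[context] -> command` links
    contextual inputs to subcortical motor commands.
    """
    # Step 1: find the contextual input that moves the map to the target.
    for (state, context), nxt in transition_model.items():
        if state == current_state and nxt == target_state:
            # Step 2: translate that context into a motor command.
            return motor_map[context]
    return None  # no single-step context reaches the target

transitions = {("bird_left", "gaze_left"): "bird_centered"}
motors = {"gaze_left": "rotate_head_left"}
print(reverse_path_integration("bird_left", "bird_centered", transitions, motors))
```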

Abstract motor commands

Every fundamental unit of the cortex includes a set of pyramidal tract (PT) neurons that send motor commands to subcortical structures. For cortical areas often associated with abstract reasoning, such as the dorsolateral prefrontal cortex (dlPFC), the link to motor commands is less obvious. However, like those in other areas, PT neurons in the dlPFC project significantly to the cerebellum (via the pontine nuclei), which, in turn, projects back to the cortex (via the thalamus). This loop can operate without generating actual motor commands, effectively “repurposing” the cerebellum from its traditional motor-related role. Could this be a plausible explanation? It would enable path integration and reverse path integration on purely abstract concepts, such as applying abstract analogies to input features. Thoughts as movements.


Going back to the TBP

This is the high-level story of the cerebral cortex that currently makes the most sense to me. Although many aspects remain mysterious, I find this conceptual framework useful for interpreting experimental findings. Additionally, we could use it to make predictions and evaluate its robustness.

How does it align with the ideas of the TBP? Most of it is orthogonal to the TBP, which focuses more on the communication protocol between cortical columns than on biologically inspired modeling of a single cortical column. However, there are several points that could be interesting to explore:

  • Dimensionality of representations: If the hypothesis about continuous elementary 2D maps being central to path integration and reverse path integration is correct, then the sensorimotor capability of a single cortical column would be constrained by a low-dimensional bottleneck (the 2D map) between sensory inputs and motor outputs. This does not imply that superficial layers are incapable of recognizing or communicating high-dimensional representations. Such high-dimensional representations play a crucial role in enabling more efficient hierarchical or heterarchical processing, achieving similar outcomes in fewer steps. However, I view these high-dimensional representations as a secondary principle in cortical processing, built upon the foundational role of low-dimensional maps.
  • Degree of specialization of each cortical column: If we view the cortex as a multi-agent system where each agent possesses its own knowledge, the degree of knowledge overlap between agents becomes a fundamental parameter. This can be visualized as a continuum with two extremes: on one end, each agent is the sole expert in its domain; on the other, each agent models the entirety of the system’s knowledge. Where should we place the cursor on this spectrum? The locality of lateral coupling between cortical columns in superficial layers suggests a “local” voting process, where a cortical column collaborates primarily with only a few dozen neighboring columns. In contrast, from my understanding, the TBP leans further toward the other end of the continuum, as it implies that even a cortical column from V1 could recognize and contribute to identifying complex objects, such as a cup.

I would be glad to know what you think about this. Happy to discuss it more and clarify the points that are unclear. I wish I had more time to draw new illustrations to help the understanding, but I’ll try to add them in the future.

7 Likes

Oh, wow. This is one of the best cerebral-centric posts I’ve seen in a good long while. I’m on a similar journey as you, it seems, so I’m 100% bookmarking this. Also, I’ve only just now begun skimming, but let me see if I don’t have any thoughts/questions regarding this. One moment…

Thoughts & Notes:
…It seems like your intention was to focus mostly on the neocortex, though you made mention of the cerebellum towards the end there. If we’re doing that, we’ll probably want to also make mention of the hippocampal complex/trisynaptic circuit too. To my understanding, where the thalamus/pulvinar provide a ‘global attention’ mechanism to the brain, the hippocampal complex provides its global positioning relative to the environment (compared to, say, each cortical column providing sensor-specific positioning).

If you decided to include a section on the hippocampal complex’s functions, you’d probably want to make special mention not only of its spatial mapping via EC/hippocampus-proper, but also pattern separation/completion through CA3 > CA1, as well as its indexing of columnar space.

I tried mapping rudimentary information flows throughout the space some time back. Perhaps you’ll find it helpful?

Here are some hippo-specific papers I found useful. You may like them too:
http://people.whitman.edu/~herbrawt/hippocampus.pdf

Also, fun fact #1: Ever wonder why the brain uses hexagonal maps instead of something like Cartesian? Well, a circle has the lowest perimeter-to-area ratio but cannot tessellate to form continuous grids. Hexagons are the most circular-shaped polygon that can tessellate to form an evenly spaced grid. As such, this circularity allows hexagonal grids to represent curves in the patterns of data much more naturally than a square grid would.
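For the curious, this checks out with a little geometry. For a fixed enclosed area $A$, the perimeter required is approximately:

$$
P_{\text{circle}} = 2\sqrt{\pi A} \approx 3.545\sqrt{A}
\;<\;
P_{\text{hexagon}} = 6\sqrt{\tfrac{2A}{3\sqrt{3}}} \approx 3.722\sqrt{A}
\;<\;
P_{\text{square}} = 4\sqrt{A}
$$

So among shapes that tile the plane, the regular hexagon comes closest to the circle’s optimum.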

…Regarding global attentional/predictive mechanisms via cortico-thalamic looping, you might find this useful:
https://direct.mit.edu/jocn/article/33/6/1158/98116/Deep-Predictive-Learning-in-Neocortex-and-Pulvinar

…In regards to your “Abstract motor commands” section, the way it’s been explained to me is that the cerebellum gets used to process egocentric-to-allocentric transforms of information. As an example, let’s imagine that you are in a pack of wolves hunting prey, and you need to understand both where the prey is and where your packmates are relative to you and that prey. Your senses (ala cortical columns) don’t feed you the information necessary to understand where all these pieces are and make global predictions about them, particularly relative to the terrain. To deal with this, organisms create a generic map which puts all that information into a top-down view, by extracting “you” out of it and making assumptions about the behavior of all the other important parts (think of the difference between a first-person shooter and a top-down real-time strategy video game). That’s the cerebello-cerebral loop in a nutshell.

Fun Fact #2: There are two classes of animals which tend to have insane cerebellums: animals which use sensory methods other than vision to navigate space (like electricity for elephant fish or sound for bats/dolphins) and hyper-social animals.

Oh, also! You had mentioned the pontine pathway. If you do that, I think you’ll also want to make note of the inferior olive pathway too, as it carries error signalling and timing information away from the cerebellum back to the fourth ventricle, where it integrates back up to broader cortical space. The pons/olive mechanics kind of need one another to function properly, at least in my eyes.

…As to the “degree of specialization” topic found at the very end, I would say it’s a bit of both. Ultimately, the column’s input dictates its domain. Though I wouldn’t go so far as to say sole expert. Cohorts of expertise, maybe. To a column, an input is an input. It doesn’t really matter where that input is sourced from; it all gets processed in the same fashion. In this way, a column from one modality may actually be made to process the input from an entirely different input stream. This is one reason why people who suffer from things like brain aneurysms can eventually recover.

That said, there are neural mechanics which seem to lead to expertise-like behaviors within columns. Given enough world data, neuronal populations become extremely good at processing certain kinds of input. For instance, the columns of lower-level layers of the neocortex may start by understanding only basic inputs. For our example, let’s say they can only process individual characters; then the next layer may process entire words, then sentences, and finally whole sections of writing.

But with enough time, understanding that is typically reserved for higher up the convolutional hierarchy may get learned by lower levels and thus delegated down to those layers. Now the bottom-most layers, which originally started by only processing characters, can now process words > sentences > whole chunks. This is what literal expertise in the brain is. Greater consolidation of data at lower levels leads to fewer hops between processing units and faster motor-behavioral response times.

Questions:

  • Have you already worked signal interlacing into your model of cortical processing? I’ve been trying to wrap my head around how a system that has more than two broad sensors (e.g. eyes) might interlace their initial inputs.

  • Are you able to expand on the “dimensionality of representations” point you made there towards the end? Because the way I’ve internalized it is that the cortex seeks to reduce high-dimensional representations to avoid a kind of biological “curse of dimensionality,” as we sometimes get in ML. My intuition says to view something like the default mode network almost as a kind of resting-state manifold learner in this regard. But yeah, I’d love to talk more on this point specifically.

4 Likes

Thanks @HumbleTraveller, for sharing your thoughts and the links!

In my previous post, I aimed to keep the discussion as general as possible regarding the cerebral cortex. While I do have ideas about the hippocampus, basal ganglia, and cerebellum, these are independent of the thalamo-cerebral-centric concepts described here.

I haven’t conducted simulations on signal interlacing, but here’s how I would approach it conceptually if I understand your question correctly.

The receptive field of a cortical column is dynamic, not fixed or sharply delimited. Each cortical column can potentially sample inputs from a larger receptive field, encompassing the union of receptive fields of neighboring columns. If an input does not contribute meaningfully to object recognition, it is discarded and replaced with a new one.

When applied to columns in V1, inputs from the same eye are more likely to cluster together because they provide more coherent and related information for recognizing objects. This clustering could lead to the emergence of ocular dominance columns.

In the context of a classic Self-Organizing Map (SOM), one could evaluate the usefulness of inputs using the standard deviation of each value in the prototype vectors. Inputs with usefulness below a predefined threshold would be periodically discarded. The discarded inputs could then be replaced through a sampling process, either random or biased. For example, an input discarded by one column might be a strong candidate for another column, which could promote the specialization of neighboring columns.
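As a sketch of that heuristic (the threshold and array shapes here are illustrative): an input dimension whose values barely vary across a column’s prototype vectors carries little discriminative information and becomes a candidate for replacement.

```python
import numpy as np

def prune_inputs(prototypes, threshold=0.05):
    """Flag low-usefulness input dimensions of one column's SOM.

    `prototypes` is (n_units, n_inputs): the column's prototype vectors.
    An input whose values have low standard deviation across prototypes
    does not help separate clusters, so it is a candidate for periodic
    replacement by an input sampled from neighboring receptive fields.
    """
    usefulness = prototypes.std(axis=0)            # per-input spread
    return np.flatnonzero(usefulness < threshold)  # indices to discard

protos = np.array([[0.9, 0.50, 0.1],
                   [0.1, 0.52, 0.9],
                   [0.5, 0.51, 0.5]])
print(prune_inputs(protos))  # -> [1]: input 1 is nearly constant
```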

My reasoning is that path integration and reverse path integration are more computationally efficient and provide better generalization when performed in low-dimensional spaces. By operating in a reduced-dimensionality framework, the system avoids unnecessary complexity, which aligns with the biological need for energy efficiency and robustness.

That said, high-dimensional representations are still valuable for certain aspects of cortical processing (richer feature encoding, hierarchical/heterarchical processing, recognizing patterns…). However, when it comes to motor output, it seems reasonable to assume that the commands generated by individual cortical columns don’t require very high-dimensional representations. Motor actuators typically operate in a constrained space (e.g., joint angles, muscle activations), and low-dimensional representations might be sufficient to guide them effectively.

1 Like

Thank you for your response @mthiboust. I’ve been looking forward to learning more about your framework/thoughts!

I want to let you know that I’m currently going through your ebook (much impressed btw). I’m about half-way through it now and will try to shape my language and frame-of-mind to it in our continued discussions. Also, duly noted on constraining the discussion to cortico-thalamic specific concepts. I’ll do my best to curb my excitement! :grimacing:

To start, the way you describe path integration and reverse path integration reminds me a lot of a dueling-systems type approach to behavior. Specifically, one in which the process of path integration represents a glutamate-based feedforward system (which is sensory/allocentrically focused). This process/system would then provide to us our input mapping.

The second system then, your process of reverse path integration, would be a GABA-based feedback one (one which is more salience focused and egocentric in its processing). This would provide us our context mapping.

These two mappings then (input + context) would integrate thalamically, driving behavioral response (ergo your resulting motor-command signalling). Is this at all aligned with your framework’s perspective?

Also, if visualizing helps, I did try to diagram these interactions out:

Note: Please try to view “Higher-order consciousness” as shown in the above diagram as something akin to metacognition. The diagrams are strongly influenced by Edelman, who used the term somewhat loosely.

Note 2: I don’t find your views on abstract motor-commands implausible at all. In fact, I was under the impression that was the working consensus; that there was no functional difference between motor-action and thought. Not so much “thoughts as movement,” but rather “thoughts are movement.”

But anyways, if the above is even remotely close to how your framework perceives cognitive processing, then our views on it may be closely aligned. Something I’ve been trying to understand, however, is precisely how a column works to reduce this dimensional complexity. @nleadholm had suggested that perhaps this function emerges naturally, through scale. But what are your thoughts here? Do you have a working hypothesis on how this reduction might be achieved?

1 Like

Thank you very much @mthiboust for your detailed post. I thought it was a nice overview of a lot of neuroscience, and it was interesting to get a bit more information on your ideas around path integration in the superficial layers of cortex. You’ve clearly put an impressive amount of thought into your book.

On the path integration point, we currently hold a different view on the purpose of path integration in cortical columns. Rather than supporting path integration through a 2D feature space, we believe that individual cortical columns support path integration through object-centric reference frames, and this is reflected in how we have built Monty. As such, I would disagree with “[the TBP] focuses more on the communication protocol between cortical columns than on biologically inspired modeling of a single cortical column”. While we currently don’t use components like HTM within our learning modules, their internal design and operation are definitely informed by how we believe individual cortical columns function. The neuroscience basis of this is covered in detail in Lewis et al., 2019. You might also find our FAQ interesting, for example on what exactly we think an object in V1 would look like. The discussion under Alternative Approaches to Intelligence also highlights the importance of object-centric representations to human intelligence, which other forms of AI fail to establish. This is one of the key reasons for object-centric reference frames within a cortical column, and it’s not clear to me how such representations would emerge with the 2D feature maps you describe, given that CNNs form similar maps.

I would find it helpful if you were able to expand on the path integration properties you describe in more detail, as that seems like the key proposal. Some exploratory questions:

  • How can this actually be used to predict inputs in the world based on movement of a sensor, if they are operating over a 2D feature space rather than the space of an object?
  • What kind of path integration is actually supported, i.e. novel, never before-seen paths?
  • Do all edges between possible feature pairs need to be visited in order for path integration to work?
  • If you are currently observing a feature that has been seen within many objects, how do you correctly predict the next feature, without requiring a combinatorial number of associative connections?
  • What if the eye makes a large saccade across an object, rather than a small one?

Looking forward to discussing it further. It might be helpful to include a diagram like the ones below from Hawkins et al., 2017 and Lewis et al., 2019, but showing your proposal:


@HumbleTraveller re. dimensionality, thanks for your question. I would summarize our view as cortical columns build representations using low-dimensional (3D or less) reference frames. Features that are bound to locations within a reference frame can, however, be very high dimensional, such as SDRs. These would both be relatively innate properties of the microcircuit anatomy. I don’t recall what I’ve said about scale on this question, but I’m happy to clarify if you can link to that post.

3 Likes

Hey there @nleadholm,

This was from a while ago, but let me think. In another post (I can’t recall which one), I had asked you about the plausibility of the Default Mode Network serving as a kind of Resting-state Manifold learner, a function present in all cortical columns.

You had then suggested that perhaps the ‘self-reflective characteristics’ of the DMN might be an emergent phenomenon, appearing naturally through scale. However, I think I disagree with this. I disagree because I think what’s happening during DMN activation is in fact happening all over the cortex; however, we’re more acutely aware of its functioning in the PFC regions, as those particular regions aren’t constantly burdened with processing incoming sensory information, as, say, the parietal lobe might be.

My thinking is that the very same mechanics which lead to things like theory of mind also lead to things like route consolidation in navigation-based memory tasks, as discussed by Yang et al.

But moreover, I was wondering if the same mechanics that might lead you to categorize seemingly disparate social representations into a single, simplified categorical representation (Ex: Vivian; Jeff → “colleagues”) weren’t also responsible for doing the same with non-social representations (Ex: Coffee cup; Wine glass → “vessel”). This then led to my question: how does the brain come to reduce known models (which may contain arbitrarily high feature sets) into a lower-dimensional representation?

Thanks for clarifying.

Re. the DMN, yes I also agree that whatever the DMN reflects (modeling of internal states etc) is happening in each column. By scale I only meant that a Monty system would need to be large-scale for it to have system-wide activity that looked like the DMN in the brain - since the latter is by definition an observation of large-scale dynamics. Hope that’s clear.

Re. categorization: this is an interesting question and something you will generally hear us discuss as a question of “categorization”, rather than using the word dimensionality (given the latter applies in so many settings). We are shifting the focus of our research to unsupervised learning (see the Future Work entries), and one thing we expect to observe is the merging of morphologically similar objects (e.g. different coffee cups) into single models. However, categorization is a complex thing with aspects beyond just morphology. For example, the commonality between coffee cups and wine glasses is more about affordances and their ability to hold fluids than it is about their morphology. Getting such grouping working is likely going to depend on us developing a better understanding of object behaviors, which is a work in progress.

1 Like

Gotcha. By “dimensionality,” I was referring to the feature space which helps define a model, not necessarily the model itself.

I feel like “possessing a morphology” would almost be a category unto itself. The way you guys are approaching modelling space leaves room for a lot of flexibility, which is nice. I talked about it briefly in that one policy post of mine, but I feel like you can probably handle most representations through the lens of ‘features, models and movement.’ Ultimately, I think if you guys really focus on getting Monty’s interactions with “physical space” right, all the other more abstract spaces we can imagine (linguistic space, mathematics, social network space) will kind of just be solved emergently.

Also, I’m not sure if you’ll find this helpful, but there was a paper I was looking into some time back regarding the Anterior Temporal and Posterior Medial systems and how they might be used to assign value to objects within a given scene—essentially a framework for memory-guided behavior. I was going to use it to help inform how that “Salience Mapping” policy of mine from the other post might work, but maybe it would help you guys in brainstorming your approach to unsupervised learning? Anyways, if you’re interested:

Out of curiosity, how far into implementing unsupervised learning are you?

3 Likes

Great conversations here!
@HumbleTraveller you can follow our progress on unsupervised learning by keeping an eye on our benchmark experiment results for this here: https://thousandbrainsproject.readme.io/docs/benchmark-experiments#unsupervised-learning
From the beginning, we designed Monty to learn continually and without supervision (as outlined here: How Learning Modules Work).
If you want to test the unsupervised learning capabilities yourself or just read a bit more about it, we have a tutorial specifically on that: Unsupervised Continual Learning
If you are curious about our next steps to improve performance in unsupervised setups, you can have a look at our roadmap: Monty Project Overview - Google Sheets (the fields colored in pink).
Hope this helps!

2 Likes