I am glad to see such an inspiring and open initiative like the TBP, and I hope it will drive a shift in interest from current passive AI to bio-inspired sensorimotor AI, ultimately helping us better understand the brain. During my exciting journey in understanding the brain, I’ve developed a personal perspective that sometimes diverges from the TBP viewpoint, even if I was largely inspired by Numenta ideas over the years. I believe it would be helpful to share my current understanding of the cerebral cortex to explore how we can mutually learn from our respective perspectives. This text is a work-in-progress, and I plan to continue refining it. It includes illustrations from the ebook I published in 2020, which contains the early seeds of these ideas. I would warmly welcome any critiques that help me refine or change my ideas.
Making sense of the cerebral cortex
Biology is inherently complex, with no set of simple rules that are perfectly adhered to. Exceptions are common, making counterexamples to some of the ideas presented here inevitable. The real challenge is to discern the fundamental principles from the irrelevant variations. Here are some speculations about such fundamental principles:
Unified mechanisms in isocortical and mesocortical areas
The same core mechanisms are at play in mammalian isocortical and mesocortical areas (allocortex, such as the olfactory cortex, subiculum & hippocampus, should be considered separately). In particular, the entorhinal cortex (where grid cells are found) follows the same generic mechanisms, which allows us to map its well-researched path integration mechanism directly onto other cortical areas.
See details
Even in the neocortical 6-layer structure, there are significant variations along a granular-agranular axis. Compared to agranular cortices, granular cortices have a smaller thickness, a greater neuron density & number, and a large granular L4 giving them their name.
The composition of some layers also differs. Granular cortices tend to have a greater proportion of stellate cells than pyramidal cells in L2/3. Moreover, the nature of L4 cells is not the same in the primary visual cortex and the somatosensory cortex, two granular cortices.
The variation from granular to agranular forms a continuum across the cortex:
- Granular cortices for primary sensory areas (in red in the figure)
- Less granular cortices for higher sensory areas (in yellow)
- Even less granular cortices for associative and high-order areas (green & blue)
- Agranular cortices for motor areas (purple)
Those differences could be seen as variations of the same unified mechanisms with different hyperparameters (ratio of feature vs contextual inputs, spread of local inhibition, …).
The canonical cortical microcircuit could be roughly depicted as follows:
Main Processing Flows Between Cortical Areas
Cortical areas receive primary input mainly in layer 4, which is a granular layer located between the superficial and deep layers. These inputs come from two primary sources: the thalamus and other cortical areas. Notable exceptions are primary sensory cortices such as V1 (visual), S1 (somatosensory), and A1 (auditory), which receive these inputs exclusively from the thalamus. The processing between cortical areas follows two parallel pathways: a cortico-cortical pathway to communicate stable representations and a cortico-thalamo-cortical pathway to communicate motor commands.
See details
Cortico-Cortical Pathway to communicate Stable Representations
Superficial layers of one cortical area project to layer 4 of another cortical area and receive reciprocal projections in return. This reciprocal projection pattern forms a hierarchical communication network between cortical regions, but it doesn’t strictly follow levels of abstraction. For example, the prefrontal cortex, which is often associated with abstract reasoning, sends projections to the premotor cortex, involved in motor planning, as if the premotor cortex were higher in the hierarchy.
Cortico-Thalamo-Cortical Pathway to communicate motor commands
Layer 5 pyramidal tract (L5 PT) neurons in deep layers provide the cortical output targeting motor-related structures. Interestingly, these neurons also send “efference copies” of those motor commands to the thalamus, which in turn projects to a higher-level cortical area, following a hierarchy similar to that of the cortico-cortical pathway.
Note: The term “motor command” is used broadly here to refer to all outputs from the cortex to subcortical structures (excluding the thalamus and basal ganglia that receive other cortical output, see later). In some instances, this could be a direct motor command sent to motor neurons in the spinal cord (e.g., L5 PT cells of the motor cortex). It could also represent a desired goal sent to intermediary motor structures, such as the superior colliculus, which manages many raw eye movements. Beyond motor actions, it could also involve commands for internal actions, such as altering hormonal states by targeting the hypothalamus.
This dual-pathway architecture communicates both stable representations and efference copies of motor commands from lower to higher cortical areas.
Local lateral coupling in superficial layers vs long-distance lateral coupling in deep layers
Lateral neuron-level reciprocal excitatory connections are very informative about the kind of processing that occurs across the cortical sheet. Local lateral coupling in superficial layers highlights a regular lattice with a continuous elementary motif of about 0.5 mm. This is what I will refer to as a cortical column. Deep layers also exhibit lateral coupling, but over longer distances and in a less organized manner.
Lateral coupling supports persistent recurrent activity and facilitates reaching a common agreement on stable representations between nearby cortical columns in superficial layers, and on motor-related information between distant intra and inter-cortical areas in deep layers.
See details
Local Lateral Coupling in Superficial Layers
In superficial layers, neurons form local clusters that are strongly interconnected, typically spaced 0.5 mm apart across the cortical surface. Within these clusters, reciprocal excitatory connections are predominant. At a larger scale, these clusters form a hexagonal lattice pattern, with each hexagon representing a fundamental unit roughly 0.5 mm in size.
Interestingly, excitatory pyramidal neurons also exhibit localized inhibitory effects, resulting in a “locally winner-takes-all” (WTA) computation within the ~0.5 mm range. This structure implies that the overall hexagonal lattice could be modeled as a set of discrete, self-organized maps (SOMs), which interact to maintain a globally continuous representation.
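To make the “locally winner-takes-all” idea concrete, here is a minimal sketch. It is only an illustration under simplifying assumptions: it uses a square grid instead of a hexagonal lattice, a hard maximum instead of inhibitory dynamics, and the names `local_wta` and `radius` are hypothetical.

```python
def local_wta(activity, radius=1):
    """Keep a unit's activity only if it is the maximum within `radius`
    grid steps (square neighbourhood); every other unit is silenced.
    Assumption: a square grid stands in for the hexagonal lattice."""
    rows, cols = len(activity), len(activity[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the activity of all units within the local range.
            neighbourhood = [
                activity[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            # A unit survives only if it is active and locally maximal.
            if activity[r][c] > 0 and activity[r][c] == max(neighbourhood):
                out[r][c] = activity[r][c]
    return out


activity = [
    [0.1, 0.9, 0.2, 0.0],
    [0.3, 0.4, 0.1, 0.0],
    [0.0, 0.0, 0.0, 0.8],
]
winners = local_wta(activity, radius=1)
```

With `radius=1`, two separate winners survive in this toy example (one per local neighbourhood), which is the difference from a global winner-takes-all. Exact ties are all kept in this sketch.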
Long-Distance Lateral Coupling in Deep Layers
Neurons in deep cortical layers often form long-distance reciprocal connections. This pattern may be attributed to the inside-out development of the cortex, where early-born neurons in the deep layers create a scaffold that guides subsequent formation of superficial layers. These early deep-layer neurons help establish long-range connectivity that supports further cortical organization.
If deep layers are involved in processing motor-related signals, their long-distance connections may help align functional activity across regions. For example, different regions within the primary visual cortex (V1), or between V1 and V2 need to coordinate their activities—such as synchronizing visual inputs from multiple areas or aligning with shared processes like eye movements.
Note: The term “lateral” here is used abstractly to indicate connectivity between neurons in the same cortical layer. While the local lateral connections are physically horizontal, long-distance lateral connections involve axons leaving the cortex and re-entering through fiber tracts in the white matter.
A modular functional organization in 2D maps
In superficial layers, cortical activity organizes itself both structurally and functionally into a tangential continuous regular lattice whose fundamental units could be seen as elementary 2D maps of ~0.5 mm diameter. Deep layers do not have such a structural organization, but may inherit such a functional organization from their radial interconnections with superficial layers. Extending this functional organization from superficial layers to the whole column leads to the concept of functional cortical columns.
Those elementary 2D maps encode “stable representations” of upcoming features on a continuous 2D map: nearby neurons encode representations that share the same principal components but can still differ in their other components like their temporal context. Pinwheels in primate primary visual cortex V1 are examples of such elementary 2D maps.
See details
2D maps in V1
Many neurons in superficial layers of V1 are orientation-selective, meaning they increase their firing rate for specific angles of visual line stimuli. In cats and primates, such neurons are anatomically continuously organized on a 2D map forming pinwheel motifs.
In cats and primates, 2D maps seem to be strongly driven by orientation, but other variables may also be at play (spatial frequency, color, ocular dominance, direction and motion speed). In mice, which do not have orientation columns, 2D maps could be driven by a combination of motion speed, direction and spatial frequency that follow continuous local gradients.
2D maps in V4
Curvature domains in V4 of primate: see “Curvature domains in V4 of macaque monkey” https://elifesciences.org/articles/57261.pdf
Large-scale calcium imaging reveals a systematic V4 map for encoding natural scenes https://www.nature.com/articles/s41467-024-50821-z.pdf
2D maps in IT
Face patches in the inferior temporal (IT) cortex, where nearby neurons encode similar viewpoints of faces (frontal, profile or oblique views).
2D maps in EC / grid cells
“Grid cells in module 1 are represented in nine colors based on their spatial tuning phases and distributed in an anatomical phase lattice with six repeating units”
Source: “A Map-like Micro-Organization of Grid Cells in the Medial Entorhinal Cortex” (ScienceDirect)
2D maps elsewhere
Experimentalists haven’t found such continuous 2D maps everywhere in the cortex. It does not mean that those do not exist. There are solid reasons why those may be hard to detect:
- While such continuous features are stimulus-related in primary sensory cortices, they become more behavior-related in other cortices. Behavior is not that easy to test in experimental conditions.
- When the feature space is multidimensional, the principal components shown in the 2D map are harder to explain (latent space is not a direct mapping to easy-to-understand features).
Grid cells organized in 2D maps representing their phase provide a remarkable example where the latent space aligns perfectly with a behavior that is both easy to measure and easy to understand: the animal’s position.
Feature recognition and path integration in 2D maps
Elementary 2D maps in superficial layers are driven by input features (e.g. raw visual stimuli) and contextual inputs (e.g. upcoming motor commands like head tilts) to learn both feature recognition & contextual feature binding. When the stream of input features stops or becomes unreliable, the learned binding provides the update on the 2D map via path integration. The path integration mechanism leverages the continuity within each 2D map and across the borders between neighboring maps (which would explain why a single grid cell module needs several 2D maps; in a model, this could be bypassed by enforcing a toroidal topology).
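A minimal sketch of the two regimes described above, assuming a single active position on a discrete map with an enforced toroidal topology. The class `ToroidalMap`, its methods, and the context names are hypothetical, and the "binding" is reduced to remembering the last map shift observed for each contextual input.

```python
class ToroidalMap:
    """Toy 2D map with wrap-around borders and a single active position."""

    def __init__(self, size):
        self.size = size
        self.pos = (0, 0)
        self.binding = {}  # contextual input -> learned shift (dx, dy) on the map

    def _shortest_shift(self, a, b):
        # Signed shortest displacement from a to b along one wrapped axis.
        d = (b - a) % self.size
        return d - self.size if d > self.size // 2 else d

    def step(self, context, sensed_pos=None):
        if sensed_pos is not None:
            # Input features regime: the features set the map position,
            # and the shift that came with this context is bound to it.
            shift = tuple(
                self._shortest_shift(a, b) for a, b in zip(self.pos, sensed_pos)
            )
            self.binding[context] = shift
            self.pos = tuple(sensed_pos)
        else:
            # No reliable features: path integration from the learned binding.
            dx, dy = self.binding.get(context, (0, 0))
            self.pos = ((self.pos[0] + dx) % self.size,
                        (self.pos[1] + dy) % self.size)
        return self.pos


m = ToroidalMap(size=5)
m.step("tilt_right", sensed_pos=(1, 0))  # features available: learn the binding
m.step("tilt_up", sensed_pos=(1, 1))
trajectory = [m.step("tilt_right") for _ in range(4)]  # features missing
```

In this toy run the position keeps moving along the map from context alone and wraps around at the border, which is the role the toroidal topology plays in place of several neighboring maps.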
See details
In the input features regime, the 2D map acts as a self-organizing map (SOM) that learns to cluster similar input features together on the map (not necessarily classic Kohonen SOM).
However, we need to add other dimensions to this map in order to handle the contextual inputs regime. I hypothesize that each label on the map is modeled by a minicolumn of several neurons and follows an algorithm similar to Numenta’s Temporal Memory. If contextual inputs are reset or fuzzy, all neurons of the selected label activate. If only one neuron fires, it means that the map anticipates the next label update. Path integration is realized via those contextual updates. The main difference with the classic Temporal Memory algorithm is that nearby minicolumns represent features with similar principal components.
https://www.numenta.com/assets/pdf/temporal-memory-algorithm/Temporal-Memory-Algorithm-Details.pdf
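The minicolumn activation rule hypothesized above can be sketched as follows. This only illustrates the "predicted cells fire alone, otherwise the whole minicolumn fires" step; it is not Numenta's implementation, it leaves out learning entirely, and the function name and the `(label, cell_index)` encoding are assumptions made for the example.

```python
def activate_minicolumn(label, predicted_cells, cells_per_column=4):
    """Return the active cells of the minicolumn selected on the map.

    `predicted_cells` is a set of (label, cell_index) pairs that the
    contextual inputs had put in a predictive state beforehand.
    """
    predicted_here = sorted(c for (l, c) in predicted_cells if l == label)
    if predicted_here:
        # The context anticipated this label: only the predicted cells fire.
        return [(label, c) for c in predicted_here]
    # Context reset or fuzzy: every cell of the minicolumn fires.
    return [(label, c) for c in range(cells_per_column)]


no_context = activate_minicolumn("edge_45deg", predicted_cells=set())
with_context = activate_minicolumn(
    "edge_45deg", predicted_cells={("edge_45deg", 2), ("edge_90deg", 0)}
)
```

Here `no_context` contains all four cells of the selected minicolumn, while `with_context` contains only cell 2, the one the context had predicted.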
Lateral voting between neighboring 2D maps
Neighboring elementary 2D maps are coupled to mutually influence and enrich their feature recognition process (ability to recognize more than what they could have sensed in their own receptive field). Such mutual lateral voting via horizontal connections in superficial layers is locally restricted to a few millimeters (a few dozen direct neighbors at most).
This is materialized by the local lateral coupling interactions described before.
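As a rough sketch of what one round of such local voting could look like, assume each map holds a score per candidate feature and only listens to its direct neighbors. The additive rule, the `weight` parameter, and all names below are assumptions chosen for illustration, not a claim about the actual cortical computation.

```python
def vote(beliefs, neighbours, weight=0.5):
    """One round of lateral voting restricted to direct neighbours.

    beliefs:    {map_id: {label: score}}
    neighbours: {map_id: [map_id, ...]} (local coupling only)
    """
    updated = {}
    for map_id, own in beliefs.items():
        combined = dict(own)
        near = neighbours.get(map_id, [])
        for n in near:
            for label, score in beliefs[n].items():
                # Each neighbour adds a share of its own evidence.
                combined[label] = combined.get(label, 0.0) + weight * score / len(near)
        total = sum(combined.values()) or 1.0
        # Renormalise so that each map's scores sum to 1.
        updated[map_id] = {label: s / total for label, s in combined.items()}
    return updated


beliefs = {
    "a": {"edge": 0.5, "corner": 0.5},  # ambiguous on its own
    "b": {"edge": 0.9, "corner": 0.1},  # confident neighbour
    "c": {"edge": 0.2, "corner": 0.8},  # too far away to be coupled
}
neighbours = {"a": ["b"], "b": ["a"], "c": []}
after = vote(beliefs, neighbours)
```

After one round, map "a" leans toward "edge" thanks to its neighbor "b", while map "c", which has no local neighbors here, keeps its own estimate.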
Reverse path integration
Top-down control can force an activation on the 2D map. In this case, the path integration mechanism works in the other direction: it determines which motor command should be sent in order to obtain the input features corresponding to this activation. For instance, top-down control can lock the current 2D map activations in V1 while watching a flying bird, enabling the generation of necessary head movements to track the bird. This “reverse path integration” realized in deep layers can be decomposed into two steps:
- First, find the contextual inputs that would lead to the forced activation on the 2D map.
- Then, find the corresponding motor command to send to subcortical structures.
Those processes in deep layers are not confined to a cortical column: they also rely on cortico-thalamo-cortical loops and the long-distance lateral coupling in deep layers.
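The two steps above can be sketched on a toy toroidal map. Everything here is an assumption made for illustration: the learned binding is a plain table from contextual input to map shift, the second table from context to motor command is hypothetical, and the search is a brute-force minimum rather than anything loop-based.

```python
def torus_distance(a, b, size):
    """City-block distance between two positions on a wrapped 2D map."""
    return sum(min((x - y) % size, (y - x) % size) for x, y in zip(a, b))


def reverse_path_integration(current, target, binding, motor_for_context, size):
    """Return (context, motor_command) that best moves `current` toward `target`."""

    def landing(shift):
        # Where the map activation would end up if this shift were applied.
        return tuple((p + s) % size for p, s in zip(current, shift))

    # Step 1: find the contextual input leading closest to the forced activation.
    context = min(
        binding,
        key=lambda ctx: torus_distance(landing(binding[ctx]), target, size),
    )
    # Step 2: look up the motor command associated with that context.
    return context, motor_for_context[context]


binding = {"tilt_left": (-1, 0), "tilt_right": (1, 0), "tilt_up": (0, 1)}
motor_for_context = {
    "tilt_left": "neck_left",
    "tilt_right": "neck_right",
    "tilt_up": "neck_up",
}
choice = reverse_path_integration(
    current=(4, 2), target=(0, 2),
    binding=binding, motor_for_context=motor_for_context, size=5,
)
```

In this example the forced activation sits just across the wrapped border, so the sketch selects the "tilt_right" context and its associated motor command.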
Abstract motor commands
Every fundamental unit of the cortex includes a set of pyramidal tract (PT) neurons that send motor commands to subcortical structures. For cortical areas often associated with abstract reasoning, such as the dorsolateral prefrontal cortex (dlPFC), the link to motor commands is less obvious. However, like other areas, PT neurons in the dlPFC project significantly to the cerebellum (via the pontine nuclei), which, in turn, projects back to the cortex (via the thalamus). This loop can operate without generating actual motor commands, effectively “repurposing” the cerebellum from its traditional motor-related role. Could this be a plausible explanation? This would enable path integration and reverse path integration on purely abstract concepts, such as applying abstract analogies on input features. Thoughts as movements.
Going back to the TBP
This is the high-level story of the cerebral cortex that currently makes the most sense to me. Although many aspects remain mysterious, I find this conceptual framework useful for interpreting experimental findings. Additionally, we could use it to make predictions and evaluate its robustness.
How does it align with the ideas of the TBP? Most of it is orthogonal to the TBP, which focuses more on the communication protocol between cortical columns than on biologically inspired modeling of a single cortical column. However, there are several points that could be interesting to explore:
- Dimensionality of representations: If the hypothesis about continuous elementary 2D maps being central to path integration and reverse path integration is correct, then the sensorimotor capability of a single cortical column would be constrained by a low-dimensional bottleneck (the 2D map) between sensory inputs and motor outputs. This does not imply that superficial layers are incapable of recognizing or communicating high-dimensional representations. Such high-dimensional representations play a crucial role in enabling more efficient hierarchical or heterarchical processing, achieving similar outcomes in fewer steps. However, I view these high-dimensional representations as a secondary principle in cortical processing, built upon the foundational role of low-dimensional maps.
- Degree of specialization of each cortical column: If we view the cortex as a multi-agent system where each agent possesses its own knowledge, the degree of knowledge overlap between agents becomes a fundamental parameter. This can be visualized as a continuum with two extremes: on one end, each agent is the sole expert in its domain; on the other, each agent models the entirety of the system’s knowledge. Where should we place the cursor on this spectrum? The locality of lateral coupling between cortical columns in superficial layers suggests a “local” voting process, where a cortical column collaborates primarily with only a few dozen neighboring columns. In contrast, from my understanding, the TBP leans further toward the other end of the continuum, as it implies that even a cortical column from V1 could recognize and contribute to identifying complex objects, such as a cup.
I would be glad to know what you think about this. Happy to discuss it more and clarify the points that are unclear. I wish I had more time to draw new illustrations to help the understanding, but I’ll try to add them in the future.