Thanks for the clarification; I understand your reasoning better now. It seems that motion parallax would be a good candidate for computing depth information inside a visual LM without relying on the other LMs. I guess this capability will come later, once temporal dynamics are integrated into your algorithm, but I understand that, as a first step, you can already use a temporary shortcut by feeding in the depth information directly.
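To make concrete what I mean by motion parallax, here is a minimal sketch under strong simplifying assumptions that are mine, not part of your algorithm: a pinhole sensor, a static scene, and a purely sideways sensor translation with no rotation. In that case a point at depth Z shifts in the image by d = f * T / Z, so Z = f * T / d. The function name and parameters are hypothetical.

```python
import numpy as np

def depth_from_parallax(focal_px, translation_m, flow_px, min_flow_px=1e-6):
    """Estimate depth from image shift under a pure sideways translation.

    Assumes a pinhole model: shift d = f * T / Z, hence Z = f * T / d.
    Points with (almost) no shift are treated as infinitely far away.
    """
    flow = np.abs(np.atleast_1d(np.asarray(flow_px, dtype=float)))
    depth = np.full(flow.shape, np.inf)
    moving = flow > min_flow_px  # avoid dividing by a zero shift
    depth[moving] = focal_px * translation_m / flow[moving]
    return depth

# Sensor moved 2 cm sideways; three features shifted by 10, 5 and 0 pixels.
depths = depth_from_parallax(focal_px=500.0, translation_m=0.02, flow_px=[10.0, 5.0, 0.0])
print(depths)
```

With these numbers the nearer feature comes out at about 1 m, the next at about 2 m, and the feature that did not shift is treated as infinitely distant. The point is only that a single LM with access to its own movement and the resulting feature displacement has, in principle, enough information to recover depth.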
As for your speculation about where this computation happens in the brain, I would not target the extra sublayers in V1 L4 if we want to extend this capability to mice, which don’t have this layer subdivision but still have many depth-selective neurons in V1 L2/3 (a recent reference that you probably already know about: A depth map of visual space in the primary visual cortex). Said differently, the depth information could be a product of the canonical LM algorithm itself. That being said, maybe the extra sublayers in L4 can enhance this process, but it would be nicer if they were not strictly required.
Personally, I feel that recognizing complex objects (like the shape of a coffee cup) in a rotation- & scale-invariant way at the level of a single LM is a strong bet. On my side, I have no clue how a cortical column could achieve this by itself (in fact, this is the main reason why I am so keen on relying on inter-LM interactions with “less capable” LMs to recognize rotation- & scale-invariant cup-like objects). Still, I am curious to see where this path leads.
You mention some hypotheses about how it could be implemented in the brain. I currently have different speculations for the L6b modulatory feedback and for a grid-cell-like phase coding mechanism in the neocortex:
- Thalamo-cortical projections to L4 convey information about upcoming motor commands (either explicit, like “contract a given neck muscle”, or implicit, when encoded as desired goals / expected outcomes like “turn the head 30° left”; the thalamus gets those signals from other cortical areas via their L5 PT projections, from the cerebellum, or from other subcortical motor nuclei like the superior colliculus) and, for first-order thalamic nuclei, also sensory stimuli. The L6 cortico-thalamic feedback projections of a cortical column dynamically adapt the gain of those motor-command-related thalamo-cortical projections in order to keep its reference frame in sync with the upcoming changes. When the reference frame is in sync with the upcoming changes (the prediction is accurate), the thalamo-cortical activity is gated; if there is a difference, only the delta is transmitted to the cortex.
- The grid cell phase precession phenomenon (where grid cells fire at progressively earlier phases of the local theta rhythm as an animal moves through the spatial field of the grid cell) is something that could be at play in neocortical columns as well: imagine an LM that sequentially outputs 4 object IDs in each cycle (4 gamma periods inside an alpha period for the cortex, instead of 6-7 gamma periods inside a theta period for the medial entorhinal cortex, where grid cells are). Those 4 sequential object IDs could represent dynamic trajectories of the past, present and future of the matched object. There is already some evidence of this in the PFC, but I haven’t found any such evidence for other cortical areas yet (wrong speculation, or maybe not fully tested yet by experimentalists?).
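To make the gating idea in the first point concrete, here is a toy sketch. It is only an illustration of my speculation, not a model of real thalamic circuitry and not part of your algorithm: the function name, the vector encoding of motor commands, and the tolerance are all assumptions of mine. The column's prediction of the upcoming movement is subtracted from the incoming motor-related signal; components that match are gated, and only the residual is passed on.

```python
import numpy as np

def gate_thalamic_input(motor_signal, predicted_signal, tolerance=1e-3):
    """Return what is relayed to L4 in this toy model.

    Components where the column's prediction matches the incoming
    motor-related signal (within `tolerance`) are gated to zero;
    elsewhere only the prediction error (the delta) is transmitted.
    """
    motor = np.asarray(motor_signal, dtype=float)
    predicted = np.asarray(predicted_signal, dtype=float)
    delta = motor - predicted
    return np.where(np.abs(delta) <= tolerance, 0.0, delta)

# Hypothetical command "turn the head 30 degrees left", encoded as (yaw, pitch).
command = [30.0, 0.0]
print(gate_thalamic_input(command, [30.0, 0.0]))  # prediction in sync: nothing relayed
print(gate_thalamic_input(command, [25.0, 0.0]))  # 5 degrees of yaw unaccounted for
```

In the first call the reference frame is already in sync, so the relay is silent; in the second, only the 5° mismatch in yaw reaches the cortex.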
I am not sure I understand what you mean by a grid-cell-like mechanism in L6. For me, the analogy between the mEC and the neocortex is as follows: grid cells (primarily located in L2 of mEC) directly represent object IDs, similar to how neurons in L2/3 of an LM represent object IDs. The “object IDs” of mEC represent allocentric locations, whereas the “object IDs” of other cortical areas represent objects. Maybe I should use the term “concept ID” instead of “object ID” to make it clearer. L2/3 concept IDs related to allocentric, egocentric, and arm-centric locations (computed in the temporal and parietal lobes) are then used by other LMs in their deep layers as cues for their reference frame computations. I know that mEC is an evolutionarily ancient cortex that differs from the neocortex, but I hypothesize that the main framework is still at play here (if you don’t agree with this, where would you draw the line in this “cortical continuum” from mesocortex to neocortex?). Also, I am keen on seeing an analogy between the potential “phase pinwheels” of grid cells in mEC and the orientation pinwheels in primate V1, but this is very speculative and I am going off-topic here.
Generally speaking, I am very interested in your understanding and vision of how the biological cortex works at a macro & micro level and the biological evidence that supports it. I know it is a lot to ask (obviously!), but I hope that we can have some insightful discussions about it here. Thanks again for doing this research in the open!