2025/07 - Flash Inference and Voting

@jhawkins leads a discussion on the problem of flash inference, how voting partly solves this, and the role of regional sensor locality.

00:00 Introduction
00:08 Saccades, Micro-saccades, Drift
06:57 The Problem of Flash Inference
09:40 Voting for Flash Inference
22:13 Fully / Partially Independent Assumptions
38:17 Agents and Sensors
45:03 Sensor Locality
53:07 Sending Movements vs Locations in Monty

2 Likes

At the risk of sounding naive, I have to ask: doesn’t the cerebellum serve as a kind of global registry function? Not in the sense of a literal central repository (I agree that it’s a network function), but in that it provides the when and how, which the cortex can then later use to derive the where across more distant cortical areas?

That’s an interesting perspective. If I understand you correctly, are you suggesting that the cerebellum helps coordinate across distant cortical columns, e.g., by sharing location information and supporting some form of voting on object ids? Could you clarify this a bit further, perhaps with a concrete example of what you mean by the “when” and “how,” and how the “where” would be derived from it?

1 Like

The cerebellum is essentially a feedforward control system, mostly responsible for time-critical planning (e.g. motor commands for the path integration of saccades). It doesn’t directly connect back to the columns, but it does send signals through the thalamus, which indirectly loops back to the neocortex, although the interactions are complex and not well understood. Perhaps it could be theorized that it broadcasts back a movement prediction vector / efference copy or synchronization signal?

1 Like

That makes sense — thank you much :smiling_face:

If we think of those thalamic relays as broadcasting prediction vectors and synchronization signals, couldn’t that provide the scaffolding other regions (say parietal or hippocampal circuits) need to stabilize ego-to-allocentric transforms? I’m wondering if the cerebellum’s role is less about encoding ‘where’ directly, and more about ensuring that cortical maps have the right temporal and dynamic grounding to infer ‘where’ themselves. My brain keeps wanting to compare it to how GPS works, but that’s probably a bad analogy.

@rmounir

It’s been a minute since I really dug into the research. Let me double check myself, and I’ll come back with more concrete definitions and an example for you.

Sorry for the delay on this. Having rewatched the video now, I think I may have over-focused on the ego-to-allo transform aspects discussed in the latter half of the meeting. Instead, it seems like you guys are more focused on the flash inference aspects.

To my understanding, the brain does this essentially through a feedforward sweep. E.g., during flash inference, each column:

  1. Receives its initial input via layer 4
  2. Rapidly integrates it through fast-spiking interneurons / pyramidal cells
  3. Syncs the spike timing via gamma oscillations
  4. Shoots the results forward to the next columns up the hierarchy (e.g., LGN > V1 > V2 > IT > etc.)
  5. Thalamic inputs help ensure that all columns fire in a coherent manner so that forward sweep doesn’t become fragmented.

In my head, I’m kind of visualizing cortical columns as relay runners. Information is their baton. The sensory thalamus (e.g., LGN in vision) hands the baton (input) to layer 4, interneurons make sure all the runners are in step, then layers 2/3 carry it forward to the next column. The pulvinar and cerebellum make sure all the runners in different lanes begin their stride at the same time, such that the batons continue moving smoothly up along the track.

This is probably a terrible explanation, but it’s what I have :stuck_out_tongue:

@rmounir

To answer you more concretely, I’m defining when as ‘phase alignment’ and how as ‘forward models of sensory-motor transformations’.

For example, the cerebellum helps ensure that signals across different modalities arrive within the right temporal windows. This helps prevent the misbinding of features when those inputs occur milliseconds apart. This is the when function I was referring to.

As to the how: The cerebellum makes predictions about how sensory input should change over time (e.g., during saccades or when reaching for something). This prediction stabilizes perceptions such that columns don’t need to re-learn an object after every little shift in perception.

Tl;dr: the cerebellum knows the timing differences between different sensory inputs, and is able to forward-model potential future states.

To explain the where derivation, let’s imagine you’re standing in the woods and a bird chirps. Your left ear receives that input some 200 microseconds earlier than your right ear. Already your brain knows that the bird is somewhere on your left side.

In response, you turn your head to the left (let’s say by 10 degrees). Now, if the bird chirps again (provided it didn’t move), the interaural time difference should shrink. The chirp should now arrive at both ears at roughly the same time. This is an expected transformation, i.e. a prediction.

We can then calculate the spatial location of an object (in our case, the bird) by comparing its predicted timing to its actual timing. For example, if the time difference of the chirp does in fact shrink, that confirms the bird’s location as being X degrees to your left at Y distance.
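To make the bird example a bit more concrete, here is a minimal sketch of the predict-then-compare step. It assumes a simple far-field sine model of the interaural time difference, ITD = (d / c) · sin(azimuth), with an assumed ear separation of 0.18 m and a speed of sound of 343 m/s. The function names and the sign convention (positive = to the left) are hypothetical, chosen only for illustration. This model recovers bearing only, not distance, and it ignores front/back ambiguity.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, air at roughly 20 °C (assumption)
EAR_SEPARATION = 0.18   # m, assumed effective distance between the ears


def itd_from_azimuth(azimuth_deg):
    """ITD in seconds for a source at the given azimuth.

    Positive azimuth = source to the left; positive ITD = left ear leads.
    """
    return (EAR_SEPARATION / SPEED_OF_SOUND) * math.sin(math.radians(azimuth_deg))


def azimuth_from_itd(itd_s):
    """Invert the sine model; assumes the source is in the frontal hemifield."""
    ratio = itd_s * SPEED_OF_SOUND / EAR_SEPARATION
    ratio = max(-1.0, min(1.0, ratio))  # clamp against measurement noise
    return math.degrees(math.asin(ratio))


def predicted_itd_after_turn(itd_s, turn_left_deg):
    """Forward model: the ITD we expect if the head turns left and the source stays put."""
    return itd_from_azimuth(azimuth_from_itd(itd_s) - turn_left_deg)


first_itd = 200e-6                       # left ear leads by 200 microseconds
bearing = azimuth_from_itd(first_itd)    # estimated bearing before the head turn
predicted = predicted_itd_after_turn(first_itd, 10.0)

# After the second chirp, compare the observed ITD with the prediction.
observed = predicted                     # stand-in value: the bird did not move
prediction_error = observed - predicted  # near zero supports the bearing estimate
```

Under these assumptions, a 200 microsecond lead corresponds to a bearing a little over 20 degrees to the left, so a 10-degree turn shrinks the time difference without fully cancelling it. A large prediction error on the second chirp would suggest that the bird moved or that the initial bearing estimate was off.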

Another analogy might be like using GPS satellites. The timing difference between the signalling of two satellites gives you a very rough bearing. If you move a little, you can compare the timing shifts with expected geometry, and use that to triangulate a more precise fix. This becomes more accurate with additional satellites. The sensor agents of the human body are like these satellites.

Alrighty, I think that’s everything. I do want to circle back and say I think I made the mistake of hyper-focusing on a very specific part of the video, which, while useful for system coherency and navigation, I don’t think really addresses the main point: flash inference. I don’t want to distract too much from your guys’ main target there.

Also, I wanted to echo @AgentRev: the cerebellum def isn’t directly generating where information, but I do think it’s contributing indirectly via thalamic relay and cerebello-cortical loops, more broadly.

Hopefully this addresses everything.

3 Likes

Thank you for the clarification and analogies @HumbleTraveller. I really appreciate the detailed breakdown. Here are some thoughts on the role of the cerebellum in coordinate transformations and flash inference.

On the “when” aspect, the idea of the cerebellum helping align signals across columns fits well with something Jeff has mentioned before. He has discussed the need for a synchronization signal that originates outside individual columns to coordinate the timing of sensory inputs. He suggested that these signals could come from the thalamus, particularly from matrix cells that project more diffusely across neighboring columns. It seems possible that cerebellar–cortical projections through the thalamus could help drive or modulate these matrix cells to maintain temporal alignment. In TBT, the relay cells in higher-order thalamic nuclei are thought to relay relative orientations of features between hierarchical regions. This suggests an interesting division of labor between temporal synchronization (through matrix cells) and spatial transformation (through relay cells). @vclay recently presented a few slides expanding on how thalamic matrix cells could synchronize learning and inference between columns. I don’t think this video is out yet but I think it will be published soon.

For the “how”, it helps to separate different sources of perceptual change and how a column handles them. When the object itself changes, we treat this as part of the object behavior model, learned as a temporal sequence of location-by-location feature changes. This information becomes part of the model (i.e., stored in the column) and is useful for many tasks, such as inferring objects from their behavior, planning and executing model-based actions, and making accurate sensory predictions. When the change results from sensor movement, the compensation likely occurs within the cortico-thalamic loop. The column infers the object’s orientation relative to the sensor and sends that information back through L6b to the thalamic relay cells, which adjust the movement-related input before it is reintegrated through L6a. This loop maintains the allocentric representation within the L6a reference frame by compensating for variations such as object orientation and scale. The cerebellum may contribute fine-grained timing and corrective adjustments, but it is probably not responsible for the larger coordinate transformations, which seem to be handled within the cortico-thalamic circuitry itself.
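As a rough illustration of the sensor-movement case, here is a minimal 2D sketch of the compensation step: a movement measured in the sensor's frame is rotated (and rescaled) by the inferred object pose before it updates the location in the object's reference frame. This is only a toy model under stated assumptions, not how Monty or the cortico-thalamic circuit actually implements it. All function and parameter names are hypothetical.

```python
import math


def rotate(vec, angle_deg):
    """Rotate a 2D vector counter-clockwise by angle_deg."""
    a = math.radians(angle_deg)
    x, y = vec
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))


def movement_in_object_frame(movement_sensor, object_orientation_deg, object_scale=1.0):
    """Express a sensor-frame movement in the object's reference frame.

    object_orientation_deg: inferred rotation of the object relative to the sensor.
    object_scale: inferred size of the object relative to the learned model.
    """
    dx, dy = rotate(movement_sensor, -object_orientation_deg)
    return (dx / object_scale, dy / object_scale)


def update_location(location, movement_sensor, object_orientation_deg, object_scale=1.0):
    """Path-integrate the compensated movement into the object-frame location."""
    dx, dy = movement_in_object_frame(movement_sensor, object_orientation_deg, object_scale)
    return (location[0] + dx, location[1] + dy)


# The object is rotated 90 degrees relative to the sensor, so a sensor
# movement "up" corresponds to a movement along the object's own x-axis.
new_location = update_location((0.0, 0.0), (0.0, 1.0), object_orientation_deg=90.0)
```

In this sketch the stored model never changes when the object is rotated or scaled. Only the incoming movement is adjusted, which is the sense in which the loop keeps the representation allocentric.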

Clinical evidence seems to support this view. Patients with cerebellar lesions often show problems with motor coordination, such as dysmetria, intention tremor, or nystagmus, rather than a loss of perceptual inference and prediction. The cerebellum appears to integrate motor and sensory feedback to smooth and stabilize actions, rather than to generate the perceptual models that predict how sensory input should change. For example, damage to the vestibulocerebellum can cause drifting gaze with corrective saccades, and lesions in the spinocerebellum affect smooth limb coordination. Yet these patients can still maintain accurate sensory predictions about object behavior and dynamics. This suggests that the cerebellum mainly refines temporal precision and error correction on the motor output side, while perceptual prediction itself remains largely a cortical function.

2 Likes

Thank you for the follow up here. These are great points! The thalamic matrix and relay cells probably do much of the heavy coordination between columns, I agree. I also agree that the coordinate transforms likely occur within the cortico-thalamic loops.

It’s an interesting thought though: if cerebellar input modulated matrix-cell synchrony, couldn’t that provide the kind of temporal scaffolding I was describing, with the relay circuits themselves then handling the spatial transforms?

I would offer some pushback on the clinical evidence, though, as there does seem to be evidence to the contrary. Here are a few studies:

  • Schmahmann (2018) – A comprehensive review of cerebellar cognitive functioning. It’s more of a book than a paper, but Dr. Schmahmann is arguably one of the leading minds in cerebellar research. You should also look up ‘Schmahmann’s syndrome.’
  • Ivry & Keele (1989) – Classic empirical evidence of perceptual (non-motor) timing deficits. The main point of interest here is impaired temporal discrimination in tonal perceptions.
  • Sokolov et al. (2017) – A conceptual expansion of cerebellar predictive models extending beyond motor control. Basically, it suggests that cerebellar timing mechanisms influence things like linguistic comprehension and social cognition.

To be fair, many of these deficits seem to affect timing, prediction, and higher-order cognitive domains rather than low-level inference. Even so, they demonstrate cerebellar functioning extending well beyond pure motor coordination. And in the end, isn’t all cognition a kind of movement?

3 Likes

Yeah, that’s an interesting thought. It’s possible that the cerebellar–cortical loops contribute to temporal synchronization through the thalamic matrix cells. One distinction that might help clarify things is that the cerebellum may be more involved in synchronizing the timing of motor outputs between columns, rather than in synchronizing the timing of sensory or feature inputs.

The matrix cells seem well positioned for this kind of coordination. Their projections innervate layer 1 interneurons, which influence both the apical dendrites of L3 pyramidal neurons (involved in perception) and L5 neurons (involved in motor output). This makes it plausible that matrix-cell activity helps align timing across neighboring columns, as Jeff has previously suggested. The cerebellum could be modulating motor output timing through the matrix cell → L1 → L5 pathway, while a model-based signal from the columns coordinates sensory prediction and adjusts the tempo of a sequence depending on whether a sensory feature arrives earlier or later than expected. This is all speculative of course :slight_smile:

Re. the lesion evidence, those studies sound fascinating. I wasn’t aware of Schmahmann’s work in that scope, so I’ll definitely look into those references. Thanks for sharing them.

In practice, there’s no cognition without movement, since every cortical column has a motor output signal, and learning depends on active exploration to build structured representations. Still, at the circuit level, it might help to separate where different computations occur. Movement generation and control are primarily in L5 intrinsic bursting neurons, while perception and inference happen mostly in the superficial layers. So if cerebellar lesions mainly disrupt motor coordination, that might point to effects concentrated in L5-related computations and the cortico-cortical bidirectional association pathways in L5. But I agree that impairments in motor output would inevitably cascade into learning difficulties, since movement is part of all forms of learning and perception.

1 Like