@nleadholm presents Monty’s action policies, design, types, and applications. He explores utility policies for initializing agents, input-driven reflexive policies, and learning-module-driven policies for model-based actions. He also highlights future work targeting modular multi-agent systems, improving exploration efficiency, and advancing hierarchical learning to manage compositional objects effectively.
Slowly working my way through these vids.
Regarding Jeff’s comment at the 37-minute mark, it could be that those extra V1 layers are performing a kind of binocular-disparity compensation to gauge depth (as he had suggested).
I was recently reading an interesting dissertation on Reference Frames in Human Sensory, Motor and Cognitive Processing, and they had actually built this into their system to assist with visually guided reaching. (You can find the section somewhere around pages 80-81.)
But anyway, the relevant bit is that they did this by coordinating stereo-pair mappings (one retinotopic mapping from each “eye”), performing binocular-disparity compensation on them, and then generating luminance profiling from those maps. They called this new, combined reference frame a “cyclopean map.”
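Very roughly, and with made-up names (this isn’t their code, just how I pictured the pipeline; the disparity estimate is assumed to come from some standard stereo matcher), something like:

```python
import numpy as np

def cyclopean_map(left_img, right_img, disparity, focal_px, baseline_m):
    """Toy sketch of the 'cyclopean map' idea as I read it.

    left_img / right_img: retinotopic luminance maps (H x W), one per eye.
    disparity: per-pixel horizontal disparity in pixels (H x W), assumed
               precomputed by an ordinary stereo matcher.
    Returns a head-centered depth map plus a combined luminance profile.
    """
    # Binocular-disparity compensation via the standard pinhole relation:
    # depth = focal_length * baseline / disparity
    eps = 1e-6
    depth = focal_px * baseline_m / np.maximum(disparity, eps)

    # Luminance profiling on the combined ("cyclopean") frame; here just
    # the mean of the two retinotopic maps as a stand-in.
    luminance = 0.5 * (left_img + right_img)
    return depth, luminance
```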
They then used this to represent target and end-effector positioning (similar to your body RF → model RF transforms, I think) and to drive motor function towards the target. Not sure if this will prove useful to you guys at all, but it seemed relevant and, if nothing else, was interesting.
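And just to make the reaching part concrete, here’s a minimal sketch of what I took that to mean; the names, the 4x4 transform, and the proportional step are my own guesses, and the analogy to Monty’s body RF → model RF transforms is mine, not theirs:

```python
import numpy as np

def step_toward_target(effector_xyz, target_xyz, head_to_body, gain=0.2):
    """Toy visual-servoing step in the spirit of their reaching model.

    effector_xyz, target_xyz: 3D positions in the cyclopean/head frame
        (e.g. pixel coordinates plus the depth map above).
    head_to_body: 4x4 homogeneous transform from the head frame to the
        body frame (loosely analogous to a body RF -> model RF transform).
    Returns a small displacement for the effector, expressed in the body RF.
    """
    def to_body(p):
        # Homogeneous transform of a 3D point into the body frame.
        return (head_to_body @ np.append(p, 1.0))[:3]

    error = to_body(target_xyz) - to_body(effector_xyz)
    return gain * error  # move a fraction of the remaining distance each step
```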