2025/01 - Depth Perception from Parallax

@nleadholm presents a broad summary of the connection between parallax, binocular disparity, and depth perception, including how fine-grained depth from disparity is combined with coarser sensorimotor depth cues. Hojae gives a more detailed description of Structure from Motion, a machine-vision technique for extracting depth from multiple images.


Just as a point of clarification: in the photogrammetry examples drawn from Flickr images, the GPS data was too coarse to precisely determine the location of each camera. That data may have been used to gather all the images near the monument, and as an initial guess for their locations, but the precise placement of the cameras was inferred by the photogrammetry algorithm itself.

Once unique features have been identified in each image, the angular separation between them can be used to determine the viewing angles very precisely (similar to the process Niels described for fine-tuning depth via parallax, but in three dimensions).
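
To make that concrete, here is a minimal two-view sketch in Python with OpenCV (not the full multi-image pipeline used on the Flickr collections): match features between two overlapping photos, then recover the relative camera rotation and translation direction from the epipolar geometry. The intrinsics matrix `K`, the choice of ORB features, and all names are illustrative assumptions.

```python
import cv2
import numpy as np

def relative_pose_from_two_views(img1, img2, K):
    """Estimate the relative camera pose from two overlapping photos.

    img1, img2: grayscale uint8 images; K: 3x3 camera intrinsics matrix.
    Returns a rotation R and a unit-length translation direction t; the
    absolute scale is not recoverable from two views alone.
    """
    # Detect and describe local features in each image (ORB is a free
    # alternative; real photogrammetry pipelines typically use SIFT).
    orb = cv2.ORB_create(nfeatures=5000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)

    # Match descriptors between the two images.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    # The essential matrix encodes the epipolar geometry between the two
    # views; RANSAC rejects mismatched features.
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)

    # Decompose E into the camera rotation and translation direction.
    _, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    return R, t
```

With the camera poses in hand, the matched features can then be triangulated into 3D points, which is where the "precise placement inferred from the algorithm itself" comes from.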


I haven’t heard much discussion about inferring depth from focus. One cool feature is that I only need one vision sensor for this.

I can cover an eye and alter focus to a variety of different targets, then use the information about what's in focus to infer distance. It might not give me millimeter precision, but I think that at 0.5 m from my eye I could probably get ~cm precision. Clearly this deteriorates as the distance goes to 5, 10, 20 meters, where a much wider range of things will stay in focus. But there is a wide range of important cases where focus can help identify depth.
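
As a rough sanity check on those numbers, here is a back-of-the-envelope depth-of-field sketch for a simplified thin-lens eye. The focal length (~17 mm), pupil diameter (~4 mm), and blur-circle threshold are all crude assumptions standing in for real physiology, but the trend is the point: the acceptably sharp range spans a few centimetres when focused at 0.5 m and balloons to metres (or to everything beyond ~8 m) at 5-20 m.

```python
def depth_of_field(d, f=0.017, pupil=0.004, blur_circle=5e-6):
    """Near/far limits of acceptable focus for an object at distance d (m),
    using the standard thin-lens depth-of-field approximations.
    f: focal length, pupil: aperture diameter, blur_circle: acceptable blur
    diameter on the retina (all in metres, all rough guesses)."""
    hyperfocal = f * pupil / blur_circle + f
    near = d * (hyperfocal - f) / (hyperfocal + d - 2 * f)
    far = d * (hyperfocal - f) / (hyperfocal - d) if d < hyperfocal else float("inf")
    return near, far

for d in [0.5, 5.0, 20.0]:
    near, far = depth_of_field(d)
    print(f"focused at {d:4.1f} m -> acceptably sharp from {near:.2f} m to {far:.2f} m")
```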

The perceived visual input alone only tells me whether something is in focus or not (as opposed to returning a relative distance for every “pixel” in a single observation). Combine that with my own knowledge of how I have altered the shape of my lens (accommodation), and I can get an absolute depth inference for the parts of the visual input that are in focus. Then, as my eye moves around a scene and refocuses, I can establish and remember the different distances of persistent objects in the scene around me.
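
Here is a minimal sketch of that two-part inference, assuming a relaxed eye focused at infinity so that the accommodation command (in diopters) directly gives the distance of the focal plane. The variance-of-Laplacian score is just one common focus measure from the machine-vision literature, not a claim about what the retina computes.

```python
import numpy as np

def sharpness(patch):
    """Variance of a Laplacian filter over an image patch; higher values
    suggest the patch is in focus (wrap-around at the edges is ignored)."""
    lap = (np.roll(patch, 1, 0) + np.roll(patch, -1, 0)
           + np.roll(patch, 1, 1) + np.roll(patch, -1, 1) - 4 * patch)
    return lap.var()

def depth_of_focal_plane(accommodation_diopters):
    """With a relaxed eye focused at infinity, adding A diopters of
    accommodation pulls the plane of focus in to 1/A metres, so an efference
    copy of the accommodation command gives an absolute depth for whatever
    currently scores as sharp."""
    return np.inf if accommodation_diopters <= 0 else 1.0 / accommodation_diopters

# A crisp (high-frequency) patch scores higher than a box-blurred copy of it.
rng = np.random.default_rng(0)
crisp = rng.random((32, 32))
blurry = sum(np.roll(np.roll(crisp, i, 0), j, 1)
             for i in (-1, 0, 1) for j in (-1, 0, 1)) / 9
print(sharpness(crisp) > sharpness(blurry))   # True

print(depth_of_focal_plane(2.0))   # 2 D of accommodation -> focal plane at 0.5 m
print(depth_of_focal_plane(0.1))   # 0.1 D -> focal plane at 10 m
```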

Does this fit with any known neuroscience research, or am I extrapolating too much from my subjective experience? :laughing:


Yes, :100: agree @carver! I think something like this could be really interesting. The discussion got a little off track, but this is along the lines of what I was trying to cover around minutes 40:00-48:00 above, i.e. the power of an iterative loop between choosing how to act in the world (focusing to a particular depth) and how that impacts the sensory observation. It's a nice pairing because it fits so well with Monty as a sensorimotor system, and, as you say, it could work with one eye. I don't remember how much there is in the literature on this (a lot of it focuses on disparity), but clearly the brain can use accommodation as a signal for depth to some degree.
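
For what it's worth, a toy version of that loop might look like the following (the sharpness falloff and the patch depths are invented, and none of this is Monty code): sweep the accommodation "action", observe which patches respond with high sharpness, and assign them the depth of the current focal plane.

```python
import numpy as np

# Ground-truth depths (metres) of a few patches in a toy scene.
true_depths = np.array([0.4, 0.8, 1.5, 3.0, 10.0])

def observed_sharpness(accommodation, depths):
    """Toy observation model: sharpness falls off with defocus, measured as
    the dioptric distance between the focal plane and each patch. This stands
    in for whatever blur signal the retina actually provides."""
    defocus = np.abs(accommodation - 1.0 / depths)      # diopters
    return 1.0 / (1.0 + (defocus / 0.25) ** 2)          # arbitrary falloff

# Sensorimotor loop: pick an accommodation setting (the action), observe which
# patches look sharp, and remember the depth implied by the sharpest setting.
settings = np.linspace(0.1, 3.0, 60)                    # 10 m .. ~0.33 m
best_sharpness = np.zeros_like(true_depths)
estimated_depths = np.full_like(true_depths, np.inf)

for accommodation in settings:
    sharp = observed_sharpness(accommodation, true_depths)
    improved = sharp > best_sharpness
    best_sharpness[improved] = sharp[improved]
    estimated_depths[improved] = 1.0 / accommodation    # depth of the focal plane

print("true depths:     ", true_depths)
print("estimated depths:", np.round(estimated_depths, 2))
```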

Unsurprisingly, it would need to be integrated with other signals. For example, my understanding is that VR headset lenses essentially project the images ~2 m in front of your eyes, and other cues like vergence and parallax are then used to give the impression of objects that are much closer or farther away. At all times, however, I believe your ocular lens is actually focused at that 2 m plane. So in this case accommodation isn't giving a great perceptual signal, but from a quick look online, this “vergence-accommodation conflict” (VAC) is actually a significant cause of eye strain when wearing headsets. This supports the idea that the visual system does make use of accommodation, even if it is not a dominant signal.
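
A quick arithmetic illustration of that conflict, assuming a 63 mm interpupillary distance and a fixed 2 m focal plane (both made-up round numbers): vergence keeps tracking the virtual object's distance, while the accommodation actually satisfied by the optics stays pinned at 0.5 D.

```python
import numpy as np

ipd = 0.063          # interpupillary distance in metres (assumed)
focal_plane = 2.0    # headset focal plane in metres (assumed)

for virtual_depth in [0.3, 0.5, 1.0, 2.0, 5.0]:
    # Vergence angle the rendered disparity asks the eyes to adopt.
    vergence_deg = np.degrees(2 * np.arctan(ipd / 2 / virtual_depth))
    # Accommodation the scene "asks for" vs. what the fixed optics provide.
    demanded = 1.0 / virtual_depth
    provided = 1.0 / focal_plane
    print(f"{virtual_depth:4.1f} m: vergence {vergence_deg:5.2f} deg, "
          f"accommodation conflict {demanded - provided:+.2f} D")
```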
