Point clouds for potential movements and maybe potential distortions too?

Since the brain is a one-trick pony, behaviors have to be objects that are located in space. Distortions probably are too.

Density of the point cloud would indicate the probability of motion in that direction for all the detected locations on the object, the motions required to achieve a goal in that direction, and possibly potential distortions; probably other things as well. Throwing a ball seems to require unconsciously calculating the arc it will take, and I’ve heard someone claim they could bat so well because they did the math in their head every time. :woman_shrugging: “Plausible explanation”. But the most likely motions that cause the ball to move in a way that leads it to a target have already been learned and can quickly be applied; otherwise we would play catch like a toddler for our entire life. Essentially, every sampled location would have an associated density map of motions that are possible, and a map/graph of the motions to make those happen. Those are each combined and joined to the model of the object at the location sensed. Movements are commands to the motor system to perform patterns of movement it has learned, not step-by-step instructions. There’s no time for that.
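To make the idea concrete, here is a minimal sketch of “every sampled location has an associated density map of motions”. The `LocationMotionMap` class, the direction bins, and the numbers are entirely my own illustration (nothing from Monty): observed motions are tallied per direction bin, and the counts become a probability density over likely motions at that location.

```python
import numpy as np

N_DIRECTION_BINS = 8  # e.g., 8 compass-like directions around a sampled point

class LocationMotionMap:
    """Hypothetical per-location record of how often motion was
    observed in each direction bin at this spot on the object."""

    def __init__(self):
        # Raw counts of observed motions per direction bin.
        self.counts = np.zeros(N_DIRECTION_BINS)

    def observe(self, direction_bin):
        """Record one observed motion toward a direction bin."""
        self.counts[direction_bin] += 1

    def density(self):
        """Normalized probability of motion per direction."""
        total = self.counts.sum()
        if total == 0:
            # No observations yet: fall back to a uniform prior.
            return np.full(N_DIRECTION_BINS, 1.0 / N_DIRECTION_BINS)
        return self.counts / total

    def most_likely_motion(self):
        """Direction bin with the highest learned probability."""
        return int(np.argmax(self.density()))

# A "point cloud for potential movements" is then just a set of
# (location, LocationMotionMap) pairs attached to the object model.
loc = LocationMotionMap()
for _ in range(5):
    loc.observe(2)   # motion mostly seen toward bin 2
loc.observe(6)       # occasionally toward bin 6
print(loc.most_likely_motion())  # -> 2
```

Denser bins then play the role described above: they mark the motions most likely to be applied quickly, without recomputing anything step by step.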

Prediction is absolutely required for movement: one system predicts an object by its shape, and another predicts where it will move next, along with other behaviors like color changes. Differences in cup color as distortions of a global model of a cup seem more likely to be handled exactly like the cup, because Nature has a very small R&D budget.

1 Like

With a suitable reference frame (e.g., camera orientation, position, range, settings) and some calculations, an RGBD “image” could be turned into something like a “colored point cloud”. I’m not at all sure this is the most efficient way to handle 3D pixels, but I wonder how this might fit into your approach.
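For concreteness, the RGBD → “colored point cloud” conversion mentioned above is a standard pinhole back-projection. This is a minimal sketch under assumed intrinsics (the function name and parameters are my own illustration, not Monty’s API):

```python
import numpy as np

def rgbd_to_colored_point_cloud(rgb, depth, fx, fy, cx, cy):
    """Back-project an RGB-D image into an (N, 6) array of
    [x, y, z, r, g, b] points using a pinhole camera model.

    rgb:   (H, W, 3) color image
    depth: (H, W) depth in meters along the camera z-axis
    fx, fy: focal lengths in pixels; cx, cy: principal point
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # per-pixel coordinates
    z = depth
    x = (u - cx) * z / fx        # standard pinhole back-projection
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    colors = rgb.reshape(-1, 3)
    valid = points[:, 2] > 0     # drop pixels with no depth reading
    return np.hstack([points[valid], colors[valid]])

# Toy example: a 2x2 image with one valid depth pixel.
rgb = np.zeros((2, 2, 3), dtype=np.uint8)
depth = np.array([[0.0, 0.0], [0.0, 1.0]])
cloud = rgbd_to_colored_point_cloud(rgb, depth, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
# one valid pixel at (u=1, v=1) with depth 1m -> point (1, 1, 1)
```

This yields points in the camera’s reference frame; the “suitable reference frame” part is then one more rigid transform from camera coordinates into world coordinates.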

In any event, I just asked chatGPT to discuss Monty’s use of RGBD cameras:

There is some discussion of the use of an RGBD camera in Monty, but I’m not clear on what technology (e.g., stereo vision, IR point cloud) this would (or should) use. Discuss.

It said:

In the Monty context, “RGB-D camera” is really shorthand for a sensor that can deliver a spatially structured observation stream—not a commitment to any single physical technology. The important question isn’t which sensor, but what kind of information Monty actually needs, and how that information should be represented and consumed.

There is no single “correct” RGB-D technology for Monty.

  • Stereo vision aligns best with biological plausibility and learning-first design.
  • Active depth sensing is useful for engineering pragmatism and bootstrapping.
  • The key design choice is not the sensor, but whether depth is treated as given or inferred.

If you’re thinking in Monty terms, the most faithful answer is:

Monty shouldn’t depend on RGB-D; it should learn to construct depth from whatever structured sensory evidence it can obtain.

I then asked (in line with my original question):

With a suitable reference frame (e.g., camera orientation, position, range, settings) and some calculations, an RGBD “image” could be turned into something like a “colored point cloud”. I’m not at all sure this is the most efficient way to handle 3D pixels, but I wonder how this might fit into your approach.

and got:

Yes — what you’re describing is exactly the right mental pivot, and it gets to the heart of why “RGB-D” is more of an intermediate artifact than an end representation.

  • Yes, RGB-D → colored point cloud is straightforward and often useful.
  • But point clouds are not an efficient or biologically plausible long-term representation.
  • In a Monty-style system, point clouds are a transient computational tool, not the core representation.
  • The real goal is extracting stable spatial structure and predictive relationships, not storing geometry.

If you want, we can go next into:

  • How to represent “surface fragments” instead of points
  • How Monty might use egomotion to eliminate the need for dense depth
  • Or how this compares to NeRFs, implicit fields, and SLAM approaches

Just say which direction you’d like to go.

1 Like

I then asked chatGPT to examine the current discussion thread:

Can you inspect and react to a Monty forum thread (e.g., https://thousandbrains.discourse.group/t/point-clouds-for-potential-movements-and-maybe-potential-distortions-too)?

Turns out, it could and did:

I opened and reviewed the Monty/Thousand Brains Project forum thread you linked: “Point clouds for potential movements and maybe potential distortions too?” — here’s a clear summary and reaction to it: …

(ducks)

1 Like

In the brain, at the low level, it’s “what’s the most likely location for this point when we check next?” There’s no geometry.

Agreed. Then again, anything involving poses, reference frames, and such would seem to have some geometry (at least implicitly) involved. How the brain handles this seems to be a bit of an open question…
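For what it’s worth, even minimal pose bookkeeping carries geometry implicitly: a pose is just a rotation plus a translation, and “where will this point be next” is one application of it. A small illustrative sketch (the function and the numbers are mine, not Monty’s):

```python
import numpy as np

def apply_pose(R, t, point):
    """Express a point in a new reference frame given a pose:
    rotation matrix R followed by translation vector t."""
    return R @ point + t

# Example pose: 90-degree rotation about z, then a shift along x.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([1.0, 0.0, 0.0])
p = np.array([1.0, 0.0, 0.0])
print(apply_pose(R, t, p))  # -> approximately [1., 1., 0.]
```

Whether the brain represents anything like R and t explicitly, or only something functionally equivalent, is exactly the open question.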

I’ve wondered a bit about the amount of math that the LMs currently do. Perhaps it reflects the current attempt to model an entire cortical column (with scads of neurons and such) in a single process. If so, this might change as Monty’s design and implementation evolve.

1 Like

Brains process space, so geometry has to be there; or rather, geometry emerges from processing space in a much different way than we expect. Musicians have a problem where the best performers are terrible teachers. It’s ironically because they don’t know what they’re doing. It comes so easily to them that they can’t pass it on to someone else. Thinking works like that. It comes so easily to us that we have no idea what’s going on. The entire field of Philosophy uses language to describe what it feels like to be conscious and trying to make sense of things, but it can’t tell us how that actually happens, or even where to look. The “hard problem of consciousness” is getting over how special it feels to have it. We feel complex, and we assume there has to be complexity behind that feeling. Then we look at the brain, and it’s a mess: not gloriously complex, just gooey and repetitive, so consciousness had to be somewhere else.

But, “There is no ghost, only machine”. All those mistaken ideas we’ve had over thousands of years are still going strong, and where we direct our attention, and the ideas we get by extending those ideas, are influenced by that cultural burden without us realizing it. Complex objects are built out of simple objects, and we have very few good simple objects to pass on to others that they can use to build their own model and associate it into their world model. What we are doing here is something that hasn’t had to be done at this level for a long time. We are trying to imagine the unimaginable, and create bite-size pieces we can use to pass this on to other people so they can reconstruct the complex model and imagine it with us. It’s how Physics does it. It takes years to learn the languages needed to understand and apply Physics - Calculus is the main one. It’s how the brain does it. We don’t have the language for this model yet, and we’re forced to use a language that describes our mistakes to get started, because that’s what we have to work with. Psychology wasn’t a real science until the 90s. That’s the hold our fantasies about how minds and brains work have on the language we use as a culture. Culture is spread and maintained through language. It’s a difficult problem. When the brain lacks knowledge it falls back on intuition, and reality is non-intuitive. If the knowledge we feel we have is garbage and random trivia and we don’t know it, our intuition is catastrophically bad - that’s Dunning-Kruger in a nutshell, and why no one is immune to it. That’s why people who are considered “AI literate” are experiencing the worst of the psychological effects. The model they have is a fantasy given to them “on good authority” and based on the cultural ideas we have about minds, etc. Actions taken because of fantasies about how reality works are always disastrous.
We’re not yet building robots; we’re building the knowledge needed for talking about building robots, while clearing out a bunch of garbage we don’t need. Psychology is still struggling with that second part. Its vocabulary is still full of “transference” and other mystical nonsense.

I miss rhyolight (Matt). We used to have these sorts of discussions on Twitter when it was still awesome. The strongly held cultural belief that the brain is too mysterious to possibly understand alone holds back real research. The belief it is too complex to understand causes us to overlook the beauty of the minimalist design. Those types of ideas about reality held back Physics for centuries. Consciousness feels to us to be complete, continuous, and accurate. Imagining consciousness as focusing on part of the world while assuming the rest will be where you predicted it would be when you check while skipping around from place to place is difficult when that’s not what you’re experiencing. Now add on top that it’s happening in different modalities simultaneously, and they’re not all paying attention to the same things at the same time. For drummers, it gets wild, but it feels smoothly continuous, and funky. But that all comes from the neurons in the structure that columns are built from doing their “this one simple trick that will drive AI researchers insane”. It’s hard to imagine that the scent of perfume in the air and how aroused it makes you feel just to smell it is based on what Jeff has discovered (led there by many people before him. “If I have not seen further, it’s because giants were standing on me”), and nothing else.

We’re not trying to build that. We’re abstracting the process at a higher level, but not too high - the aerodynamics of jets and birds are not remotely the same, and in modern jets the pilot points the way while computers do the flying; without that, they are uncontrollable at speed - we are emulating the level of the cortical column in the learning modules. That’s where the illusion of geometry comes from. We’re working with and viewing the process at an abstraction level we can understand. In an actual column there are millions of neurons. That’s not something we need to model for this. Someone else can do that for fun - and you know someone will. It helps to keep in mind that the brain only has one hit song, but it sold enough copies that it’s set for life now, baby, and that trick isn’t the cortical column. That particular trick is a more complex arrangement of the one trick that makes it all possible: the neuron. In crawdads, there’s a “giant motor neuron” that fires on reflex and contracts the tail as strongly as possible. It has to be large because it sends a signal to all the nerves in the muscles at once, and it has a recharge time, so it’s like hitting boost: one quick burst and back to normal speed. The frequencies bats use for echolocation are too high to be processed at normal speed, so their brain buffers the input to slow it down, creating a minimum length of time between pings. We know they actually run into things, and especially each other, all the time, now that we have night-vision trail cams. And it’s all done with one simple trick.

When I read the HTM paper it was like a door opened. Up until then the only response I had to anything AI-related was, “nope, that’s not it.” Like Jeff, I knew that the abstraction level of the current methods was so high that it no longer resembled the original. That neurons have thousands of connections was my complaint as well. Simplicity doesn’t lead to complexity so easily that point neurons could produce it. DNA plus water, proteins, sources of materials to make proteins, and power - chemical, heat, solar - amazingly leads to living things. It’s not a simple process, but it’s made from simple things, repeated over and over into more complex things. All to create more DNA and proteins. Not because of “selfish genes”. It’s just how the process works, and without the process it’s just H2O and carbon in the sand. If it didn’t work, we wouldn’t be having this conversation. People have purpose. DNA has no purpose in life. No “Hero’s journey of Evolution.” The kinds of analogies Dawkins used are not helpful in educating the public, or in approaching the science. They confuse people because they aren’t true. Anthropomorphizing a crystalline structure we can’t possibly see is weird, and in this case it leads directly to “Evolutionary Psychology”. People are still stuck on the literal “I’m looking at something” meaning of “observation” when talking about Physics. The framing leads to conclusions. I joke that philosophy is “unvalidated ludicrous premise + well-reasoned argument = eternal truth.” But that’s actually just the way culture works. Philosophy only talks about what the results feel like. There are no qualia. All the pre-existing ideas we have need to be cleared away before they can be replaced; there’s no updating them. We are starting over.

Nothing is what it seems. Matter is empty space, fields, and particles. Who knew? Brains are nothing like we’ve imagined. Jeff is doing his best to translate the internal model he has built using a vocabulary built for other purposes, and it isn’t very adaptable because of the potential for confusion. Like the need for using MI instead of AI. “AI” carries baggage now that it cannot shed. I’m an artist, and framing has forever been what you do with paintings. Other than that, it was vague, and it turns out it was actually not a bad analogy, except I’m holding the picture frame in my hands, and there are little frames around everything in the view I point it at, and they’re moving in various ways. That’s a salvageable analogy. And it can be extended in both directions of complexity without losing “definition”. It popped into my head looking at the graphics of frames in space. “Frames” wasn’t carrying any baggage that would lead to a different conclusion. I “knew it when I saw it” and I could form an association between the two. It’s harder to describe the image I have of behaviors being the exact same thing as the object they are associated with. They are produced simultaneously with the physical object and superposed/imposed on the space the object occupies by the same compositional methods that are producing the cup from a cylinder and a handle from the simple information they’re getting. How that is done, and how we can emulate/abstract it, is an open question, but the only thing the brain has to work with is a network of neurons that navigates space. It’s humbling. We thought we were unique and special. Dogs, bears, and animals that rely on scent have a 3D map of the world by aroma that is aligned with the rest of the world. Reference frames in space all the way down.

“one trick pony” “one hit wonder” “once is all it takes” “one for the Gipper” “this one simple trick” “there’s only one way to rock, if you want to rock and roll”

We don’t have to do it the way the brain does it, but we need to understand how the brain does it, because nature already solved the problem more than once well enough to give us the ability to look into how it did it, and what we’ve built instead are nuclear powered nudie generators. That is seriously not intelligent. Men really should not be running things. Instead of household robots we get bartenders with boobs.

Yes, I’m long winded enough that I don’t need chatGPT :joy:

2 Likes