2023/03 - Speedup Discussions - Part 2

The second part of the deep dive into performance improvements in Monty. Watch the team discuss alternatives to KDTreeSearch, improving matrix-operation performance through selective hypothesis updates, memory management strategies, and the limitations of Python multithreading.

If you’d like to suggest ways to speed up any part of Monty, we’re always open to suggestions! We’re probably not at the point where we’re going to reimplement it in C/C++/Rust/Erlang/Mojo/Ada/Julia/etc… at this time. :slight_smile:

One useful general principle is ‘lazy evaluation’: deferring, or even avoiding, processing that doesn’t deliver any immediate benefit. The Haskell language is strong on this. Cybernetics should live above ‘busywork’ programming. And ‘voting’ should be able to return a ‘defer’ decision.
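To make the idea concrete, here is a minimal Python sketch (a toy, not Monty code; the scoring function is a made-up stand-in): the generator defers the per-hypothesis work until a score is actually requested, so hypotheses that are never inspected are never scored.

```python
# Toy sketch of lazy evaluation via a Python generator.
import itertools

def evidence_scores(hypotheses, observation):
    """Yield (hypothesis, score) pairs on demand rather than scoring everything up front."""
    for h in hypotheses:
        score = -abs(h - observation)  # stand-in for an expensive matching step
        yield h, score

hypotheses = range(1_000_000)  # pretend these are pose/location hypotheses
lazy = evidence_scores(hypotheses, observation=42.0)

# Only the ten scores we actually consume are ever computed.
first_ten = list(itertools.islice(lazy, 10))
print(first_ten[:3])
```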

1 Like

@vclay have you considered using the principal curvatures, or the mean and Gaussian curvatures (or the log of these), instead of pose vectors for filtering hypotheses?

Since these curvatures are rotation invariant, this would largely remove the need for the large number of rotations and matrix operations needed at the moment to identify matches.
From what I understood from the Monty code, the curvature is already derived at each point in the graph, but is only used to adjust nearest neighbour distances.

As a bonus of using the curvature, it may become practical to employ LSH (locality-sensitive hashing) with the curvatures as keys and so gain an additional speedup.
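As a rough sketch of what I have in mind (my own toy code, not Monty’s; the bucket width, log transform, and names are illustrative, and a real LSH scheme would use several randomised hash tables rather than this single quantisation table):

```python
# Toy sketch: hash rotation-invariant curvature scalars into coarse buckets so
# candidate matches can be fetched in O(1) instead of comparing against every
# stored point.
import math
from collections import defaultdict

BUCKET_WIDTH = 0.25  # resolution of the hash in log-curvature space (assumed)

def curvature_key(k1, k2):
    """Quantise the two principal curvatures into a hashable bucket (sign + log magnitude)."""
    def bucket(k):
        return (int(math.copysign(1, k)), round(math.log1p(abs(k)) / BUCKET_WIDTH))
    return bucket(k1), bucket(k2)

index = defaultdict(list)

def add_point(point_id, k1, k2):
    index[curvature_key(k1, k2)].append(point_id)

def candidate_matches(k1, k2):
    """Return stored points whose curvatures fall in the same bucket."""
    return index[curvature_key(k1, k2)]

add_point("graph_point_17", k1=0.9, k2=0.1)
print(candidate_matches(0.92, 0.11))  # ['graph_point_17'] (lands in the same bucket)
```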

One open question is how robust this proposal would be to noise.

It is interesting to note that in the somatosensory system there are neurons that are selective for curvature.

1 Like

Hi @fcred and @scidata, thanks a lot for your suggestions! One thing we are doing that helps a lot with speed is to only update the evidence of the current most likely hypotheses. Also, voting can be set up (and by default is set up) to only vote on the most likely hypotheses. Both of those give significant speedups with little impact on recognition performance and are kind of related to the lazy evaluation you mentioned (if I understand it right).
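To sketch that idea for anyone following along (simplified toy code, not the actual Monty implementation; the names and the top-k number are just for illustration):

```python
# Toy sketch: only recompute evidence for the current top-k hypotheses; the rest
# keep their old values until they become relevant again.
import numpy as np

def selective_evidence_update(evidence, update_fn, top_k=100):
    """Update evidence only for the k hypotheses with the highest current evidence.

    evidence:  1-D array, one score per (location, pose) hypothesis.
    update_fn: maps the indices of selected hypotheses to their evidence deltas.
    """
    top_idx = np.argpartition(evidence, -top_k)[-top_k:]  # O(n), no full sort
    evidence[top_idx] += update_fn(top_idx)
    return evidence

# Toy usage: random evidence, and a dummy update that favours even-indexed hypotheses.
rng = np.random.default_rng(0)
evidence = rng.random(10_000)
evidence = selective_evidence_update(
    evidence, update_fn=lambda idx: (idx % 2 == 0).astype(float), top_k=100
)
```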

@fcred I am not sure if I am missing something in what you are proposing, but I just want to clarify that the pose vectors ARE the principal curvatures. So the 3 orthonormal pose vectors = [point normal, PC1, PC2] (tbp.monty/src/tbp/monty/frameworks/models/sensor_modules.py at 4fc72e766ce7d7ab4a9cbb81f2a5463829e117d5 · thousandbrainsproject/tbp.monty · GitHub). Since the principal curvatures have a 180-degree ambiguity (we don’t know which way they point), this gives us 2 possible poses for each location on the object (see here for a bit more detail: Evidence Based Learning Module). So we don’t actually have a lot of rotations to test for each location, but in the beginning we could be anywhere on the object, and each location would come with a different rotation hypothesis (if I were on the top of a can, the hypothesised rotation of the can would be flipped 180 degrees from if I were sensing the bottom right now). As we move, we quickly refine our hypotheses so that only a few locations and orientations of the object remain possible, and we have to test a lot less. Does this make sense?
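To illustrate the “2 possible poses” point with a toy example (simplified code, not the actual Monty implementation; the frame convention and function names are just for illustration):

```python
# Toy sketch: a sensed orthonormal frame [point_normal, pc1_dir, pc2_dir] can be
# aligned to the stored frame directly, or with both curvature directions flipped
# (the 180-degree ambiguity), giving exactly two rotation hypotheses per location.
import numpy as np

def rotation_hypotheses(sensed_frame, stored_frame):
    """Return the two candidate rotations mapping the stored frame onto the sensed frame.

    Each frame is a 3x3 matrix whose rows are [point_normal, pc1_dir, pc2_dir]
    and orthonormal. A rotation R with R @ stored_row == sensed_row for every row
    is given by sensed_frame.T @ stored_frame.
    """
    flip = np.diag([1.0, -1.0, -1.0])  # keep the normal, flip both curvature directions
    r1 = sensed_frame.T @ stored_frame
    r2 = (flip @ sensed_frame).T @ stored_frame  # the 180-degree-flipped alternative
    return r1, r2

# Toy check: if the sensed frame equals the stored frame, the first hypothesis is identity.
frame = np.eye(3)
r1, r2 = rotation_hypotheses(frame, frame)
print(np.allclose(r1, np.eye(3)))  # True
```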

Best wishes,
Viviane

Did you guys ever end up entertaining Eric Collin’s idea regarding locality-based caching of the KD-Tree search?

(Found in Part 1 around the 22:30 mark, here: https://youtu.be/JVz0Km98hLo?t=1353&si=5NmAMfvNmsL1KYUE)

My intuition says that he was on the right path in his thinking there. The team just decided to go with that global lookup table idea instead (which also wasn’t a terrible approach).

I think if you ended up going with some method of leaf-node caching, however, you would essentially still be leveraging those lookup tables (with all their benefits and drawbacks), albeit stored at the local LM level.

Approached this way, you can almost view the issue of nearest-neighbor search as a sort of DNS routing problem. The cached nodes would be like local DNS entries as found on a computer (if we were to view the LMs as computer nodes on a network, that is). Then, if you needed to escape the local cache and perform a broader query/search, you could step up the KD tree’s hierarchy, similar to how a DNS resolver queries against an Authoritative nameserver > then Top-level > then Root.
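To sketch what I mean (toy code; the class, cache policy, and numbers are my own, not anything from Monty):

```python
# Toy sketch of a two-level lookup: each LM keeps a small cache of recently
# matched graph points and only escalates to the full KD-tree query on a miss,
# much like a stub resolver consulting its local cache before asking upstream.
import numpy as np
from scipy.spatial import cKDTree

class CachedNeighborLookup:
    def __init__(self, points, cache_radius=0.01, cache_size=64):
        self.tree = cKDTree(points)   # the "authoritative" structure
        self.points = points
        self.cache = []               # recently returned (index, location) pairs
        self.cache_radius = cache_radius
        self.cache_size = cache_size

    def query(self, location):
        # 1) Try the local cache first (cheap linear scan over a handful of entries).
        for idx, cached_loc in self.cache:
            if np.linalg.norm(cached_loc - location) < self.cache_radius:
                return idx
        # 2) Cache miss: escalate to the full KD-tree search.
        _, idx = self.tree.query(location)
        self.cache.append((idx, self.points[idx]))
        self.cache = self.cache[-self.cache_size:]  # keep the cache bounded
        return idx

lookup = CachedNeighborLookup(np.random.default_rng(0).random((5000, 3)))
print(lookup.query(np.array([0.5, 0.5, 0.5])))
```

Note that the cache returns an approximate neighbor when it hits, which is exactly the kind of benefit/drawback trade-off I mentioned above.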

I know some of the team had reservations about this given the somewhat stochastic nature of the sensory inputs, but this approach has been used successfully for applications similar to your own. Specifically, robotics.

As an example, here’s a paper on 6-dimensional localization and mapping techniques utilizing KD-tree local caching:

In it, the team used an approach similar to Collin’s suggestion (combined with an ICP algorithm) to map out x/y/z coordinates plus pitch/roll/yaw, enabling spatial awareness for their system. The robot had a spatial field of 181 x 256 points (so ~46,000 points), which it was able to analyze in about 3.5 seconds. It did this while being constrained to only 768MB of RAM and a Pentium Centrino 1400. So… hardware from almost 20 years ago…

Plus, if we loop this all back to being inspired by the brain, this approach isn’t too far off from concepts like hippocampal indexing theory. Here, the cortico-cortical connections are akin to the local networking paths found between computer nodes (their addressing schema cached privately), and the hippocampal areas are like a DNS server resolving against the broader network.

Though in this case, the “hostname(s)” you’d query for would be the LMs’ input patterns/hypotheses/what-have-you, and the returned “IP addresses” would be those columns’ IDs. What are your thoughts?

1 Like

Hi @vclay, thanks for the reply, but I am sorry to point out that your pose vectors are NOT the principal curvatures. Rather, the pose vectors are the normal vector and the two principal direction vectors. The principal directions point in the directions along which the principal curvatures are attained.

The principal curvatures are scalars and characterise the local shape of a 2D surface independently of any coordinate system.

Hence the basis of my suggestion.

Rather than having to rotate and compare vectors to accept or reject hypotheses, you only need to accept hypotheses with similar scalar curvatures. This would probably be much faster. Likewise for the KDTree search. Similarly for voting.
As I mentioned earlier, LSH may also be practical here. But even if not, replacing vector calculations with scalar ones should speed things up.
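As a toy sketch of what I am suggesting (illustrative code only; the tolerance and names are made up, and this is not Monty code):

```python
# Toy sketch: use the rotation-invariant curvature scalars as a cheap first-pass
# filter; only the hypotheses that survive would get the expensive rotation test.
import numpy as np

def curvature_prefilter(stored_k1, stored_k2, sensed_k1, sensed_k2, tol=0.1):
    """Boolean mask of hypotheses whose stored curvatures match the sensed ones."""
    return (np.abs(stored_k1 - sensed_k1) < tol) & (np.abs(stored_k2 - sensed_k2) < tol)

# Toy usage with 10,000 hypothesised points on the object:
rng = np.random.default_rng(1)
stored_k1, stored_k2 = rng.random(10_000), rng.random(10_000)
mask = curvature_prefilter(stored_k1, stored_k2, sensed_k1=0.3, sensed_k2=0.7)
surviving = np.flatnonzero(mask)  # only these would get the rotation/vector comparison
print(len(surviving), "of 10000 hypotheses remain")
```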

The pose vectors would only be needed to generate paths for motor policies, and to finally establish the object’s orientation in the body’s reference frame once the object has been identified.

Or am I the one missing something?

Weird. I wonder if Viviane misspoke? (I’m not seeing where the PC is contained within the pose_vectors either.) That said, isn’t the curvature packaged alongside them, within morphological_features?

Oh, I assumed that you meant the principal curvature directions, since you suggested using them instead of the pose vectors. The whole point of the pose vectors is that they are not rotation invariant and can, therefore, inform the rotation hypotheses. We do use the principal curvature magnitudes as well (extracted in the same get_principal_curvatures function of the sensor module), but we use them as a feature (tbp.monty/src/tbp/monty/frameworks/models/sensor_modules.py at 4fc72e766ce7d7ab4a9cbb81f2a5463829e117d5 · thousandbrainsproject/tbp.monty · GitHub part of the non_morphological_features dictionary in the state class). Features are rotation invariant, and the 2 PCs are treated the same way we treat color. So we use them at every step to update our evidence for the different hypotheses we have. But if we don’t have the pose vectors (i.e. point normal and PC directions), then we can’t have informed pose hypotheses. That means we would have to test way more than just 2 possible rotations at each location on the object.

Maybe a good way to frame this is that the fact that the pose vectors are not rotation invariant is deliberate. The main problem we are trying to solve is to figure out the object’s rotation to know how to rotate the sensed movement (displacements relative to the body) into the object’s reference frame. The pose vectors can help us narrow down the possible rotations very quickly, exactly because they are not rotation invariant. Does this make sense?
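As a small toy illustration of that point (not Monty code; just scipy and made-up numbers):

```python
# Toy sketch: the same sensor movement, expressed relative to the body, corresponds
# to different displacements on the object depending on the hypothesised rotation.
import numpy as np
from scipy.spatial.transform import Rotation as R

body_displacement = np.array([0.0, 0.0, 1.0])  # sensor moved 1 unit "up" in the body frame

upright = R.identity()
flipped = R.from_euler("x", 180, degrees=True)  # e.g. sensing the bottom of a can

# To express the displacement in the object's reference frame, apply the inverse
# of the hypothesised object rotation.
print(upright.inv().apply(body_displacement))  # ~[0, 0, 1]  -> moving toward the top
print(flipped.inv().apply(body_displacement))  # ~[0, 0, -1] -> moving toward the bottom
```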

@HumbleTraveller I’ll respond to your suggestion too but I need a bit more time to understand it in detail and understand what the difference is to the KD Tree search we are doing. Thanks for writing this out!

1 Like

@vclay, I understand why you are using the pose. What I neglected to say was that in my suggestion the sensed movement provides the curvature at this position. This and the position in the object’s reference frame should be enough to refine valid hypotheses without having to determine the object’s rotation.

The poses in the object reference frames of the hypothesised points will be needed to generate displacements as per the motor policies, and only here will this need to be rotated so that the motor displacement is in the body’s reference frame.

I realise that this may sound like the same number of rotations as your current scheme, but since the rotations will only take place after a lot of hypotheses have been rejected due to inconsistent curvature, far fewer rotations should be needed.

Using touch as an example, I think that the two schemes can be compared as follows:
In my scheme, the agent touches the object and thinks that it is, say, either a pen or a cup. For each, the agent uses its models to generate a displacement that would determine which, and then converts this displacement into the body’s reference frame in order to perform it.
In the current scheme, if I understand correctly, the agent determines the possible poses and their relationship to its body, and generates a displacement in its reference frame, which it then performs.

When I close my eyes and use touch, I think that I do the former.

2 Likes

My stealth startup has been building a model that’s pretty damn good at compressing multiple datapoints (such as your x, y, z, whatever attributes, …) down into a single value in a way that’s very fast. I had specifically been struggling with trying to use KD-trees and similar structures myself, and ran into all the issues around building and querying trees simultaneously, where adding anything new to a tree might kick off a total rebuild and all the associated allocations/deallocations, etc. (all the speed issues you’re running into headlong).

It’s under IP protection (it took me thousands of hours to figure out) and would need to be licensed, but it aches my heart knowing that it would solve these speed problems.

It specifically doesn’t care about the dimensionality of the inputs, and you can decide what level of lossiness you want to tolerate. It also doesn’t require matrix math and happily works with sparsity.

1 Like

Hi @Max_Lee,

Thanks for the offer, and it is exciting that you have solved many of these issues.

Part of our mission is to have this technology adopted as widely as possible, and unfortunately, needing to license third-party software would inhibit this.

For other people reading this thread - we’re doing as much as possible to make this software available to all of humanity:

  • Numenta has a non-assert pledge on all the TBP-related patents.
  • We have chosen the MIT license for the Monty codebase.
  • We have a Contributor License Agreement (CLA) signing as part of our GitHub PR process.
  • We’re working as close to 100% in public as possible. :slight_smile:
2 Likes