Jeff provides the definition of terms: Pose, Body, Sensor, Feature, Module (now called Sensor Module), Objects.
He explains how voting enables one-shot Object recognition and discusses how Modules are arranged in a hierarchy.
Additional discussion focuses on open questions of understanding hierarchy, motor behavior, “stretchy” graph, states, models in “where” columns, and feature discrepancies.
Great question! These terms were a bit fuzzy when we were first thinking about them. Currently, we don’t use the term Monty module. It’s always either sensor module or learning module. The sensor module turns raw sensory input into the Cortical Messaging Protocol (CMP), and the learning module models the incoming (CMP-compliant) data. We are also thinking about adding a third type of module: motor modules, which turn CMP-compliant goal states into actions that specific actuators understand.
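To make that division of labor concrete, here is a toy Python sketch of the pipeline. All class, attribute, and method names below are invented for illustration; this is not the actual Monty API, and the "feature extraction" is deliberately trivial:

```python
from dataclasses import dataclass

@dataclass
class CMPMessage:
    """Toy stand-in for a CMP-compliant message: a feature observed at a pose."""
    feature: dict       # e.g. {"intensity": 0.4}
    location: tuple     # (x, y, z) in a common reference frame
    orientation: tuple  # e.g. a quaternion (w, x, y, z)

class SensorModule:
    """Turns raw sensory input into CMP messages."""
    def process(self, raw):
        # Toy extraction: pretend mean intensity is the only feature.
        return CMPMessage(feature={"intensity": sum(raw) / len(raw)},
                          location=(0.0, 0.0, 0.0),
                          orientation=(1.0, 0.0, 0.0, 0.0))

class LearningModule:
    """Models the incoming CMP-compliant data."""
    def __init__(self):
        self.observations = []
    def step(self, msg):
        self.observations.append(msg)  # "modeling" here is just storing
        return msg                     # output stays CMP-compliant

sm = SensorModule()
lm = LearningModule()
out = lm.step(sm.process([0.2, 0.4, 0.6]))
```

The key point the sketch tries to capture is that only CMP messages cross module boundaries, so each module can be developed (or replaced) independently.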
Hi @brainwaves,
is the PDF document from the Numenta website still up to date with all the content in the released videos, or were there some major or minor changes? And another question: do you know which programming language the open-source project will use? Thank you a lot!
But I think the core modules (LMs and SMs) should be implemented in C++ for better performance.
Do you think the current Python implementation is fast enough, @brainwaves?
At the moment our implementation is in the feature-build-out/exploration phase. We’re investigating which approaches work while also aligning with the principles of the Thousand Brains Theory. For that reason, we’ve chosen an expressive language with the kind of core libraries that Python offers.
That said, we do care about how many iterations training takes, how many iterations inference takes, and how much data the system requires to function. We have a set of benchmarks that ensure we’re always improving and that no functional modifications to the code negatively affect our benchmarks. These benchmarks are published in our documentation, so you can check those out once our code is live. Benchmark Experiments
Another aspect of performance is that, compared to deep learning, Monty requires orders of magnitude less data/memory/CPU to function. So while performance will be critical at some point, we don’t think it will be as all-consuming as the race to make deep learning systems performant.
Lastly, I’d say that because Monty is a modular system composed of Sensor Modules, Learning Modules, and Motor Modules, we can selectively decide to improve the performance of any part of the system. If we wanted to rewrite a learning module in Mojo or C, that would be simple, assuming that learning module uses the CMP to communicate.
Eventually, we think these modules should be rewritten at the hardware level so you can have chips that operate as fast as the state of the art in microprocessors.
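To illustrate why the CMP makes that kind of selective rewriting simple, here is a hedged Python sketch. The protocol and class names are invented; the real CMP carries features at poses rather than the plain dicts used here, and `NativeBackedLM` only stands in for an LM that would delegate to C or Mojo:

```python
from typing import Protocol

class LearningModuleProtocol(Protocol):
    """Anything with a CMP-message-in, CMP-message-out step() qualifies."""
    def step(self, cmp_message: dict) -> dict: ...

class PurePythonLM:
    def step(self, cmp_message):
        return {"vote": "mug", **cmp_message}

class NativeBackedLM:
    """Stand-in for an LM whose step() delegates to native code via FFI."""
    def step(self, cmp_message):
        return {"vote": "mug", **cmp_message}

def run(lm: LearningModuleProtocol, msg: dict) -> dict:
    # The caller never knows or cares which implementation it got;
    # only the message format matters.
    return lm.step(msg)
```

Because the orchestrating code depends only on the message contract, swapping one implementation for another requires no changes elsewhere in the system.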
@brainwaves understood, thanks.
I have just looked at the current version of Monty, and it has almost no relation to any module of the HTM concept, like the Encoder, SP, TM, or grid-cell-based Location Modules.
Why doesn’t Numenta use HTM in the TBP?
The reason we are not using Hierarchical Temporal Memory (HTM) or the Spatial Pooler (SP) is that to start, we wanted to have very explicit and easy-to-visualize / debug representations so we can figure out the overall structure and messaging protocol of this new framework. However, the ideas are not contradictory at all, and we definitely imagine having HTM + grid-cell-based learning modules (LM) and incorporating the spatial pooler. The system is designed so that each component can easily be customized as long as it adheres to the cortical messaging protocol (CMP) we defined, so even today, you can get started implementing an HTM-based LM.
We still have many other research questions to work out where more explicit graph representations are useful, so we are sticking with this LM version for now.
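As a starting point, a custom LM only needs to accept and emit CMP-compliant messages. Below is a very rough, hypothetical Python sketch of what an HTM-flavored LM could look like behind that boundary. All names are invented, and the "sequence memory" is a crude stand-in for illustration, not Numenta's actual Temporal Memory algorithm:

```python
class HTMLearningModule:
    """Toy HTM-flavored LM: learns feature transitions, predicts the next one."""
    def __init__(self):
        self.transitions = {}  # previous feature -> set of observed next features
        self.prev = None

    def step(self, cmp_message):
        feat = cmp_message["feature"]
        if self.prev is not None:
            # Learn the observed transition (crude sequence memory).
            self.transitions.setdefault(self.prev, set()).add(feat)
        predicted = self.transitions.get(feat, set())
        self.prev = feat
        # Output stays CMP-compliant: the observation, plus a prediction.
        return {**cmp_message, "predicted_next": predicted}

lm = HTMLearningModule()
lm.step({"feature": "edge", "location": (0, 0, 0)})
lm.step({"feature": "corner", "location": (1, 0, 0)})
out = lm.step({"feature": "edge", "location": (0, 0, 0)})
```

A real HTM-based LM would replace the dictionary with SDRs and the TM algorithm, but the interface it exposes to the rest of Monty would stay the same.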
I hope that when you wish to improve Monty’s performance, you give consideration to Julia (see Python To Julia For Data Scientists), thereby maintaining Python’s expressiveness but gaining C’s performance (and all of Python’s libraries are easily callable from Julia).
Great video. So much good information. Thank you for posting these!
At about 10:25 Jeff says “the brain is perhaps closer to polar coordinates but using cellular codes”. Is there any paper or article you could point to where this kind of coordinate processing is discussed?
At about 21:20 he mentions it again I think, and also at 47:50, if these are all the same research he’s discussing. This sounds really intriguing.
Hi @Falco, thank you for the question! How the brain uses different coordinate systems is indeed a fascinating topic. While we most often talk about grid cells (video, review paper), which essentially lay out a Cartesian coordinate system, here are a couple of items related to the brain and the Thousand Brains Project:
In addition to place cells, grid cells, etc., the brain has “vector cells” that encode the distance and direction between an animal and something in the environment, such as an object, a boundary, or a goal-relevant target (distance + direction = polar coordinates). Interestingly, vector cells can be allocentric or egocentric, depending on where you find them. For example, object vector cells in the medial entorhinal cortex (MEC) encode the distance and direction to landmarks regardless of the animal’s orientation. In contrast, vector cells in the lateral entorhinal cortex (LEC) encode the distance and direction relative to the animal’s current orientation/head direction.
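As a toy 2D illustration of the polar-coordinate point (this is simple geometry, not a neural model): converting an egocentric vector-cell code to an allocentric one amounts to adding the animal's heading to the egocentric direction. Function names here are invented for illustration:

```python
import math

def ego_to_allo(distance, ego_angle, heading):
    """Convert an egocentric polar code (distance, direction relative to
    the animal's heading) to an allocentric one (direction in world
    coordinates). Angles in radians."""
    allo_angle = (ego_angle + heading) % (2 * math.pi)
    return distance, allo_angle

def polar_to_cartesian(distance, angle):
    """Polar (distance, direction) -> Cartesian (x, y)."""
    return distance * math.cos(angle), distance * math.sin(angle)

# A landmark 2 m away, 90 degrees to the animal's left, while the animal
# itself faces 90 degrees: allocentrically the landmark lies at 180 degrees.
d, a = ego_to_allo(2.0, math.pi / 2, math.pi / 2)
```

The LEC-style egocentric code corresponds to the inputs of `ego_to_allo`, while the MEC-style allocentric code corresponds to its output.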
In the video you referenced, polar coordinates came up in the context of pose. Recall that an object’s pose is defined not only by its location but also its orientation. An object’s location can in principle be represented in either polar or Cartesian coordinates, but defining its orientation is a matter of rotational coordinates. To my knowledge, the neural representation of an object’s rotations is currently unknown, but we can speculate that the mechanisms behind it share commonalities with how spatial problems have been solved elsewhere. For example, it is interesting to think about how animals encode the orientation of their head/body, from flies to bats, as this is relatively well understood. Perhaps something similar is happening when encoding object orientations.
There is a lot of recent literature on vector representations in the brain and how they integrate with the other spatial systems, but hopefully this gives you a direction to look into. Thanks again for your interest, and welcome to the community!
Various programming languages (e.g., C, C++, Elixir, Forth, Julia, Mojo) have been suggested as alternatives to Python for “production” Monty implementations. So, it seems plausible that all or part of the code base may migrate over time. That being the case, I’d like to suggest that Monty’s code be kept relatively free of strongly object-oriented features such as inheritance.
IMHO, well-written functional (or at least procedural) code is easier to port and understand than the equivalent OO code. Of course, proponents of stack-based languages such as Forth will have their own perspective…
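As a toy Python illustration of the point (not code from Monty itself): the functional version keeps its state explicit, which maps directly onto structs-plus-functions in C, while the OO version hides state behind `self`:

```python
class Counter:
    """OO style: mutable state hidden inside the object."""
    def __init__(self):
        self.n = 0
    def bump(self):
        self.n += 1
        return self.n

def bump(n):
    """Functional style: state in, new state out; trivially ported to C."""
    return n + 1
```

Porting `bump` to C is a one-line function over an `int`; porting `Counter` faithfully requires deciding how to represent the object, its construction, and its method dispatch.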