Breakthrough Ideas for Modeling Object Behaviors

As you may have seen in some of our recent brainstorming sessions, our team has spent the past months discussing object behaviors: how they could be modeled and recognized, and the complexities involved. In February, we had a breakthrough idea! We spent the following weeks figuring out whether the idea holds up from a neuroscience and implementation-feasibility standpoint (it seems it does!) and exploring further questions around it. The result is outlined in this document.

This post and the linked materials (including meeting recordings, documents, and diagrams) are intended to establish priority on the idea. Our goal is to establish prior art to prevent others from obtaining patents on these concepts, ensuring they remain freely available for use, research, and development by the broader community. To make it more accessible, this is written in plain English rather than patent lingo, but it may still contain some formalities. We will work hard on putting together easy-to-follow presentations and write-ups in the coming months. You can also find this write-up in our documentation.

If you are interested in in-depth discussions of the ideas presented here, we have published a series of meeting recordings on YouTube. Those are from our research meetings over the past months during which we conceived of the idea, formalized the general mechanism, and discussed its implementation in the brain and in Monty.

You can find the whole Playlist here: https://www.youtube.com/playlist?list=PLXpTU6oIscrn_v8pVxwJKnfKPpKSMEUvU

Over the next weeks, we will add more videos to this playlist as we continue to explore the remaining open questions. For now, you can find the following videos:

  • Brainstorming on Modeling Object Behaviors and Representations in Minicolumns https://youtu.be/TzP53N2LsRs - The first meeting after we had the breakthrough idea outlined in this document (using the same mechanism for behavior models as for object models, but storing changes instead of features). Unfortunately, we don’t have a recording of the light-bulb moment itself (which was quite exciting for everyone present!), as it happened on the last day of an intense in-person brainstorming week.

  • Review of the Cortical Circuit for Object Behavior Models https://youtu.be/Dt4hT4FxQls - A long follow-on meeting the next day where we kept brainstorming about remaining open issues.

  • Behavior Models Review / Open Questions Around Behavior https://youtu.be/LZmEgcTsgUU - We review the mechanism we propose for modeling object behaviors and how it could map onto cortical anatomy. We then go through our list of open questions and discuss some further ideas around them.

  • A Solution for Behavior in the Cortex https://youtu.be/BCXL2Ir_qh4 - I present some new diagrams illustrating our theory and implementation so far, how the new ideas would extend them to model object behaviors, and how remaining questions could be solved. I start out with an overview of the problem space and then present solutions to each of the constraints we formulated. This is a good summary video to start with.

  • Behavior Models in Monty & Potential Solutions to Open Questions https://youtu.be/LocV1X0WH2E - I present a more conceptual view of our proposed solution and how it would map to our implementation in Monty. I then suggest a potential solution to a big open question that remained at the end of the previous meeting (communicating location changes to make correct predictions about object morphology).

To get a big-picture overview of where we are today, start with the last two videos. If you would like to follow our journey and be a fly on the wall as we got to this point, you can start at the beginning of the playlist, which is sorted chronologically.

As always, let us know if you have any thoughts or questions!

  • Viviane

Object Behavior Modeling - Disclosure.pdf (606.5 KB)


I’m only 40 minutes into the first video, but what if each minicolumn could encode a deformable 3D object, for example a sphere? The coffee mug is a punched-in sphere. The stapler is a sphere stretched with sharp corners. A face is one side of a deformed sphere. Topology changes, like a donut, are - something. Once encoded, inputs select different orientations. The outputs are themselves oriented 3D models that can be placed in the current map of 3D space. The object can be rotated by scanning through an input layer. Translation can be an afterthought, and maybe scaling too.


Hi @Don_Elbert! I don’t have any answers but was curious what such an encoding might look like compared to a point cloud. Would it store surface or volumetric information rather than point clouds? I’m imagining a scenario where we store some basis topology, e.g. a mug would be a combination of genus = 0 (sphere, for the cup part) and genus = 1 (torus/donut, for the handle), plus some transformation function that “molds” the sphere-donut into a mug. I have no idea if this is what you meant, though... also, what would be the benefit of storing information in this manner?

(And welcome to the community!)

I was thinking, when Jeff was talking about how lines define the stapler, about how hard it is in the CAD world to store objects as spline curves and meshes. If hinge behavior is just displacements, I think it’s worth considering that the object itself might also be encoded as displacements from some Euclidean ideal. Then the hinge behavior is adding a displacement to a displacement. There could be a unity of architecture with a decoder somewhere else. Anyway, I guess I’m just describing some form of feature-based representation, and perhaps I misinterpreted how objects are stored. They could already be high-dimensional representations constructed from the curves observed by the visual system, but not the curves themselves. Although applying displacements in that space would be hard. Interesting questions, and I think many would say it’s not point clouds, but who knows.
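To make the idea above concrete, here is a hypothetical toy sketch (not anything from Monty's code base, and all names are invented for illustration): a shape is encoded as a radial displacement field over an ideal unit sphere, and a behavior is a further displacement added on top of the shape's displacement.

```python
import math

def base_sphere(theta, phi):
    """The Euclidean ideal: unit radius in every direction."""
    return 1.0

def mug_displacement(theta, phi):
    """Toy shape displacement: 'punch in' the top of the sphere."""
    return -0.5 if theta < math.pi / 4 else 0.0

def hinge_behavior(theta, phi, angle):
    """Toy behavior displacement: bulge one side as the hinge opens."""
    return 0.2 * angle if phi > math.pi else 0.0

def radius(theta, phi, angle=0.0):
    # Displacement on displacement: ideal + shape + behavior.
    return (base_sphere(theta, phi)
            + mug_displacement(theta, phi)
            + hinge_behavior(theta, phi, angle))
```

Under this encoding, "adding a displacement to a displacement" is literal addition of fields, which is what gives the architectural unity the post speculates about.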

Hi @Don_Elbert

I’m not 100% sure I understand what you are proposing, but maybe you find this interesting:

  • Using displacements to represent objects was actually one of the first approaches we tested. You can still find the DisplacementGraphLM in our code base.
  • We have a detailed write-up of the benefits and drawbacks of this approach (compared to the current approach of storing locations) in our documentation.
  • The TLDR is that if you define the object as a set of displacements, you can’t generalize to novel paths on an object. You will always have to move your sensor along the same displacements that you observed during learning. Even though this approach has some other nice benefits (see the write-up for details), this is a very limiting assumption, so we abandoned the approach. We instead moved on to storing features at locations (as it could be implemented with grid and place cells in the brain, and as it is implemented with point clouds in our current code).
  • Defining the behavior model as a sequence of displacements does not come with the same issue, because here we have a fixed sequence. Order matters, and we don’t need to recognize the behavior while moving through the states randomly.
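
The difference the bullets above describe can be shown in a minimal toy sketch (hypothetical code, not Monty's actual API): a features-at-locations model matches observations regardless of the path taken, while a displacement model only matches the exact movement sequence seen during learning.

```python
# Features-at-locations: the model is a set of (location, feature) pairs.
# Any path that visits these locations can match, so novel paths work.
object_model = {
    (0.0, 0.0): "edge",
    (1.0, 0.0): "flat",
    (1.0, 1.0): "corner",
}

def matches_at_locations(observations):
    """Each observation is (location, feature); visiting order is irrelevant."""
    return all(object_model.get(loc) == feat for loc, feat in observations)

# A novel path (different order, skipping a point) still matches.
novel_path = [((1.0, 1.0), "corner"), ((0.0, 0.0), "edge")]
assert matches_at_locations(novel_path)

# Displacements: the model is an ordered list of movement vectors.
# Recognition requires retracing the learned sequence exactly.
learned_displacements = [(1.0, 0.0), (0.0, 1.0)]  # right, then up

def matches_displacements(movements):
    """Movements must reproduce the learned sequence in order."""
    return movements == learned_displacements

assert matches_displacements([(1.0, 0.0), (0.0, 1.0)])      # same path: ok
assert not matches_displacements([(0.0, 1.0), (1.0, 0.0)])  # novel path fails
```

For a behavior model this strictness is a feature rather than a bug: a behavior is a fixed sequence of changes, so requiring the exact order is what we want.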

Let me know if I should go into more detail on any of this or if I totally missed what you suggested.

  • Viviane