New Tutorials on Using Monty in Custom Applications

Hi everyone!
Surprisingly, this is my first post here :astonished_face: So far I’ve only been writing responses, so I’m not used to coming up with a headline.

Anyways, the news is that I just added two new tutorials to our documentation! We classified them as advanced tutorials as they dive a bit deeper into how you may actually modify Monty instead of just providing a walk-through of the current code. I wrote them in preparation for our team’s internal robot hackathon, but also because in several posts here, questions came up around using Monty in different environments and applications.

The tutorials walk through what you would want to consider and which code you would need to write to use Monty in your own environment. They give several examples of environments we have implemented in the past, as well as code to follow along. I split the material into two parts: one focused on the general idea of customizing the environment, and a shorter one specifically on how this would be done for robotics applications. I would recommend reading them in order, as the second one builds on the first. Here they are:

I hope this helps people think about the possibilities of Monty and come up with interesting and unique ways to use and test it! If you haven’t seen it yet and want more inspiration, I also recently gave a presentation on future applications that would have a positive impact: https://www.youtube.com/watch?v=Iap_sq1_BzE

Let me know if you have any questions!

  • Viviane
9 Likes

Hi @vclay
Great post!
Unfortunately I don’t know Monty in detail yet (but I keep learning), so I would like to ask about the possibility of a concrete implementation of the robotics application.
Here is the description:

  • we have a 3D environment with several objects.
  • an agent (“robot”) appears in this environment
  • the viewer sees how the agent’s memory starts to fill with representations of these objects
  • the agent can move in this environment and recognize objects if their poses have changed and even if they have become distorted (while preserving their topology)
  • the agent can recognize “live”, i.e. – reactive objects. When it approaches them, they “run away” from it. The agent classifies such objects accordingly in its model and creates a kind of “no-approach zone” around them, which it should not violate.
    As far as I understand, all this is quite possible. But it would be great to get your confirmation. I would be extremely grateful! :folded_hands:
2 Likes

Hi @srgg6701,

Yes, this should all be possible with the exception of object distortions - this is an active research area right now. Once Monty supports it you should be able to update your Monty version and have a system that then deals with distortions.

Currently, there are no facilities in Monty to directly control the location of a robot. Our motor system only moves sensors around, so you would have to write the module that tracks the location of the robot and updates the locations of the sensors on the robot accordingly. A rough sketch of what that bookkeeping might look like is below.
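To make that a bit more concrete, here is a minimal sketch of the kind of bookkeeping I mean (the class and method names are hypothetical, not part of tbp.monty): you track the robot’s pose yourself and derive the world-frame location of each rigidly mounted sensor from it, which you can then report to Monty at every step.

```python
# Minimal sketch (hypothetical names, not part of tbp.monty): keep track of the
# robot's pose yourself and derive each sensor's world-frame location from it,
# so those locations can be reported to Monty at every step.

import numpy as np
from scipy.spatial.transform import Rotation as R


class RobotPoseTracker:
    """Tracks the robot's pose and derives world-frame sensor locations."""

    def __init__(self, sensor_offsets):
        # sensor_offsets: sensor name -> fixed offset in the robot's body frame
        self.sensor_offsets = sensor_offsets
        self.position = np.zeros(3)   # robot position in the world frame
        self.rotation = R.identity()  # robot orientation in the world frame

    def update(self, position, rotation):
        """Call whenever your odometry/localization produces a new robot pose."""
        self.position = np.asarray(position, dtype=float)
        self.rotation = rotation

    def sensor_states(self):
        """World-frame pose of every sensor, ready to hand to your Monty wrapper."""
        return {
            name: {
                "location": self.position + self.rotation.apply(offset),
                "rotation": self.rotation,  # assumes rigidly mounted sensors
            }
            for name, offset in self.sensor_offsets.items()
        }


# Example: a depth camera mounted 10 cm above the robot's base.
tracker = RobotPoseTracker({"depth_cam": np.array([0.0, 0.1, 0.0])})
tracker.update([1.0, 0.0, 2.0], R.from_euler("y", 90, degrees=True))
print(tracker.sensor_states()["depth_cam"]["location"])
```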

We’d suggest that this is all built in simulation to begin with so you can build and test without the added complexities of robotics.

To your points:

  • we have a 3D environment with several objects - an agent (“robot”) appears in this environment
    • :white_check_mark: you can use HabitatSim.
  • the viewer sees how the agent’s memory starts to fill with representations of these objects
    • :white_check_mark: there are visualization options for what the learning module has stored. You may need to write custom code to visualize all of the objects in a learning module at once.
  • the agent can move in this environment and recognize objects if their poses have changed and even if they have become distorted (while preserving their topology)
    • :white_check_mark: Yes, but not distortions yet :cross_mark:, and, as mentioned, your code will have to manage the location of the robot.
  • the agent can recognize “live”, i.e. – reactive objects. When it approaches them, they “run away” from it. The agent classifies such objects accordingly in its model and creates a kind of “no-approach zone” around them, which it should not violate.
    • :white_check_mark: I think I understand this, but apologies if this is incorrect: as long as the sensors give you depth information (e.g. a camera with a depth sensor), the resulting object detection will inherently give you distance. Currently, you would have to track the changes in the object’s distance to understand whether it is coming towards you and adjust the robot’s position accordingly. Your code will also have to manage the zones the robot is not allowed into. A rough sketch of this kind of distance tracking is below.
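To illustrate that distance tracking, here is a rough sketch (everything here is hypothetical glue code, not tbp.monty API): keep a short history of the observed distance to an object and flag it as approaching when the distance keeps shrinking; your policy can then steer away.

```python
# Rough sketch (hypothetical, not tbp.monty API): flag an object as "approaching"
# if its observed distance keeps decreasing over a short window of observations.

from collections import deque


class ApproachDetector:
    def __init__(self, window=5, min_drop=0.05):
        self.history = deque(maxlen=window)  # recent distances in meters
        self.min_drop = min_drop             # required decrease to count as approach

    def observe(self, distance_m):
        """Record the latest distance; return True if the object is closing in."""
        self.history.append(distance_m)
        if len(self.history) < self.history.maxlen:
            return False
        return (self.history[0] - self.history[-1]) > self.min_drop


detector = ApproachDetector()
for d in [2.0, 1.9, 1.7, 1.5, 1.2]:
    approaching = detector.observe(d)
print("object approaching:", approaching)  # True -> trigger your avoidance policy
```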

Hope this helps, and good luck!

1 Like

Hi, @brainwaves!
Yes, that was indeed helpful. If you don’t mind, I’d like to clarify a few more really important questions.
1. Timeline for implementing the distorted object identification feature.
I’ve checked your roadmap and watched two videos where your plans were discussed (2025/04 - Q2 Roadmap and 2025/04 - TBP Future Applications & Positive Impacts). Is my understanding correct that the implementation of this feature was planned for Q2? Here (https://youtu.be/Iap_sq1_BzE?t=568), Viviane mentions the problem of deformed objects, which, according to the video, is part of the compositional objects topic, and work on that was, in turn, planned for Q2.


The current timeline is somewhat different. As far as I can tell, the implementation is now planned to be completed by August of this year.

If I’m not mistaken, the situation should be relatively clear by now: will the team meet this deadline, or will it be extended?
And let me explain why I am so concerned about your timeline (and other things as well). The fact is, I greatly value what Jeff and your entire team are doing. I’ve actually been following this project for over twenty years (though I still find it hard to believe that twenty years have passed). I was very worried about whether Jeff would succeed in achieving what he has dedicated his life to. After reading your latest publication, I am practically certain that everything will work out. In this regard, I had an idea to organize a small team (most likely three people) to try and implement the Monty approach in a specific application. I have already roughly described what it will be, but I have a more detailed presentation that I would be happy to share if you are interested.
However, to create something, one needs to be sure what exactly can be realistically demonstrated to an interested audience. What I described is, in my opinion, the necessary minimum for an agent that we can call “truly intelligent.” This is very important, as the essence of the entire project is to demonstrate the conceptual advantages of sensorimotor AI based on Monty over current ML approaches. I am absolutely convinced that such a demonstration will be an important step in shifting the entire AI paradigm in the context of its potential for AGI-level implementation.
2. Status of Monty tools.
Viviane mentioned plans to create a Monty platform. This puzzled me somewhat, and it’s probably a terminology issue. I had assumed that TBP Monty could already be considered a platform since you provide source code, API descriptions, and so on.

I can assume that this refers to the creation of user interface tools that allow for the implementation of various functions and components, something like AWS, Azure, etc. Is that correct?
3. Article on DMC.
Viviane mentioned a certain article on DMC several times, but I can’t find it anywhere. I would be extremely grateful if you could provide its title or a link.
Thank you in advance!

3 Likes

Hi @srgg6701!

@vclay is going to get back to you on 1 and 2.

I’ll take question 3 though - the DMC (Demonstrate Monty Capabilities) paper is not yet out, so that’s why you can’t find it. :smiley: It’s imminent, and very exciting, so stay tuned!

2 Likes

Hi @srgg6701

Thanks for writing and having such a detailed look at our roadmap and project! I am excited to hear that you are considering putting together a small team to use Monty in an application. I guess one general note to make before getting to your questions is that Monty does not yet implement all the aspects of the Thousand Brains Theory, and we still have open questions on the theory side (including how to model object distortions, behaviors, and how to use models to manipulate the world). This means that if you want to start building an application today, you will have to limit it to tasks that don’t require those capabilities or help contribute to the code base (which would also be awesome!). As examples, we will soon publish some projects we implemented during a recent team-internal robot hackathon (using Monty with LEGO + Raspberry Pi, a drone, and ultrasound).

To your questions:

  1. In the slide you are referring to, I had marked ‘(+Distortions)’ in blue. The blue text in that presentation was meant to indicate capabilities that aren’t directly covered by those larger milestones, but that we hope will come out naturally from the solutions we come up with. I’m sorry to say that we currently don’t have a finalized plan on how Monty will model object distortions (or how the brain does it), but we have been talking about it a lot in our latest research meetings, so hopefully we will come up with a good solution soon.
    The “compositional objects” milestone on our research roadmap is not meant to include modeling object distortions. However, given our theory of how hierarchy works in the neocortex, we think that learning compositional objects will already cover a lot of cases of distorted objects (potentially all, but we need to test this). I’m not sure how interested you are in this, but in a nutshell, the idea is that the parent object will store the orientation and scale of the child object on a location-by-location basis. If there is a logo on a cup, we don’t just store one location of where the logo is on the cup. The logo-cup model in the parent column will store the logo feature and its pose at many locations. The logo could bend in the middle, and we would just store a different location and orientation for the logo on the parent model for those parts of the logo. (A toy illustration of this location-by-location idea follows after point 2 below.)
    In a few months, we will hopefully have a better idea of how well our hierarchy solution works for all kinds of object distortions and whether we need another mechanism (related to object behaviors that distort an object).
    In regard to the compositional models milestone, we will likely need at least until the end of Q3 to have a fully tested and integrated version of what we plan to add to Monty for this. The DMC paper you have heard so much about has taken more of our time than anticipated (but it was well worth it, as you will hopefully see soon!). However, there is already basic support for hierarchy in Monty (you can stack LMs) so if you want to start playing around with that feature you could. If you can give some more details on your intended application, I could also maybe give you my thoughts on whether you could solve it without a hierarchy.

  2. You are right, Monty is basically the beta version of the platform mentioned here. Over the next months, we plan to gradually turn it into a more mature and stable code base and move more prototype or unused research code out of it. The idea is that tbp.monty will evolve from a research code base into a stable and easy-to-use platform that people can quickly use out of the box for their applications. We still have a way to go to get there, and we appreciate any feedback (and contributions) if you start using it now and run into issues.
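To picture the location-by-location idea from point 1 above, here is a purely illustrative toy structure (this is not how Monty stores models internally): the parent “logo-cup” model keeps the child object’s identity together with a pose at many parent locations, so a bent logo is just different child poses at different locations.

```python
# Purely illustrative (not Monty's internal representation): a parent model that
# stores the child object's identity and pose at many of its own locations.

from dataclasses import dataclass


@dataclass
class ChildReference:
    object_id: str      # e.g. "logo"
    orientation: tuple  # child orientation relative to the parent at this location
    scale: float


# parent location on the cup -> child object observed there, with its local pose
logo_cup_model = {
    (0.00, 0.05, 0.02): ChildReference("logo", orientation=(0, 0, 0), scale=1.0),
    (0.01, 0.05, 0.02): ChildReference("logo", orientation=(0, 0, 10), scale=1.0),
    (0.02, 0.05, 0.02): ChildReference("logo", orientation=(0, 0, 25), scale=1.0),
    # ...the logo "bends": its orientation changes gradually across parent locations
}
```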

I hope this helps!

  • Viviane
4 Likes

Hi Viviane!
Thanks so much for such a detailed and comprehensive answer. This kind of communication is extremely helpful in understanding Monty, its future direction, and the possibility of contributing to it. Let me respond inline:

we still have open questions on the theory side (including how to model object distortions,

Could you clarify whether “to model object distortions” is conceptually different from “to identify objects that have been distorted”?

…if you want to start building an application today, you will have to limit it to tasks that don’t require those capabilities or help contribute to the code base (which would also be awesome!).

I believe that if I manage to launch this project, it will already be a kind of contribution to the promotion of Monty. As we all know, even the most brilliant idea needs evidence of its applicability. Monty is certainly one of those ideas; perhaps even one of the most brilliant ideas ever conceived in the history of human invention! This means that a demo implementing the strategic advantage that TBP offers will be very important for the public and the AI community to quickly understand where to focus their attention and efforts. Of course, if I have a chance to contribute beyond my project, I’d be happy to do so (I may already have done so, in a way, but as far as I can tell, the problem I was working on is not on your immediate agenda yet).

I’m sorry to say that we currently don’t have a finalized plan on how Monty will model object distortions (or how the brain does it), but we have been talking about it a lot in our latest research meetings, so hopefully we will come up with a good solution soon.
The “compositional objects” milestone on our research roadmap is not meant to include modeling object distortions. However, given our theory of how hierarchy works in the neocortex, we think that learning compositional objects will already cover a lot of cases of distorted objects (potentially all, but we need to test this). I’m not sure how interested you are in this, but in a nutshell, the idea is that the parent object will store the orientation and scale of the child object on a location-by-location basis. If there is a logo on a cup, we don’t just store one location of where the logo is on the cup. The logo-cup model in the parent column will store the logo feature and its pose at many locations. The logo could bend in the middle, and we would just store a different location and orientation for the logo on the parent model for those parts of the logo.

This is extremely interesting to me, so let me ask you a question right away. Perhaps I have misunderstood the context in which you use the terms “hierarchy”, “object”, or “to model object distortions” (though I have of course read everything about Monty’s approach). I have a couple of purely intuitive suggestions, based on everyday experience and addressing the above notion of “identifying objects that have been distorted”. The first is that the key to implementing this capability is a mechanism for “computing” the topology of the object: if the topology is intact, then we recognize the object even when it is distorted (how exactly remains an open question). Jeff mentions Dali’s painting in his book as an illustrative example.
The second suggestion is that we can recognize such a distorted object regardless of whether it is contained in another object or whether we see it, so to speak, in a vacuum each time.

In a few months, we will hopefully have a better idea of how well our hierarchy solution works for all kinds of object distortions and whether we need another mechanism (related to object behaviors that distort an object).

Solving the “distorted objects” problem will definitely be a major milestone in Monty’s development, because it will show that the approach opens the way to implementing real intelligence, something that, to my knowledge, no one has come close to yet. I have no doubt that you can handle it.

In regard to the compositional models milestone, we will likely need at least until the end of Q3 to have a fully tested and integrated version of what we plan to add to Monty for this. The DMC paper you have heard so much about has taken more of our time than anticipated (but it was well worth it, as you will hopefully see soon!).

I can’t wait! :blush:

If you can give some more details on your intended application, I could also maybe give you my thoughts on whether you could solve it without a hierarchy.

Yes, I would really appreciate your expert opinion on how feasible it is. And your guess on how long it might take to implement such a project would be invaluable. You can see the presentation here (OneDrive link: HALS presentation.pptx).

I hope this helps!

It really does and I appreciate it much! :heart_hands:

SY Srgg

Hi @srgg6701

thanks for your kind words! It would definitely be a great contribution to our mission if you could showcase our approach in a practical application. I was going to have a look at what you are planning but when I click your OneDrive link it seems like I don’t have the right permissions for it. Is there another way you could share it? Maybe as a PDF?

Regarding your question on

whether to model object distortions is conceptually different from to identify objects that have been distorted

I don’t think those two are conceptually different. There might be a bit of a distinction depending on the context in which we use them. Sometimes we use the first phrasing in the context of object behaviors that distort the object. So basically, you are observing the distortion as it is happening, as opposed to encountering and recognizing a static object that is a distorted version of a model you previously learned. I am not sure yet how the solutions for modeling these two cases will relate to each other.
I think we made some interesting progress on that question in our last research meeting (which will appear on YouTube soon), but no definitive answers yet. Your idea about somehow representing the preserved topology on distorted shapes is right on track (although we don’t know how this is done mechanistically yet). I agree that finding answers to those questions will be a major leap in Monty’s development and our theory :slight_smile:

Best wishes,
Viviane

1 Like

Hi, @vclay

HALS presentation.pdf (1.8 MB)

I was going to have a look at what you are planning but when I click your OneDrive link it seems like I don’t have the right permissions for it. Is there another way you could share it? Maybe as a PDF?

Sure! The presentation is attached. Please don’t hesitate to suggest any improvements or share ideas — I’d greatly appreciate it! :pink_heart:

I agree that finding answers to those questions will be a major leap in Monty’s development and our theory :slight_smile:

Absolutely! If the outcome of this discussion is already reflected somewhere in your code, I’d be very interested to know where exactly. And of course, I’m looking forward to the video demonstrating what I assume is a major milestone in Monty’s development. :flexed_biceps: :brain: :mechanical_arm: :eyes: :dizzy:

SY Srgg

1 Like

Hi @srgg6701

thanks for sharing the presentation. It looks super interesting! I’d be especially interested in learning more about the right side of the “theoretical foundations” slide (the two-level architecture and how it is influenced by Damasio’s and Solms’ work).

In terms of the demo you describe, it sounds like you are interested in modeling object behaviors, not object distortions (or maybe both?). At least, learning and recognizing the movement of the cleaning robot and of the cat is what we would call a behavior model. Either way, we don’t have any code in Monty yet for modeling behaviors or distortions (you could model distortions by quickly learning a new model of the distorted object), but hopefully we will have a clearer idea of how to add those capabilities soon.

For now, though, I think you could already do a lot of the things described in your demo. Monty should already be able to quickly learn a model of the warehouse, recognize partially occluded objects, and get a quick model of unexpected objects to navigate around. It should be able to do so without large amounts of data or labels. We are currently working on some capabilities for modeling objects in a compositional scene, so this should help as well.

As far as the cat, since it doesn’t sound like you need an explicit model of its exact behavior, you could simply have a motion detecting SM and use its information in your policy to avoid that area (not getting the LM involved at all). A rough sketch of this idea is below. If you actually need to make accurate predictions about the movement (like the trajectory of the cleaning robot), you will need to wait until we add the behavior model functionality (likely not this year). But I think even without that capability, there would already be enough to implement, and it would already be a cool first demo :slight_smile:
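For what it’s worth, here is a very rough sketch of that idea (the class and function names are made up for illustration, not tbp.monty API): a motion-detecting SM that flags large frame-to-frame depth changes, and a trivial policy hook that backs off when it fires, without involving an LM.

```python
# Very rough sketch (hypothetical names, not tbp.monty API): a "sensor module"
# that flags motion between consecutive depth frames, and a policy hook that
# steers away from flagged regions without involving a learning module.

import numpy as np


class MotionDetectingSM:
    """Flags frames whose depth changed a lot compared to the previous frame."""

    def __init__(self, threshold_m=0.05):
        self.threshold_m = threshold_m
        self.prev_depth = None

    def step(self, depth_image):
        moving = False
        if self.prev_depth is not None:
            moving = np.any(np.abs(depth_image - self.prev_depth) > self.threshold_m)
        self.prev_depth = depth_image.copy()
        return bool(moving)


def avoidance_action(motion_detected, default_action):
    """Trivial policy hook: back off if the motion SM fires, otherwise proceed."""
    return "move_backward" if motion_detected else default_action


sm = MotionDetectingSM()
frame1 = np.full((4, 4), 2.0)
frame2 = np.full((4, 4), 1.5)   # something moved 0.5 m closer
sm.step(frame1)
print(avoidance_action(sm.step(frame2), "move_forward"))  # -> "move_backward"
```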

I hope this helps!
Best wishes,
Viviane

1 Like

Hi, @vclay!

I’d be especially interested in learning more about the right side of the “theoretical foundations” slide (the two-level architecture and how it is influenced by Damasio’s and Solms’ work).

Earlier this year I finished writing a paper; it is currently undergoing peer review at Springer. The preprint is publicly available.
If you are interested in this topic, feel free to leave any comments in the edited file.
I think it’s worth making a couple of clarifications. The problem discussed in the paper will become truly relevant when we can talk about creating AGI. There is already a lot of noise around this topic, but I personally share Jeff’s skepticism, in the sense that it is unlikely to be achieved through current mainstream approaches (ML). It is certainly possible that they will create some sophisticated system that simulates truly intelligent behavior without being so in essence (and they are trying to achieve this in good faith). It is also possible that such a system will cause significant harm, but that would be a different problem from the problem of AGI friendliness. Nevertheless, sooner or later, those who are on the right path will have to think seriously about how to make AGI goal-setting safe. Jeff also mentions this and suggests that we may need to think about replicating motives similar to those generated by the old mammalian brain (particularly in Chapter 10 / 2. Old-Brain Equivalent). The approach described in my paper essentially addresses this same issue. The idea is to:

  1. create a mechanism for producing the primary motives of the AI (what the old brain does),
  2. have the final goal-setting of the entire system determined by the limits of these motives, and
  3. create a mechanism that constrains the very content of these motives.
    The work of Damasio and Solms gives a hint on how to do this.

In terms of the demo you describe, it sounds like you are interested in modeling object behaviors, not object distortions (or maybe both?).

For now, I’m interested in the agent’s ability to recognize objects from a stream of sensory data and assign a special status to reactive objects (and, accordingly, to apply a special policy). The point of the demo is to show that the agent:

  1. Can do without external datasets and navigate the world by building/updating its internal model. The viewer should see how objects from its environment appear in its initially empty world model (tabula rasa).
  2. Distinguishes between static and “reactive” objects.

As far as the cat, since it doesn’t sound like you need an explicit model of its exact behavior, you could simply have a motion detecting SM and use its information in your policy to avoid that area (not getting the LM involved at all).

Yes, that will probably be enough.

If you actually need to make accurate predictions about the movement (like the trajectory of the cleaning robot), you will need to wait until we add the behavior model functionality (likely not this year). But I think even without that capability, there would already be enough to implement, and it would already be a cool first demo.

I hope so too! And lastly, let me return to my previous request. Is it possible to estimate how long it would take to create such a demo, say, for three developers who are fairly skilled but don’t have much Python experience?

SY Srgg

1 Like

Hi @srgg6701

thanks for sharing that paper! Hopefully I will have some time for it soon, it sounds like a relevant topic.

In regards to creating such a demo, it depends a bit on your robotics expertise (I assume you want to do this with a real robot and not in simulation?) and how mature you want the demo to be. We recently did a 1-week hackathon where we built robotics prototypes with Monty (one example here: Project Showcase). We will share videos and more lessons learned soon, but that LEGO + Raspberry Pi project repository README might be a good place to start. One week is certainly not enough to build a demo like you are envisioning (especially since I assume you don’t have as much experience working with Monty as our team had), but you could probably put together a basic prototype in 1-3 months. Again, it really depends on what exactly you want to accomplish and whether you are okay with using some hacky solutions at first.

A good first step could be to go through the tutorials and think through which custom Monty classes you would need to implement and what they would need to do; a rough skeleton of the kind of pieces to think about is sketched below. Also make sure that you have a clear idea of what the sensor inputs would be and how the movement of the sensors in the warehouse space is being tracked. I am also unsure what exactly the task is that you want to achieve. Do you want to learn models of all the individual objects in the warehouse? Do you want to assess how well they and their poses are recognized? Do you just want to learn the spatial layout to be able to navigate the environment? These are just some things to think through that would make it more concrete how much customization of Monty would be required.
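As a starting point for that thinking, here is the kind of skeleton I mean (all names here are placeholders; the actual base classes and method signatures to subclass are covered in the custom-environment tutorial, so please follow that rather than these exact signatures):

```python
# Placeholder skeleton only: the real base classes and hooks to subclass are
# described in the custom-environment tutorial; these names are illustrative.

class WarehouseEnvironment:
    """Wraps your 3D simulation and exposes observations plus an action interface."""

    def reset(self):
        """Place the agent and return the first raw observation (e.g. an RGB-D frame)."""
        raise NotImplementedError

    def step(self, action):
        """Apply a movement action and return the next raw observation."""
        raise NotImplementedError


class DepthCameraSM:
    """Turns raw RGB-D frames into the features + locations a learning module expects."""

    def observation_to_features(self, observation, sensor_pose):
        raise NotImplementedError


class WarehouseMotorPolicy:
    """Chooses the next movement, e.g. explore unknown space or avoid reactive objects."""

    def next_action(self, current_state):
        raise NotImplementedError
```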

Best wishes,
Viviane

Hi, @vclay!
Thanks for your reply. Let me clarify our goal (sorry, I should have done this earlier). At this stage, our goal is formally simple: to demonstrate to those who will finance our project the advantages of Monty’s approach over current approaches used in robotics. Everything should happen in a 3D environment for now. So, the observer should see the following:

  1. An agent appears on the scene.
  2. The state of its memory can be observed, which should display a model of its environment. At first, the memory is empty (“tabula rasa”).
  3. As the environment is investigated, objects present in this environment begin to appear in the agent’s memory.
  4. If objects move, the agent can identify them as those already contained in its model.
  5. The agent approaches a “reactive” object. When the distance between them decreases to a certain point, the latter reacts to this approach (runs away) and gets assigned special attributes like:

That’s basically it.
Initially we wanted to add recognition of distorted objects as well, but for the first stage this is not critical.
:eyes:

Hi @srgg6701

sorry for the late reply, it’s been a busy week!
It sounds like most of your first demo is about visualizing learned knowledge in a nice way. If it is all in simulation, it should avoid a lot of the complexities of real-world robotics. The main challenge that Monty would be solving in this case is points 3 and 4 (learning about objects and recognizing them as they appear in different locations). This should be doable, although we don’t have full support for learning scene representations of whole environments yet. Modeling compositional objects (like scenes) is something we are working on this quarter.

If you want to show Monty’s advantages over deep learning, one other thing I can recommend is this pre-print we just published: https://arxiv.org/pdf/2507.04494 (here is a presentation of the key results: https://youtu.be/3d4DmnODLnE). In it we systematically analyze and demonstrate a number of advantages of Thousand Brains Systems over deep learning, including:

  • Robust object & pose detection
  • Out-of-distribution generalization
  • Data-efficient learning
  • Compute-efficient learning (8 orders of magnitude more efficient than a ViT!)
  • Continual learning
  • Shape bias
  • Symmetry detection
  • Intelligent policies
  • Multisensor collaboration

Maybe this is also helpful in convincing potential investors of the value of this approach.

Best wishes,
Viviane

Hi, @vclay !
I’ve just finished reading the paper “Thousand-Brains Systems, Sensorimotor Intelligence For Rapid, Robust Learning And Inference,” which is actively discussed on your forum. It’s truly a beautifully written work and, indeed, appears to be a major milestone. My sincerest congratulations! :tada:
However, when considering Monty’s applicability in a real-world product, a question immediately arises for me: how will it communicate with the surrounding world? If it’s a robot, it needs to understand human instructions. Alternative approaches (like GR00T N1, which you also refer to in the paper) are already attempting to solve this problem, specifically through a VLM approach (with a ViT, as I understand it, being a core component). NVIDIA (Isaac Lab) and others have achieved tangible results, but ultimately all their proposed solutions boil down to using massive pre-trained models. I share the view that this path is a dead end.

Moreover, I’m not confident that robots built on this approach will ever be cleared for the market. VLMs suffer from the same drawbacks as LLMs: black-box processing, hallucinations, etc. Consequently, it’s unclear how they intend to ensure the safety of the devices being developed. It’s one thing, you see, if ChatGPT starts talking nonsense during a conversation, and quite another if an autonomous robot begins acting erratically in the real world. I’m convinced that Monty’s advantage here could be decisive. Have you already considered how communication between Monty and the user might be implemented? If so, it would be great if you could share your thoughts.
Thank you in advance, and congratulations again on both crucial publications! :carp_streamer: :clap: :partying_face:

3 Likes

Hi @srgg6701 thank you for the kind words!

Communication between Monty and the user is an interesting topic. As we are not yet at the stage where Monty models language, I can just offer some general thoughts on the topic:

  • The only outputs of the brain are actions, whether those are commands to move limbs, commands to move the eyes, or commands to move the muscles of the mouth to produce language or the hand to write or type. All of them are motor commands.
  • Similarly, Monty’s LM outputs go to the motor system, which then outputs actuator-specific motor commands. Since we are not dealing with a biological system, we have a few more options for the kinds of motor systems we can use. You could potentially have a motor system that outputs binary code or tokens. Or it could have pre-learned primitives for how to write letters or sound out words instead of having to learn this from scratch. But this is more about quick, specific solutions, not a requirement.
  • To tell Monty what it should do, we currently imagine using “goal states”. This is a message in the CMP format that specifies the state in which the world should be (e.g. “I want the cup at location x” or “I want the cup to be filled with coffee”). The LMs can then break this high-level goal down into subgoals (e.g., “I need the agent to go to the kitchen”, “I need to start the coffee machine”, …) and use their internal, structured models to figure out how to achieve them by outputting actions. (A toy illustration of a goal state follows below this list.)
  • To specify the highest-level goal state, you could potentially use an LLM that translates a natural language query into a CMP signal. Eventually, Monty should be able to model language itself though and turn language into a goal state (but this is far out on our research roadmap).
  • Side note: Currently, Monty’s only goal is to model the world and infer what it is sensing. This is its intrinsic desire to learn and resolve uncertainty, and not any specific task provided from outside. For its output, we basically insert an electrode into Monty’s brain and measure the representation that an LM outputs (to the next higher level LM, not to the motor system), which is its classification of what object and pose it currently senses (in the form of a CMP signal).
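To make the goal-state idea a bit more tangible, here is a toy illustration (this is not the actual CMP message format in tbp.monty, just a way to picture “a state the world should be in” that LMs could decompose into subgoals):

```python
# Toy illustration only: not the actual CMP format used in tbp.monty.

from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class GoalState:
    object_id: str                                          # what the goal is about, e.g. "cup"
    location: Optional[Tuple[float, float, float]] = None   # desired location
    features: dict = field(default_factory=dict)            # desired properties of the object


# "I want the cup at location x"
top_goal = GoalState(object_id="cup", location=(0.4, 0.0, 1.1))

# "I want the cup to be filled with coffee"
alt_goal = GoalState(object_id="cup", features={"filled_with": "coffee"})

# An LLM (or any other parser) could sit in front of this, translating a natural
# language request into such a structured goal state before handing it to Monty.
```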

Sorry if this goes into more detail than you asked for. The short summary is that in the short term it might be useful to use LLMs as an interface to translate between human language and Monty’s language (CMP). However, in the longer term Monty should also be able to model, understand, and output human language itself (using the same principles as our brains do).
In case you are interested, I wrote a couple more thoughts on language in Monty here: Abstract Concept in Monty - #4 by vclay. You can also check out a discussion on using LLMs with Monty, and the potential challenges, here: Hello, and thoughts about leveraging LLMs in Monty - #3 by avner.peled.

Best wishes,
Viviane

4 Likes

Hi, @vclay

Indeed, your response seems more generous than my question — and I like it. :grinning_face_with_smiling_eyes:
This is an extremely interesting — and, I believe, important — subject, especially since language:

  • connects inner representations of the world with external reality and other agents
  • serves as a means to direct the actions of agents (when possible and appropriate)

I left a comment in another topic, as that seems like a more appropriate place for this discussion.

Thanks again for your insightful response!

2 Likes