Does Monty Need a Motivation to Drive Model Learning?

Hello, I am a beginner just starting to work with Monty, and I have a question regarding the Learning Module Outputs. I noticed that there isn't an explicit motivation, and I am curious whether this implies that the "motion" sensors are mainly used for improving the predictive model. Does this mean that prediction-driven motion stops once the model and the actual data align? In that case, is "building the model" the motivation behind Monty?

I also noticed a question regarding "how to handle moving objects," and I was wondering whether the "motion" sensors should be separated from the "prediction model" to handle and train the model more effectively. This separation would allow Monty to distinguish between movements caused by the passage of time and movements caused by its own motion, making it more capable of handling moving objects.

For example, suppose I give Monty the position of a robotic arm's end-effector as input and the six joint angles as output. The "motivation" would be to move the end-effector to a specified target point (not in a reinforcement learning sense, just as a task). Through this motion, Monty would establish the relationship between the joint angles and the end-effector position (possibly similar to the Jacobian matrix, since the relationship is locally linear, but it could be something else). As Monty uses its model to realize this goal, it would refine the model by detecting discrepancies between the predicted result of an action and the actual behavior.
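To illustrate the kind of relationship I have in mind, here is a rough numpy sketch; this is not a Monty implementation, and `forward_kinematics` is a hypothetical stand-in for the learned joint-angles-to-end-effector model:

import numpy as np

def numerical_jacobian(forward_kinematics, q, eps=1e-5):
    # Finite-difference estimate of d(end-effector position)/d(joint angles).
    x0 = forward_kinematics(q)
    J = np.zeros((x0.size, q.size))
    for i in range(q.size):
        dq = np.zeros_like(q)
        dq[i] = eps
        J[:, i] = (forward_kinematics(q + dq) - x0) / eps
    return J

def step_toward_target(forward_kinematics, q, x_target, gain=0.5):
    x = forward_kinematics(q)   # the model's prediction of where the arm is
    error = x_target - x        # the discrepancy that drives the next movement
    J = numerical_jacobian(forward_kinematics, q)
    dq = gain * (np.linalg.pinv(J) @ error)  # map task-space error to joint space
    return q + dq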

I think this might be achievable, and it is easy to simulate the robotic arm in ROS; the next step might be to use Monty to learn this relationship. After that, Monty could learn the dynamics of moving objects, for example a ball moving in relation to the end-effector, where Monty would need to adjust the joint angles to keep the ball stationary in space. Finally, Monty could learn more abstract nonlinear relationships, like controlling a large magnet that attracts small magnets to specific positions.

I have this question because, in the Thousand Brains theory, motivation is provided by the old brain. The old brain drives behavior based on biological needs like survival and reproduction. I am wondering whether Monty, as a model, would also need some form of “motivation” to guide its learning process and drive its model updates. This could also be important for evaluating the success of the learned world model.

3 Likes

Hey @xiaowenhao, welcome to the forums! Those are good questions. A lot of it depends on whether we're discussing how Monty currently works vs. our long-term vision.

Re. motivation in Monty at the moment:

  • Your general understanding is correct - Monty moves in order to either recognize or build models. In our current experiments, typically the episode ends once the object is recognized, although this is not always the case. This is largely due to constraints around what we actually have implemented.
  • In particular, adding hierarchy, as well as the ability to switch policies, is still a work in progress, so each LM has a pretty simple internal “motivation” driving how it acts in the world. For simplicity, this basically consists of learning about objects, and inferring objects. More specifically, we have lower-level policies that the LM can engage during learning (like systematically moving over the object, i.e. the scan policy), and inference (like rapidly exploring the surface of the object).

Re. motivation in future versions:

  • One could argue that in the absence of any outside drive, an LM's default motivation is to try to predict what it is seeing (inference), and if it doesn't understand what it is seeing, to learn about it. This is coarsely captured with the baked-in policies we currently have, and something like this will probably continue to be present in our LMs in the future.
  • In the long-term however, we imagine that goal-states will generally drive the motivation of an LM, and that these will come from higher-level LMs. For example, an LM that knows about coffee machines might receive the goal-state of, "brewed coffee is in the machine". This might have come from an LM that is trying to reduce the mental fatigue of the agent. The coffee-machine LM might then send goal-states to yet other LMs in order to achieve its driving goal, or it might interface directly with the motor system (see the sketch after this list).
  • In the brain, the top-level driver of motivation is indeed likely to be “old brain” structures like the hypothalamus and basal ganglia. However in Monty, the top-level goal state might be set by an external source, like a human programmer.
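To make that flow concrete, here is a minimal, purely hypothetical sketch of a goal-state being decomposed. The class and field names are illustrative only and are not Monty's actual goal-state API:

from dataclasses import dataclass, field

@dataclass
class GoalState:
    # Illustrative only; not the real Monty data structure.
    description: str                  # the target state of the world
    target_object: str                # the object the receiving LM models
    sub_goals: list = field(default_factory=list)

# A higher-level LM trying to reduce mental fatigue emits a goal-state...
brew = GoalState(description="brewed coffee is in the machine",
                 target_object="coffee_machine")

# ...and the coffee-machine LM decomposes it into goal-states for other LMs
# (or for the motor system directly):
brew.sub_goals = [
    GoalState(description="water reservoir is full", target_object="reservoir"),
    GoalState(description="grounds are in the filter", target_object="filter"),
]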

Re. your comments on moving objects:

  • I would find it helpful if you were to draw out your specific proposal. However, as you may have seen in some of our recordings, we are thinking a lot about separating behavior modeling from morphology modeling. I think this aligns well with your general intuition.
1 Like

There is also a YouTube playlist for all the brainstorming behavior videos (it's a lot of content!).

1 Like

Hello @nleadholm, thank you for the detailed explanation!

Regarding the moving objects, I may not have fully explained my proposal earlier. As I mentioned, the plan is first to learn the movement mapping from the "joint angles" to the "robotic arm," just like a human learning their own body's coordinate system to control their arm. The next step is to learn interactions with the external environment, focusing on time-based movements. For example, imagine a string attached to a ball at the end of the robotic arm. When the arm moves, the ball is pulled by the string, and once the arm reaches a specified position, the ball will continue to swing.

Since the coordinate system of the arm has already been learned, the idea is to use coordinate transformations to determine what the state of the ball should be when the arm moves (as if the ball were rigidly attached to the arm). Since the ball is actually swinging, the difference between this prediction and the observation can be used to learn the motion model of the ball. In other words, the idea is to subtract the sensor changes caused by the arm's own movement, which isolates the ball's motion so that it can be learned.
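As a rough numpy sketch of that subtraction step (the names here are mine, and I'm assuming the learned arm model can report end-effector poses as 4x4 homogeneous transforms):

import numpy as np

def ball_motion_residual(T_arm_prev, T_arm_now, ball_prev, ball_now):
    # T_arm_*: 4x4 end-effector poses from the already-learned arm model.
    # ball_*: observed ball positions as homogeneous [x, y, z, 1] vectors.
    # Where the ball would be now if it were rigidly attached to the arm:
    predicted = T_arm_now @ np.linalg.inv(T_arm_prev) @ ball_prev
    # Whatever remains is the ball's own motion (the swing), which is the
    # signal a motion model of the ball would be trained on.
    return ball_now - predicted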

I'm now working on integrating Monty with ROS. My goal is to add a ROS topic subscription in the SM module to receive sensor data and to extend the motor system to publish to a topic. While I'm not sure if I'll be able to implement this, I believe that once it's working, I will be able to simulate Monty's real-world movements and let it see and touch objects in a simulation environment (for example, using a stereo camera and force sensors at the arm's end). This would also make it easier to move Monty onto a real robot in the future.
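For the subscription side, I'm imagining something like the following ROS 1 (rospy) sketch, which caches the latest sensor message for the SM to poll. The topic name and message type are placeholders:

import threading

import rospy
from sensor_msgs.msg import Image  # placeholder message type

class ROSSensorSource:
    # Assumes rospy.init_node(...) has been called elsewhere.
    def __init__(self, topic="/camera/image_raw"):
        # rospy delivers callbacks on their own thread, so guard the cache.
        self._lock = threading.Lock()
        self._latest = None
        rospy.Subscriber(topic, Image, self._callback)

    def _callback(self, msg):
        with self._lock:
            self._latest = msg

    def latest(self):
        # Returns None until the first message arrives.
        with self._lock:
            return self._latest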

Regarding the brainstorming videos you recommended, I have to admit that I may not yet have enough background knowledge to fully grasp some of the concepts discussed. However, I’ll definitely pay more attention to them and continue to learn. I’m excited to dive deeper into these ideas!

2 Likes

Nice, thanks for expanding that description @xiaowenhao - I think what you describe sounds right: we can use learned models of our own body parts and their behaviors to predict the state of the world. As you say, when there is a prediction error, it can help inform a new model.

We think it's likely that learning models of the body's limbs/actuators is what the classical "motor" cortex is doing. For clarity, this is not something that Monty can currently do, since we're still figuring out dynamic, time-dependent models, which we put under the umbrella term of "object behaviors". That said, it's definitely an exciting question to explore, hopefully in the near future.

In the meantime, the ROS integration sounds really exciting! Please let us know if you run into any particular issues we can support you with while you work on this. I imagine there might also be other members of the TBP community interested in this, in case you want to discuss your project more widely on the forums. Lastly, we recently did an internal robotics hackathon, for which we'll be sharing more material/results soon. While it did not make use of ROS, it might provide some fun inspiration for the kinds of things that Monty can currently do in a robotic setting.

2 Likes

Hi @xiaowenhao, I’m excited to hear about a ROS integration.

Regarding the part about expanding the motor system to publish to a topic, I recently put together the Implementing Actions guide, which might help with deciding where to transition from Monty-specific data structures to a ROS topic.

If I were sending actions to a ROS topic, I would implement a ROSActuator inside a ROSEnvironment implementation. Something like:

class ROSActuator:
    def __init__(self, ...) -> None:
        # ...
        self.topic = ...

    def actuate_look_up(self, action: LookUp) -> None:
        msg = ...
        self.topic.publish(msg)

    def actuate_look_down(self, action: LookDown) -> None:
        msg = ...
        self.topic.publish(msg)

Then the ROSEnvironment would be something like:

class ROSEnvironment(EmbodiedEnvironment):
    def __init__(self, ...) -> None:
        # ...
        self._actuator = ROSActuator(
            topic=...,
            ...
        )
        # subscribe to sensor data
        self.subscriptions = ...

    def _observations(self) -> Observations:
        # either synchronously ask for sensor data
        # or return the last data received from the ROS topic subscription
        # ...
        return observations
  
    def step(self, action: Action) -> Observations:
        action.act(self._actuator)
        return self._observations()

You'll notice that ROSEnvironment.step() takes an Action and returns Observations. This is where there will be a sync/async conflict in the framework. In this example, I call the _observations() method, which can either synchronously request the current state or return the latest sensor data received on the topic. We haven't explored async interactions in Monty yet, so if you do this async, it will be a new mode of operation for Monty.
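For the synchronous option, rospy has wait_for_message(), which blocks until a single message arrives on a topic. A sketch, with the topic name and message type as placeholders:

import rospy
from sensor_msgs.msg import Image  # placeholder message type

def synchronous_observation(topic="/camera/image_raw", timeout=1.0):
    # Sets up a temporary subscriber, returns the next message, then tears down.
    # Raises rospy.ROSException if nothing arrives within the timeout.
    return rospy.wait_for_message(topic, Image, timeout=timeout)

The asynchronous option would instead return the latest cached message from a standing subscription, along the lines of the subscriber sketch earlier in the thread.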

4 Likes

Did you consider using ROCK instead of ROS? I work at an automotive manufacturer, and we work with ROCK.

I guess ROS is a good starting point for rapid prototyping.

Advantages of ROCK over ROS

| Aspect | ROCK (Robot Construction Kit) | Comparison to ROS |
| --- | --- | --- |
| Modularity and Component Model | Stronger enforcement of modular, reusable components with well-defined interfaces through RTT/Orocos. | ROS allows modularity, but with more flexibility (and potential inconsistency). |
| Real-Time Capabilities | Better support for real-time systems via integration with Orocos RTT (Real-Time Toolkit). | ROS 1 lacks real-time support; ROS 2 improves it, but is not as mature as ROCK with Orocos. |
| System Integration | Designed for complex system integration, suitable for long-running, high-reliability applications. | ROS is easier to start with but can be fragile in large systems without additional effort. |
| Component Lifecycle Management | Includes a clear component lifecycle model, making it better suited for production-level robotics. | ROS 1 has limited lifecycle management; ROS 2 introduces it but is still maturing. |
| Tooling for Deployment & Configuration | Syskit and Roby offer advanced deployment, planning, and orchestration tools for complex robots. | ROS has roslaunch, but lacks deeply integrated orchestration. |
| Data Logging and Replay | Has built-in logging, replay, and validation tools at the component level. | ROS has rosbag, but with less deterministic replay. |
| Safety and Determinism | Designed with determinism and safety in mind, which is important in industrial and space robotics. | ROS is generally more research- and prototyping-focused. |
| Industrial/Scientific Use | Popular in aerospace, subsea, and industrial robotics where reliability and determinism are critical. | ROS dominates in academia, prototyping, and service robotics. |
| Language Independence | While mostly C++, it enforces strict interface standards that help multi-language interoperability. | ROS supports multiple languages, but interface consistency can suffer. |


:brain: Summary

  • Use ROCK when you need real-time, high-reliability, safe, and complex system integration—especially in industrial or mission-critical robotics.

  • Use ROS when you prioritize rapid development, community support, and ease of use, especially in research and prototyping.


2 Likes