New Motor Policy Selection Framework

In PR 889, we added a new layer to the motor system and policy framework. Previously, the MotorSystem was configured with a single MotorPolicy. To allow Monty to choose its policy dynamically, we introduced the MotorPolicySelector protocol. Instead of a MotorPolicy, the MotorSystem is now configured with a single MotorPolicySelector, and the selector is expected to contain one or more policies.

For those of you who aren’t planning to work on policies in the near future, the most important thing to be aware of is how this changes experiment configurations. Our benchmark configs have been updated to use a SinglePolicySelector configured with a single policy. You’ll likely need to make the same adjustment if you’re using older configs and want to merge main into your branch.

Beyond that, you can look forward to more elaborate uses in the near future. We are currently working on a new policy meant to handle goals originating from the SalienceSM sensor module. At the same time, we’ll be adding a policy selector that uses this new policy for SM-derived goals and a different policy for LM-derived goals.
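
To give a rough idea of what routing by goal source could look like, here is a minimal sketch. It is not the tbp.monty API: `Goal`, `GoalSourceSelector`, and the two policy functions are hypothetical names made up for illustration, and the actual selector may be structured quite differently.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Goal:
    """Hypothetical goal record; `source` says which module proposed it."""

    source: str  # "SM" for sensor-module goals, "LM" for learning-module goals
    target: str


# A policy here is just a callable that turns a goal into an action description.
Policy = Callable[[Goal], str]


def sm_goal_policy(goal: Goal) -> str:
    # Stand-in for a policy that handles SM-derived goals.
    return f"SM policy: move toward {goal.target}"


def lm_goal_policy(goal: Goal) -> str:
    # Stand-in for a policy that handles LM-derived goals.
    return f"LM policy: move toward {goal.target}"


class GoalSourceSelector:
    """Hypothetical selector that picks a policy based on the goal's source."""

    def __init__(self, policies: Dict[str, Policy]) -> None:
        self._policies = dict(policies)

    def select(self, goal: Goal) -> Policy:
        try:
            return self._policies[goal.source]
        except KeyError:
            raise ValueError(f"no policy registered for source {goal.source!r}")


selector = GoalSourceSelector({"SM": sm_goal_policy, "LM": lm_goal_policy})
goal = Goal(source="SM", target="salient corner")
action = selector.select(goal)(goal)
```

The point of the sketch is only that the choice of policy lives in the selector, so the motor system does not need to know how many policies exist or how they are chosen.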


We’ve updated how the motor system, motor policies, and motor policy selectors interact in Pull Request #915 (“refactor!: policy selector calls policy” by scottcanoe, thousandbrainsproject/tbp.monty on GitHub).

Motor policies are now encapsulated inside motor policy selectors. The motor system invokes the motor policy selector, which, in turn, invokes policies and returns an appropriate motor policy result to the motor system.
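
As a rough illustration of that call chain, here is a minimal sketch. The classes below are toy stand-ins with hypothetical names and signatures (`ToyPolicyResult`, `ToyPolicy`, `ToySinglePolicySelector`, `ToyMotorSystem`); they are not the actual tbp.monty classes and only show the direction of the calls.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass(frozen=True)
class ToyPolicyResult:
    """Hypothetical result type returned by a policy."""

    actions: List[str]


class ToyPolicy(Protocol):
    def __call__(self, observation: str) -> ToyPolicyResult: ...


class ToyPolicySelector(Protocol):
    def __call__(self, observation: str) -> ToyPolicyResult: ...


class ExplorePolicy:
    """Toy policy that proposes a single action."""

    def __call__(self, observation: str) -> ToyPolicyResult:
        return ToyPolicyResult(actions=[f"explore from {observation}"])


class ToySinglePolicySelector:
    """Toy selector wrapping exactly one policy; it invokes the policy itself."""

    def __init__(self, policy: ToyPolicy) -> None:
        self._policy = policy

    def __call__(self, observation: str) -> ToyPolicyResult:
        # The selector, not the motor system, calls the policy.
        return self._policy(observation)


class ToyMotorSystem:
    """Toy motor system that only knows about its selector."""

    def __init__(self, policy_selector: ToyPolicySelector) -> None:
        self._policy_selector = policy_selector

    def step(self, observation: str) -> ToyPolicyResult:
        # The motor system never touches a policy directly.
        return self._policy_selector(observation)


motor_system = ToyMotorSystem(ToySinglePolicySelector(ExplorePolicy()))
result = motor_system.step("start pose")
```

In this arrangement, swapping in a selector that holds several policies would not require any change to the motor system, since it only ever receives a policy result from the selector.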