Habitat IPC with POSIX shared memory

Hello,

Sharing a small update about the IPC version of Habitat in Monty that I mentioned in a recent post. I wanted to close that chapter with a realistic transport, so I did some extra work.

Before delving into habitat-ipc, I wrote a Python library for RPC over shared memory, with an eye toward using it in Monty. I could not find anything online that I liked, so I built my own. As the work progressed, it became obvious that habitat-ipc would need direct access to the byte transport underneath the RPC API I had created, but I decided to advance Monty toward IPC before getting back to that. Now I have finally come full circle: I adapted the library to expose its shared-memory transport and used it in Monty to replace the multiprocessing.queues transport I presented before. All tests in Monty are now green with this more realistic transport.
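
To give an idea of the general shape of such a transport (this is just a simplified sketch, not the actual library code; the names `write_msg`/`read_msg` are illustrative), a length-prefixed one-slot mailbox over POSIX shared memory can be built with the standard library alone:

```python
import struct
from multiprocessing import shared_memory

HEADER = struct.Struct("<I")  # 4-byte little-endian payload length

def write_msg(shm: shared_memory.SharedMemory, payload: bytes) -> None:
    """Write one length-prefixed message into the shared segment."""
    if HEADER.size + len(payload) > shm.size:
        raise ValueError("message larger than the shared-memory segment")
    HEADER.pack_into(shm.buf, 0, len(payload))
    shm.buf[HEADER.size:HEADER.size + len(payload)] = payload

def read_msg(shm: shared_memory.SharedMemory) -> bytes:
    """Read back the message currently in the segment."""
    (length,) = HEADER.unpack_from(shm.buf, 0)
    return bytes(shm.buf[HEADER.size:HEADER.size + length])

# The segment must be sized for the largest message you expect to pass.
shm = shared_memory.SharedMemory(create=True, size=700_000)
try:
    write_msg(shm, b"observation bytes from the simulator")
    assert read_msg(shm) == b"observation bytes from the simulator"
finally:
    shm.close()
    shm.unlink()
```

The real transport of course needs cross-process synchronization (semaphores or similar) on top of this so that reader and writer do not race, but the segment-plus-length-header layout is the core idea.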

The only caveat is that I’m not 100% sure this works on macOS. I don’t have a Mac, I work on Linux (WSL, actually), and the best I could do to test the library on macOS was the GitHub macOS runners, a poor proxy for such low-level features. I really don’t want to rent cloud instances for this…

Putting that detail aside for now, on my machine I observed the following:

  • The test suite took 12 minutes, vs. 10 minutes for the original (non-IPC) version. This new version actually spawns new simulator processes instead of just forking them as the multiprocessing.queues version did, but I guess that overhead is diluted across the test run.

  • The randrot_10distinctobj_surf_agent experiment took 587 seconds, vs. 546 for the original version, a negligible difference. I need to profile an experiment again with this pipe; the last time I profiled one, the pain point was inside the LM, if I recall correctly.

  • I had to expand the pipe to ~700 KB to let the biggest message in the tests through.

3 Likes

Having this pipe provides a cool viewpoint on the data flowing through it. Some observations:

During tests:

  • 111 simulators are created.
  • the full incoming traffic from those is 1,723,306,602 bytes = 1.7 GB.
  • the traffic in the other direction, from Monty, amounts to 2.25 MB.
  • the exact size of the largest response from a simulator is 614,638 bytes.

The experiment randrot_10distinctobj_surf_agent:

  • creates only one simulator as expected,
  • sends 1,170,607,357 bytes (1.17 GB) to Monty,
  • and sends 818,350 bytes to the simulator.

This makes me wonder whether there is room to consider an efficiency metric based on the amount of information extracted from the world during an experiment. In this case, given that the experiment took 587 seconds (on my PC), that would come out to about 2 MB/s. Not sure there is anything in biological reality to compare numpy arrays with, though :slight_smile: but perhaps it can help in discussions related to scaling.
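
For reference, that figure is just back-of-the-envelope arithmetic: bytes sent from the simulator to Monty, divided by wall-clock time:

```python
sim_to_monty_bytes = 1_170_607_357  # traffic from the simulator to Monty
wall_time_s = 587                   # experiment duration on my machine

throughput_mb_s = sim_to_monty_bytes / wall_time_s / 1e6
print(f"{throughput_mb_s:.2f} MB/s")  # prints "1.99 MB/s"
```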

3 Likes

Hey @nunoo, this is great, thank you for putting it together. It is very encouraging to see that things do not get much slower with a proper transport implemented. Also, the insights you’re gathering from observing the data transfer give us an interesting perspective as well.

I’m not sure when I’ll be able to test this out on a Mac, but maybe someone from the community will get to it before I do.

I know the focus here has been on Habitat, but I also see this as a great example of connecting to any remote environment, with the transport abstracted away. Really cool to see.

4 Likes

Thanks for the support Tristan!

You inspired me to ask Claude to create a transport for ZeroMQ. It’s a bit slower (about 10% on the randrot_noise_10distinctobj_dist_agent experiment), but nothing drastic. And it’s 100% Mac-friendly :grin:, so that’s one less barrier.
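
For anyone curious, a ZeroMQ request/reply transport boils down to something like this (a simplified sketch with pyzmq; the endpoint, names, and wire format here are illustrative, not the actual generated code):

```python
import threading
import zmq

def simulator(endpoint: str) -> None:
    """Toy simulator side: answer one request with an observation."""
    ctx = zmq.Context.instance()
    sock = ctx.socket(zmq.REP)
    sock.bind(endpoint)
    request = sock.recv()                  # action bytes from Monty
    sock.send(b"observation:" + request)   # observation bytes back
    sock.close()

endpoint = "tcp://127.0.0.1:5555"  # arbitrary local port for the sketch
t = threading.Thread(target=simulator, args=(endpoint,), daemon=True)
t.start()

# Monty side: send an action, block until the observation comes back.
ctx = zmq.Context.instance()
sock = ctx.socket(zmq.REQ)
sock.connect(endpoint)
sock.send(b"move_forward")
print(sock.recv())  # b'observation:move_forward'
sock.close()
t.join()
```

REQ/REP enforces exactly the strict action-then-observation alternation this loop needs, and since it runs over TCP, the simulator could even live on another machine.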

Step by step, we’ll get there :wink:

4 Likes