Monty Audio Bot on Phone

I'm trying to think of how to use my phone to run an always-on process with its existing sensors running Monty, to test its continuous learning capabilities.

I was thinking the lowest-power, minimal set of sensors would be the built-in mic (multi or stereo?), the accelerometer, and GPS. That would perhaps be enough to model sound-producing objects interacting throughout the world.

I guess I’m just looking for feedback on some questions:

Can audio as sensor data be enough to model things? (I know a lot of the team has been using vision and touch as the main modalities when thinking about complex sensors.) You can't really send motor commands to a mic, so it's unclear whether this is even a good primary sensor compared to a finger or a saccading eye. Maybe you could move the whole phone to sense whether the audio is coming from the left or the right side, but that would require motor control over the device moving in space, which is now a full robot.

What would the input to the sensor module be? Would I have to feed it FFTs of short segments, or a wavelet transform, as a higher-level input?
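
To make that question concrete, here is roughly the kind of preprocessing I have in mind: slice the mic stream into short overlapping frames and take an FFT of each, so the sensor module would receive one spectral feature vector per time step instead of raw samples. This is just a numpy sketch; the frame length, hop size, and log-magnitude features are placeholder choices, and a wavelet transform or mel filterbank could be swapped in at the same spot.

```python
import numpy as np

def spectral_frames(samples: np.ndarray, frame_len: int = 1024,
                    hop: int = 512) -> np.ndarray:
    """Slice a mono audio stream into overlapping windows and return one
    log-magnitude spectrum per window (shape: n_frames x n_bins).

    Placeholder preprocessing -- a mel filterbank or wavelet transform
    could be swapped in here instead of the plain FFT.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + max(0, (len(samples) - frame_len) // hop)
    frames = np.stack([
        samples[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    spectra = np.abs(np.fft.rfft(frames, axis=-1))
    return np.log1p(spectra)  # compress dynamic range

# One second of fake mic input at 16 kHz -> 30 feature vectors of 513 bins.
features = spectral_frames(np.random.randn(16_000))
print(features.shape)  # (30, 513)
```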


Hi @Jeremyll,

That’s an interesting question. My answer would be: it depends on what you want to model.
You can definitely use sound as a sensory modality for Monty, but it needs to be coupled with location information. You could extract a rough location of a sound from the temporal delay between two microphones (like our two ears), or from the change in sound as you move one microphone around, but I am not sure how easy this is to do with a phone (a rough sketch of the delay idea is at the end of this reply).
I am also unsure what kind of sound objects you would learn from this data. Do you have a concrete example of what you would want to recognize, and how sound at different relative locations plays a role there?
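
In case it helps, here is a minimal sketch of the delay idea, assuming two synchronized microphone channels and a known spacing between them (both of which may be optimistic on a phone): cross-correlate the channels, take the lag of the correlation peak as the time delay, and convert that delay into a rough left/right angle under a far-field model. The sample rate, mic spacing, and function name below are placeholders of mine, not anything from Monty.

```python
import numpy as np

def direction_from_stereo(left: np.ndarray, right: np.ndarray,
                          sample_rate: int = 48_000,
                          mic_distance_m: float = 0.15,
                          speed_of_sound_m_s: float = 343.0) -> float:
    """Estimate a rough azimuth (radians, 0 = straight ahead) from the
    time delay between two microphone channels.

    Simple far-field model; with the small mic spacing on a phone the
    angular resolution will be coarse.
    """
    # Cross-correlate the channels; the lag of the peak is the sample delay.
    corr = np.correlate(left, right, mode="full")
    lag = int(np.argmax(corr)) - (len(right) - 1)
    delay_s = lag / sample_rate

    # Far-field geometry: delay = mic_distance * sin(theta) / speed_of_sound.
    sin_theta = np.clip(delay_s * speed_of_sound_m_s / mic_distance_m, -1.0, 1.0)
    return float(np.arcsin(sin_theta))

# Toy check: the "right" channel is the "left" channel delayed by 10 samples,
# so the estimated angle should land off to one side, not straight ahead.
signal = np.random.randn(4800)
print(np.degrees(direction_from_stereo(signal, np.roll(signal, 10))))
```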