I started a personal project a couple of years ago, before learning about the Thousand Brains model in July of last year. The project could be thought of as a digital brain, composed of virtual “neurons” visualized like so:
The connections here are recent message-passing channels, since my “neurons” are actor-model actors. The actors/neurons themselves are all explicit code (although some of them encapsulate Whisper for transcription or Rasa open source for entity extraction). I’ve seen several similar projects centering LLMs, but I’m trying to center knowledge via my existing personal knowledge management system instead.
My explicit code is primarily in Scala, using Akka 2.6’s Behaviors DSL to create behaviors that send and receive messages and do things like call APIs, but we need more than just behavior, don’t we? From Jeff’s book:
The secret ingredient, if you will, is that intelligence is created through thousands of small models of the world, where each model uses reference frames to store knowledge and create behaviors.
My small models / knowledge are stored as a Markdown wiki, as hinted at by the purple icons (daily notes and lists/hubs). I have tens of thousands of Markdown notes, many of which are linked “atomic” notes, but only around 100 virtual neurons deployed for day-to-day use. Getting to “thousands” of these will require lots of tinkering to leverage my Markdown notes for behaviors, so that I can write (and manage) less Scala.
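For a sense of what the Scala side looks like, here’s a rough sketch of one of these “neurons” as an Akka Typed behavior. The names and message types are made up for illustration, not my actual code - the real behaviors do more (API calls, logging, etc.).

```scala
import akka.actor.typed.scaladsl.Behaviors
import akka.actor.typed.{ActorRef, Behavior}

// Illustrative only: a tiny "neuron" that receives observations and
// forwards anything it finds interesting to the next neuron downstream.
object ExampleNeuron {
  final case class Observation(text: String)

  def apply(downstream: ActorRef[Observation]): Behavior[Observation] =
    Behaviors.receiveMessage { observation =>
      // A real neuron might call an API here (Whisper, Rasa, Hue, ...)
      // or consult my Markdown notes before deciding what to pass on.
      if (observation.text.nonEmpty) downstream ! observation
      Behaviors.same
    }
}
```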
I had been meaning to look more into how voting works in Monty, and was lucky to find this comment, which reads in part:
Voting is a method by which one LM receives a summary (in the form of hypotheses on objects and poses) of the observations processed from other LMs
I was worried about sending too many messages, but after that comment, I figured experimentation is better than holding back. In this screenshot, the pink speaker sends a broadcast out to multiple “listeners” who might want to vote on the transcription after receiving it.
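The broadcast itself can be as simple as fanning each transcription out to whichever listeners have subscribed - a minimal sketch (again illustrative, not my actual code):

```scala
import akka.actor.typed.scaladsl.Behaviors
import akka.actor.typed.{ActorRef, Behavior}

object Speaker {
  final case class Transcription(noteId: String, model: String, text: String)

  sealed trait Command
  final case class Subscribe(listener: ActorRef[Transcription]) extends Command
  final case class Publish(transcription: Transcription) extends Command

  // Fan each published transcription out to every subscribed listener.
  def apply(listeners: Set[ActorRef[Transcription]] = Set.empty): Behavior[Command] =
    Behaviors.receiveMessage {
      case Subscribe(listener) =>
        apply(listeners + listener)
      case Publish(transcription) =>
        listeners.foreach(_ ! transcription)
        Behaviors.same
    }
}
```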
Even with relatively few listeners here, I had a problem - short “remind me…” voice notes were sometimes being picked up by the Rasa model for my smart lights, causing them to change randomly on occasion. I could have tinkered with the Rasa training data, but this seemed like a good chance to experiment with voting.
Every transcription has an associated NoteId which the listeners use to coordinate votes. It’s painfully rudimentary right now, but the votes essentially allow listeners to “claim” a transcription via its NoteId. They mostly ignore each other’s votes, but the lights listener doesn’t change the lights if the “remind me” listener already claimed the respective NoteId.
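To give a flavor of what “claiming” means, here’s a toy version of the lights listener’s side of it (simplified, not my actual code): remember which NoteIds other listeners have claimed, and skip those.

```scala
import akka.actor.typed.scaladsl.Behaviors
import akka.actor.typed.Behavior

object LightsListener {
  sealed trait Command
  // A "vote" from another listener (e.g. "remind me") claiming a NoteId.
  final case class Claimed(noteId: String) extends Command
  final case class Transcription(noteId: String, text: String) extends Command

  // claimed: NoteIds already claimed by other listeners
  def apply(claimed: Set[String] = Set.empty): Behavior[Command] =
    Behaviors.receiveMessage {
      case Claimed(noteId) =>
        apply(claimed + noteId)
      case Transcription(noteId, text) =>
        if (!claimed.contains(noteId)) {
          // Only now call Rasa with `text` and, if an intent matches,
          // the Hue API (both omitted in this sketch).
        }
        Behaviors.same
    }
}
```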
Because the lights listener has to call Rasa via Flask, my dead-simple implementation has worked very effectively - the “remind me” listener doesn’t make an HTTP call, so it’s naturally faster, and the lights listener doesn’t need any artificial delay because it’s already slow. It goes against my habits to accept emergent behavior and imperfect code like this, but emergent behavior is exactly what I hoped to get from voting - just much, much later.
(Picture to break up the text - you can see the messages to my smart light bulbs on the right. Voting is not yet visible here.)
There’s one more thing about this implementation that I want to unpack, because every voice note is transcribed twice. The “remind me” listener was originally subscribed only to Whisper large (slow, accurate) and not base (fast, inaccurate), because speed wasn’t important there. The lights listener, on the other hand, subscribes to both, and simply makes both calls to the Hue API if both transcriptions match in Rasa. This way it’s fast, and the accurate transcription usually corrects the fast one as needed. It also meant, though, that voting only worked if I simplified away the fast-but-corrected behavior, which I tried and disliked.
So I updated the “remind me” listener to listen for both transcriptions as well, and everything has worked remarkably well since. It’s obviously not perfect, but in practice, for my personal project, it has been flawless. It’s funny: I was initially disappointed not to have a great Monty analog for sampling over time, but a fast transcription followed by an accurate one turned out to be perfect, and not too complicated.
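A simplified sketch of that shape (not my actual code): the fast transcription creates an entry keyed by NoteId, and the accurate one simply overwrites it when it arrives.

```scala
import akka.actor.typed.scaladsl.Behaviors
import akka.actor.typed.Behavior

object RemindMeListener {
  // One message type for both transcription streams; the flag just records
  // which Whisper model produced it (field names are illustrative).
  final case class Transcription(noteId: String, fromAccurateModel: Boolean, text: String)

  def apply(reminders: Map[String, String] = Map.empty): Behavior[Transcription] =
    Behaviors.receiveMessage { case Transcription(noteId, _, text) =>
      if (text.toLowerCase.contains("remind me"))
        // The fast (base) transcription creates the reminder; the accurate
        // (large) one, arriving later under the same NoteId, overwrites it.
        apply(reminders.updated(noteId, text))
      else
        Behaviors.same
    }
}
```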
When I started this project, part of what I had in mind was thinking more like a scientist - that is, making hypotheses and testing them. Even with my transcribed voice notes and wiki, this is still quite laborious and difficult to scale. The aspect I’m most worried about is de-duplicating hypotheses automatically, which I’d rather try to do with voting than with LLMs.
Some hypotheses could get events every minute (e.g. air quality) or every second (e.g. heart rate), and others possibly less than one event per year. If you’ve heard of “building a second brain”, this might be more like “building a thousand (tiny) brains”, but I really do want to deploy a thousand hypotheses concurrently. I know Monty votes on hypotheses rather than doing anything like my “claim” trick, but I haven’t thought of a good generic model for that yet. I’ve also struggled to come up with a science workflow, or anything like an OODA (observe, orient, decide, act) loop.
I haven’t needed object or pose recognition, but I can see deploying Monty, and perhaps LLMs and similar systems, as “neurons” integrated into my larger system. I do have one neuron that looks for Ollama prompt files and can run them locally, but there’s no conversational implementation, and the models themselves are limited by my hardware.
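For anyone curious, talking to a local Ollama instance is just an HTTP call to its /api/generate endpoint. Here’s roughly the shape of that neuron, stripped down (the model name and hand-rolled JSON are placeholders, not what I actually run):

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}
import java.nio.file.{Files, Path}

// Read a prompt file and send it to a locally running Ollama instance
// (default port 11434). Illustrative only; a real version would use a
// proper JSON library and probably stream the response.
object OllamaRunner {
  private val client = HttpClient.newHttpClient()

  def run(promptFile: Path, model: String = "llama3"): String = {
    val prompt = Files.readString(promptFile)
    val body   = s"""{"model": "$model", "prompt": ${quote(prompt)}, "stream": false}"""
    val request = HttpRequest.newBuilder()
      .uri(URI.create("http://localhost:11434/api/generate"))
      .header("Content-Type", "application/json")
      .POST(HttpRequest.BodyPublishers.ofString(body))
      .build()
    client.send(request, HttpResponse.BodyHandlers.ofString()).body()
  }

  // Minimal JSON string escaping to keep the sketch dependency-free.
  private def quote(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n") + "\""
}
```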
If you read all that, I welcome any comments! I’m curious whether anyone is working on something similar that they’d like to share, or whether there’s a podcast, blog, some code, or anything else you’d recommend I check out. My background is in backend and data (software) engineering, so deep math and ML are often beyond me.