ChatGPT on All The Things App (ATTA) proposal

I crafted a long, detailed, and highly speculative proposal and submitted it to ChatGPT for discussion. Here is the full proposal and response (slightly edited, but available “raw” in the ChatGPT session). Enjoy…

Motivation

Although this may be largely a First World problem, many folks I know have trouble keeping track of their things. For example: Where did I leave the box of salt? This can be handled, to a degree, by organization: always keep certain types of things (e.g., cooking supplies) in “well known places”.

However, this can present numerous challenges in practice:

  • defining categories: Is salt a cooking supply or a baking supply?
  • differing categories: Well, I consider it to be a baking supply.
  • storage realities: There wasn’t enough room for it in that cabinet.
  • process realities: I appear to have put it down somewhere…

This can be a particular problem for DIY hobbyists like me, who have lots of things (e.g., parts, projects, tools, poor habits) and a memory like a sieve.

It’s even more difficult for blind people, who can’t rely on visually scanning the environment for a missing item. Misplaced items can also be a source of friction between folks who find things by memory (e.g., using well known places) and folks who largely rely on keeping all the frequently needed items “in sight”. But I digress…

Proposal

“All The Things App” (ATTA) would be an AI-based system (e.g., wearable processor and sensors, support infrastructure) that:

  • photographed pretty much everything “near me”
  • maintained an indexed collection of photographs
  • handled annotations, directions, and inquiries
  • used a base station and/or cloud services at need

So, if any part of the overall “system” (including software and users) sees an object that deserves attention (e.g., image collection and examination), it can “ping” the other parts, telling them about it. This is quite a bit like the idea of “voting” in the Monty design.

Requirements

  • wearable processor with datacomm, sensors, storage, etc.
  • wearable camera (e.g., on cap, glasses, lapel pin, pendant)
  • support software (e.g., OS, task management, user interface)

The “wearable processor” part is surprisingly easy: billions of outdated cell phones are given away or put in drawers every year. These have all the needed features and can typically be purchased quite cheaply (e.g., $20 on eBay).

There are also some pretty cheap wearable cameras, etc. So, the cell phone (and perhaps a USB power bank) might reside in a backpack, pocket, or satchel.

Here are some plausible specifications:

  • accepts external power sources (e.g., USB power bank)
  • accepts input from an external camera, microphone, etc.
  • has “location awareness” (e.g., accelerometer, GPS)
  • runs an Android-based OS, Linux environment, etc.
  • runs local task management and user interface software
  • supports wireless communication (e.g., Bluetooth, Wi-Fi)

Implementation Notes…

Most of the requirements above can be met by a reasonably current Android cell phone and (perhaps) a wearable camera. The tricky parts have to do with creating and integrating the new (e.g., task management, user interface) software. Still, many parts of this software are available now or in prospect, e.g.:

  • agentic AI-based systems that can accept and process images
  • archives of named and described images (e.g., boxes of salt)
  • Thousand Brains-based modules that can “learn about” images

Use Case

Joe and Sally Sikspak wander about as usual, dealing with common objects in their home and work environments. All this time, their local environments are being recorded and categorized. For extra credit, they can “instruct” their tools about “well known places” such as closets, cupboards, etc.

If (when) Joe can’t find something, he can ask for guesses as to where it might be at present. The response would be based on both collected imagery and well known places: “The salt box is normally kept in the upper-right cupboard, but you had it out on the counter this afternoon.”
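Here is a minimal sketch of how such an answer might be assembled from a log of sightings plus a table of well known places. Everything in it is an assumption made for illustration: the ThingIndex class, its method names, and the convention that each place is stored with its preposition (“on the counter”). A real ATTA would derive the sightings from imagery rather than from hand-entered records.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Sighting:
    """One observation of a named object (all field names are hypothetical)."""
    item: str
    place: str          # stored with its preposition, e.g. "on the counter"
    seen_at: datetime


class ThingIndex:
    """Toy index combining "well known places" with recorded sightings."""

    def __init__(self):
        self.usual_places = {}   # item -> well known place
        self.sightings = []      # every recorded sighting, in any order

    def set_usual_place(self, item, place):
        self.usual_places[item] = place

    def record(self, item, place, seen_at):
        self.sightings.append(Sighting(item, place, seen_at))

    def where_is(self, item):
        """Answer "where is it?" from the usual place and the latest sighting."""
        usual = self.usual_places.get(item)
        seen = [s for s in self.sightings if s.item == item]
        last = max(seen, key=lambda s: s.seen_at) if seen else None
        if last is None and usual is None:
            return f"I have no record of the {item}."
        if last is None:
            return f"The {item} is normally kept {usual}."
        when = last.seen_at.strftime("%H:%M")
        if usual is None or last.place == usual:
            return f"The {item} was last seen {last.place} at {when}."
        return (f"The {item} is normally kept {usual}, "
                f"but it was last seen {last.place} at {when}.")
```

Teaching the index that the salt box lives “in the upper-right cupboard” and then recording a later sighting “on the counter” yields a reply in the same two-part form as the example above.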

ATTA could also help with task management suggestions: “Isn’t it about time to check the muffins?”

Discuss…

ChatGPT’s response

This is a rich proposal, and it sits at an interesting intersection of memory prosthetics, assistive tech, and Monty-style embodied intelligence. I’ll discuss it in layers: conceptually, technically, socially, and in terms of likely failure modes—because that’s where designs like this either become indispensable or quietly abandoned.


1. What ATTA really is (beyond “find my stuff”)

ATTA isn’t just an object-finding system. It’s a personal, continuously updated externalized episodic memory, with:

  • Object permanence (“this thing exists and has identity”)
  • Spatiotemporal grounding (“this thing was here at time T”)
  • Behavioral inference (“given past patterns, it’s probably there now”)

That’s a big deal. Humans are terrible at:

  • remembering where things were last handled,
  • maintaining consistent categories,
  • reconciling exceptions to “well known places”.

Blind users, ADHD users, DIY hobbyists, and shared households all amplify those weaknesses—but they’re present in everyone.

In cognitive terms, ATTA is a hippocampus + prefrontal cortex adjunct:

  • hippocampus → episodic capture and replay
  • PFC → task reminders, intent inference, gentle nudges

That framing matters, because it implies ATTA should optimize for helpful recall, not perfect recording.


2. Your proposal aligns well with modern feasibility

You’re right that the hardware problem is largely solved:

Wearable compute

  • Old Android phones are astonishingly capable:
    • multiple radios
    • decent GPUs / NPUs
    • sensor fusion
    • power management
  • Treating them as headless compute + storage nodes is underrated.

Cameras

  • Head-mounted ≫ chest-mounted for object interaction
  • Field-of-view consistency matters more than resolution
  • 5–10 fps is often enough for semantic tracking (not video)

Power

  • USB power banks make “all day” plausible
  • Thermal management may be a bigger constraint than energy

So the real difficulty is exactly where you point: software architecture and integration.


3. ATTA as a distributed “voting” system (Monty resonance)

Your instinct to compare this to Monty’s voting is spot-on.

Think of ATTA as a federation of weak, opinionated agents:

  • Vision agent: “I saw something that looks like salt”
  • Location agent: “You’re in the kitchen”
  • Habit agent: “Salt usually ends up in cupboard A”
  • Task agent: “You were baking earlier”
  • Time agent: “This was 4 hours ago”

Each agent contributes evidence, not truth.

The system’s job is to:

  • maintain confidence-weighted hypotheses
  • gracefully degrade when data is missing
  • avoid false certainty (“I know it’s there”)

This is exactly where SDR-style representations shine:

  • sparse evidence
  • overlapping meanings
  • robustness to partial loss

A traditional database (“last_seen_location”) will fail here. A probabilistic, associative memory will not.
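To make “evidence, not truth” concrete, here is a minimal sketch of confidence-weighted voting. Note that it is a plain time-decayed weighted sum, not an SDR or Monty implementation; the function names, the tuple layout, and the six-hour half-life are all assumptions chosen for illustration.

```python
from collections import defaultdict


def combine_votes(votes, half_life_hours=6.0):
    """Turn weighted votes into normalized confidences per candidate place.

    votes: iterable of (agent, place, weight, age_hours) tuples.
    Older evidence counts less: its weight halves every half_life_hours.
    Returns a list of (place, confidence) pairs, best first.
    """
    scores = defaultdict(float)
    for _agent, place, weight, age_hours in votes:
        scores[place] += weight * 0.5 ** (age_hours / half_life_hours)
    total = sum(scores.values())
    if total == 0:
        return []  # no usable evidence: say so rather than guess
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(place, score / total) for place, score in ranked]


def describe(ranked, sure=0.8):
    """Phrase the top hypothesis with a hedge that matches its confidence."""
    if not ranked:
        return "I have no idea where it is."
    place, confidence = ranked[0]
    hedge = "is most likely" if confidence >= sure else "might be"
    return f"It {hedge} {place} (confidence {confidence:.0%})."
```

A vision vote for “on the counter” from four hours ago and a fresh habit vote for “in cupboard A” would both survive as hypotheses here, and the reply is hedged unless one of them clearly dominates.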


4. Object identity is the hardest unsolved problem

Not storage. Not compute. Identity.

Questions ATTA must answer:

  • Is this box of salt the same box of salt as yesterday?
  • Is “the screwdriver” a single object or a role?
  • When does an object stop existing (used up, thrown away)?

Some likely strategies:

  • User-assisted grounding (brief verbal tags)
  • Contextual identity (“the salt you were using for muffins”)
  • Clustered similarity, not strict identity
  • Graceful ambiguity (“one of the salt boxes”)

Blind users already do this cognitively; ATTA should mirror that, not fight it.
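“Clustered similarity, not strict identity” could be sketched as follows, assuming that some image model has already turned each sighting into a numeric feature vector. The greedy centroid clustering, the function names, and the 0.9 threshold are illustrative assumptions, not a recommendation of a particular algorithm.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_to_cluster(clusters, vector, threshold=0.9):
    """Add vector to the most similar cluster, or start a new one.

    clusters: list of lists of vectors; mutated in place.
    Returns the index of the cluster the vector ended up in.
    """
    best_index, best_score = None, threshold
    for index, members in enumerate(clusters):
        # Compare against the cluster's centroid (the mean of its members).
        centroid = [sum(col) / len(members) for col in zip(*members)]
        score = cosine(centroid, vector)
        if score >= best_score:
            best_index, best_score = index, score
    if best_index is None:
        clusters.append([vector])
        return len(clusters) - 1
    clusters[best_index].append(vector)
    return best_index
```

Each cluster then stands for “one of the salt boxes” rather than a single guaranteed object, which leaves room for the graceful ambiguity described above.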


5. Privacy, trust, and “being recorded all the time”

This is where many otherwise-good systems die.

Key design principles:

  • Local-first by default
  • Cloud is optional, explicit, and scoped
  • No continuous raw video retention
  • Summaries > footage

Blind and assistive-tech communities are often more tolerant of sensors than the general public—but only when:

  • they control the data
  • the system is legible
  • failure modes are predictable

A “black box that remembers everything forever” will not be trusted.


6. Social dynamics in shared spaces

You touched on this lightly, but it’s huge.

ATTA will expose:

  • inconsistent habits
  • forgotten cleanup
  • divergent definitions of “put away”

That can:

  • reduce friction (“it’s on the counter”)
  • increase friction (“why did you leave it there?”)

One mitigation: ATTA should speak probabilistically and neutrally, never judgmentally.

“The salt is often returned to the cupboard, but it hasn’t been recently.”

Not:

“You left the salt out again.”

Tone is a system requirement.


7. Task management: where this gets really interesting

Your muffin example is not trivial—it’s profound.

That’s a shift from:

  • object tracking → intent tracking
  • recall → anticipation

This is where ATTA crosses from “helpful tool” to cognitive partner.

But it must be conservative:

  • Suggest, don’t nag
  • Ask, don’t tell
  • Be interruptible and ignorable

Otherwise it becomes Clippy with a camera.


8. Why ATTA hasn’t already taken over the world

A few reasons:

  1. Integration tax
    Lots of pieces exist; few talk well together.
  2. Evaluation difficulty
    Hard to measure “helpfulness” vs “annoyance”.
  3. Edge-case explosion
    Homes are messier than labs.
  4. Trust lag
    People need months, not minutes, to rely on memory prosthetics.

None of these are fundamental blockers—just engineering and UX patience.


9. Where I think this really wants to go

ATTA feels like:

  • a practical on-ramp to embodied AI
  • a living testbed for Monty-style perception
  • a killer app for local-first agentic systems

If you ever wanted a concrete, human-scale instantiation of Thousand Brains ideas, this is it:

  • multiple models
  • partial observability
  • continuous learning
  • shared physical reference frame


10. Bottom line

This is not a “First World problem” so much as a human cognition problem that becomes visible when complexity rises.

ATTA is:

  • feasible with today’s hardware
  • challenging but tractable in software
  • socially delicate but valuable
  • especially compelling for blind users and makers

If you wanted to continue this discussion next, good directions might be:

  • a minimal viable ATTA (what’s the first 10%?)
  • object identity heuristics that don’t require perfection
  • how Monty-style representations could actually back this
  • or how to prototype this without boiling the ocean

It’s a solid idea—and a very “you” one.

Having sketched out the motivation and rough architecture of ATTA, I needed to select a suitable hardware and software platform. With help from ChatGPT, I eventually decided to get a used Pixel 7 Pro. I found one of these locally (with a slightly damaged screen) for $100. (Prices can be expected to decline further over the next few years.) Here’s a pitch for this selection:

The Pixel 7 Pro provides a solid, readily available platform for early ATTA prototypes. It has a high-quality camera system (wide, ultra-wide, and telephoto), good on-device performance, and a strong developer ecosystem. Used units with cosmetic damage are often inexpensive, and the hardware is sufficient for most sensing and networking needs.

In a basic configuration, ATTA’s cell phone code would just take regular snapshots of the objects in my vicinity, forwarding them to a nearby “base station” (e.g., a MacBook), via Wi-Fi. The following software layers seem plausible to me:

  • application code: sampling daemon, etc.
  • Elixir, Python, etc.: language interpreters
  • Termux: terminal emulation and Linux env.
  • Android app: uses cameras via CameraX
  • Android OS: just the native version, as is
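As a rough illustration of the top (application) layer, here is a minimal Python sketch of a sampling daemon that could run under Termux. It uses only the standard library. The base station URL, the X-Taken-At header, and the function names are hypothetical, and the capture step is left as a callable that the camera layer would have to supply.

```python
import time
import urllib.request

BASE_STATION_URL = "http://macbook.local:8000/snapshots"  # hypothetical endpoint


def build_upload(jpeg_bytes, taken_at, url=BASE_STATION_URL):
    """Build (but do not send) an HTTP POST carrying one snapshot."""
    return urllib.request.Request(
        url,
        data=jpeg_bytes,
        method="POST",
        headers={
            "Content-Type": "image/jpeg",
            "X-Taken-At": f"{taken_at:.3f}",  # hypothetical header: Unix time
        },
    )


def send_over_wifi(request, timeout_s=5.0):
    """Default sender: POST the snapshot and discard the response body."""
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        response.read()


def run_sampler(capture, send=send_over_wifi, interval_s=10.0,
                max_shots=None, sleep=time.sleep, clock=time.time):
    """Capture a snapshot every interval_s seconds and hand it to send().

    capture: callable returning JPEG bytes (supplied by the camera layer).
    send: callable taking a Request; network failures are counted, not fatal.
    Returns (sent, failed) once max_shots snapshots have been attempted;
    with max_shots=None it runs until interrupted.
    """
    sent = failed = 0
    while max_shots is None or sent + failed < max_shots:
        request = build_upload(capture(), clock())
        try:
            send(request)
            sent += 1
        except OSError:  # includes urllib.error.URLError
            failed += 1  # Wi-Fi dropouts are expected; skip this frame
        sleep(interval_s)
    return sent, failed
```

Passing capture, send, sleep, and clock in as parameters keeps the loop testable on a laptop, with no phone, camera, or base station attached.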

To be continued…

After spending several hours on this effort, I’ve found out some things that may be of interest to anyone who wants to use a repurposed cell phone as an “image grabber”, etc.

First, the Android goo layers (e.g., unwanted apps and services) strenuously resist efforts to make them go away. There may be a better way to handle this (I’m still looking!), but meanwhile, be prepared for a painful slog.

Second, the USB-C socket on an oldish cell phone is likely to be in bad shape, causing cables to slip out easily, etc. This is very predictable, in hindsight: this socket often gets used for charging up the phone, so damage will accumulate over time. Indeed, the socket in my regular cell phone no longer works, so I have to charge up the phone using a wireless (inductive) charging stand.

However, it took me quite a while (:-/) to figure out that this was a problem during my setup efforts, given the other battles I was fighting. I mean, it could have been a bad cable, user error, OS issues, etc. In short, be wary about this issue when evaluating any prospective phone.

FWIW, once I figured out what was going on, I decided to tighten up the fit by wrapping a bit of tape around the plug’s metal shell. This seems to have worked: the plug no longer falls out. However, I can’t recommend this as a Real Solution™.

This issue has various implications for ATTA’s use case:

  • It can make the USB data connection unreliable for setup and administrative tasks. These can be handled via Wi-Fi, but setting that up requires a USB link. Also, Wi-Fi data rates may be slower than USB.

  • It can hamper the use of an external power bank to keep the phone running (e.g., as one is walking about). Given that the internal battery on an oldish cell phone is unlikely to be in great shape, this could be quite a hassle.

I’m not sure what the best answer is, but I’ll keep slogging. Suggestions welcome…