I’ve watched the Brainstorming video and would like to add my own ideas to the brainstorming session, in the hope that people will consider other possibilities as well.
In Jeff’s book, there is a quote where he says that a single neuron can’t do much on its own. But when you put many of them together, magic happens. In the same way, I get the sense that we are trying to cram too much into a single Learning Module and have it do too much. I think we can solve the deformations problem more easily if we use higher level LMs. So instead of one layer of LMs, we would have several, and together they can identify deformations.
So I think a single LM could still recognize a full object by itself. But we don’t have to get a single LM to learn deformations by itself. The system could support deformations only when multiple LMs are working together.
Also, I think Jeff was right in the video: we are missing some key concepts.
We are not thinking about the problem correctly. Here are some ideas I think are missing…
What if a cup is not a cup? We are thinking of a cup as a single standalone object.
But what if it’s not?
Imagine we drop the cup and break it, then we glue it back together. So the cup is now made of multiple shards. It’s not a standalone object anymore. It’s a composite object, made of several sub objects - much like a car.
So what if we think of the unbroken cup also as a composite object? To us as humans it feels like a single standalone object, but does our brain really represent the cup as a standalone object? Guessing… probably not.
Let’s consider this idea for a moment. If the first layer of LMs can identify the parts of the object (for a cup that would be: handle, body, base, rim), then send that information to higher level LMs where their relative location to one another can be processed…
And say the higher level LMs can store a “range” of relative distances between the parts of the object (instead of storing a transformation map)…
For example, the higher LM can store the distance between the cup handle and the cup body as 0 - 5 centimeters. Then it can store the distance between the round rim and the base as 6 - 12 centimeters.
What can this do? I think this type of setup could detect most transformations of a cup and still identify it as a cup. With ranges alone it wouldn’t be able to identify a specific cup (a unique cup), but it could identify the category of the object as a “cup category”.
And for unique objects that are in the real world, it can store the exact distances between the different parts of the cup. That way it could both identify unique objects, and also other objects in that category when seeing them for the first time.
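To make the range idea a bit more concrete, here is a rough Python sketch of what a higher level LM might store. This is only an illustration under my own assumptions, not how any existing LM works: the names `DistanceRange`, `CUP_CATEGORY`, `MY_CUP`, `matches_category` and `matches_instance` are made up, the exact distances for the unique cup and the 0.2 cm tolerance are invented, and I assume the lower level LMs already deliver the distance between each pair of parts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DistanceRange:
    """Allowed distance between two parts, in centimeters."""
    low_cm: float
    high_cm: float

    def contains(self, value_cm: float) -> bool:
        return self.low_cm <= value_cm <= self.high_cm


# Category model: one range per pair of parts (numbers from the example above).
CUP_CATEGORY = {
    ("handle", "body"): DistanceRange(0.0, 5.0),
    ("rim", "base"): DistanceRange(6.0, 12.0),
}

# Unique object model: exact distances (hypothetical values for one cup).
MY_CUP = {
    ("handle", "body"): 1.0,
    ("rim", "base"): 9.5,
}


def matches_category(observed: dict, category: dict) -> bool:
    """True if every part pair in the category was observed within its range."""
    return all(
        pair in observed and rng.contains(observed[pair])
        for pair, rng in category.items()
    )


def matches_instance(observed: dict, instance: dict, tolerance_cm: float = 0.2) -> bool:
    """True if every stored distance is matched within a small tolerance."""
    return all(
        pair in observed and abs(observed[pair] - distance) <= tolerance_cm
        for pair, distance in instance.items()
    )


# A deformed (squashed) cup seen for the first time.
squashed_cup = {("handle", "body"): 3.0, ("rim", "base"): 7.0}
print(matches_category(squashed_cup, CUP_CATEGORY))  # still in the cup category
print(matches_instance(squashed_cup, MY_CUP))        # but not my specific cup
```

The point of keeping two separate lookups is that the same observed distances can answer both questions: "is this some kind of cup?" and "is this the exact cup I already know?".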
Also, I’d like to challenge the idea that morphology is the only thing that matters. Currently we are thinking about “objects at location”. But I think a different approach to consider would be “mass at location”. (By “mass” I mean it in the physics sense, as in “matter”.)
Let’s consider a cloud in the sky… It doesn’t have a fixed shape. It can take any shape. It can even take the shape of a cup if we’re lucky enough haha… So a cloud is just matter, just mass, just water vapor in a higher concentration.
Or we can consider the air itself. It’s invisible, we can’t see air but we feel the wind. If we close our eyes and blow into our palm, we can feel the air pressure on our skin. But we know that the pressure doesn’t mean an object is present there. (Imagine a robot with touch sensors being placed outside in high winds… the robot will think it’s touching objects when in fact it’s only sensing the air pressure on its touch sensors.)
So I’m saying that certain things can be defined by their morphology. But other things cannot.
And I think one of those things is the t-shirt. I believe a t-shirt is not defined only by its shape, but also by its matter type (the material it’s made of - fabric). I could cut a large piece of paper into the shape of a t-shirt, and the AI will think it’s a t-shirt (based on the shape), when in fact it’s not.
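Here is a small Python sketch of the t-shirt example, just to show what I mean by checking matter type in addition to shape. Everything in it is hypothetical: the `Observation` class, the `OBJECT_MODELS` table, the shape labels and the `classify` function are invented for illustration, and treating the cup as defined by shape alone is my own assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    shape: str     # what the morphology looks like
    material: str  # what the matter is made of


# Hypothetical models. "materials" set to None means the object is defined
# by shape alone; otherwise the material must be one of the listed types.
OBJECT_MODELS = {
    "t-shirt": {"shape": "t-shirt outline", "materials": {"fabric"}},
    "cup": {"shape": "cup outline", "materials": None},
}


def classify(obs: Observation) -> Optional[str]:
    """Return the first label whose shape AND material constraints both match."""
    for label, model in OBJECT_MODELS.items():
        if obs.shape != model["shape"]:
            continue
        allowed = model["materials"]
        if allowed is None or obs.material in allowed:
            return label
    return None


print(classify(Observation("t-shirt outline", "fabric")))  # a real t-shirt
print(classify(Observation("t-shirt outline", "paper")))   # paper cut-out: no match
```

With a shape-only check the paper cut-out would be labeled a t-shirt; adding the material constraint is what rejects it.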
Well, my hope is that you will consider these ideas. And if not, I hope that just thinking about these can spark other ideas at least.