April 21, 2010
Ye Gods! I’d better get in quickly with a second installment – I’ve already written more words in replies to comments than there were in my first post. Thanks so much to all of you who have contributed comments already – I only posted it yesterday! I really appreciate it and I hope you’ll continue to add thoughts and observations.
Opening up my thought processes like this is a risky and sometimes painful thing to do, and I know from past experience that certain things tend to happen, so I’d like to make a few general observations to forestall any misunderstandings.
Firstly, I know a lot of you have your own ambitions, theories and hopes in this area, and I’ll do what I can to accommodate them or read your papers or whatever. But bear in mind that I can’t please everybody – I have to follow my own path. So if I don’t go in a direction you’d like me to go, I apologize. I’ll try to explain my reasoning but inevitably I’m going to have to make my own choices.
Secondly, I do this kind of work because I believe I have some worthwhile insights already. I’m not desperately looking for ideas or existing theories – the people who invented these ideas are perfectly welcome to write their own games. This is a tricky area, because I like it when someone says “have you thought of doing XXX?” but I’m not so interested in “have you seen YYY theory or ZZZ’s work?” I just don’t work that way – I prefer to think things through from first principles – and I’m writing this game largely to develop my own ideas, rather than with the pragmatic aim of writing a commercial application by bolting together other people’s.
Lastly, I invariably develop software alone. Nobody has offered to help or asked for this to be open source yet, but I know it’s coming. I don’t do collaborations. Collaborations have driven me crazy (and almost bankrupt) in the past. I know there are loads of people who would love to be part of a project like this, but all I can suggest is that you go off together and write one, because it’s not for me. I’m opening it up because I know people find it interesting and I wanted to share the design process, but I’m not interested in working on the actual code with others. It’s just not my thing.
Oh, and I do realize this is ambitious. I know it may not work. But I’m not as naive as I look, either. I’ve written four commercial games and at least a dozen commercial titles in other fields, so I’m pretty competent in terms of software development and product design. And I’ve been working in AI since the late 1970s. Although it’s only my hobby, strictly speaking, I’m pretty well connected with the academic community and conversant with the state of the art. And I have an existence proof in Creatures, as long as you make allowances for the fact that I started writing it almost two decades ago. So don’t worry that I’m unwittingly being foolish and naive – I already know exactly how foolish I am!
Forgive me for saying these things up front – I really welcome and appreciate everybody’s support, thoughts, criticisms and general conversation. I just wanted to state a few ground rules, because it’s quite emotionally taxing to open up your innermost thought processes for inspection, and the provisional nature of everything can sometimes make it look like I’m floundering when really I’m just trucking along steadily.
Ok, so where to next? The features I mentioned yesterday were all aspects I’d like to see emerging from a common architecture. Jason admonished me to make sure I design a hierarchical brain, in which lower levels (equivalent to the thalamus and the brainstem) are fully functioning systems in their own right – complete brains for simpler animals as well as the evolutionary foundation for higher brain functions. I think this is important and a good point. The reptilian thalamus/limbic system probably works by manipulating more primitive reflexes in the brainstem. The cortex, in turn, unquestionably supervenes over these lower structures: if we deliberately wish to look in a particular direction, we quite probably do it by sending signals from the cortex (the frontal eye fields) down to the superior colliculi in the midbrain, AS IF they were visual stimuli, thus causing the SC to carry out their normal unconscious duty of orienting the eyes towards a sudden movement. And finally, the prefrontal lobes of the cortex seem to supervene over an already functional set of subconscious impulses and motor and perceptual circuits in the rest of the cortex, adding planning, the ability to defer reward, empathy and possibly subjective consciousness to the repertoire. So there are good reasons to follow this scheme myself.
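As a toy illustration of this supervening arrangement (everything here – the function names, the angles, the gain – is my own invented example, not a mechanism from the post), a higher layer can steer a lower-level reflex simply by injecting a signal that the reflex cannot distinguish from a real stimulus:

```python
# Hypothetical sketch: a higher layer drives a lower-level reflex by
# faking the very stimulus that reflex normally responds to.

def orienting_reflex(stimulus_angle, gaze_angle, gain=0.5):
    """Brainstem-style reflex: nudge gaze partway toward a stimulus."""
    return gaze_angle + gain * (stimulus_angle - gaze_angle)

def cortical_look_at(target_angle, gaze_angle):
    """'Cortex' issues a voluntary look by pretending a stimulus exists
    at the target - the same pathway, no new motor machinery."""
    return orienting_reflex(target_angle, gaze_angle)

gaze = 0.0
# A real flash at 40 degrees pulls gaze halfway toward it...
gaze = orienting_reflex(40.0, gaze)   # gaze becomes 20.0
# ...and a deliberate glance at -10 degrees reuses the identical reflex.
gaze = cortical_look_at(-10.0, gaze)  # gaze becomes 5.0
```

The point of the sketch is only that the lower layer stays complete and functional on its own; the higher layer adds behavior without replacing it.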
But for now I’d like to think mostly about the cortical layer of the system. This is (perhaps) where memory plays the greatest role; where classification, categorization and generalization occur; and where prediction and the ability to generate simulations arises. I can assume that beneath this there are a bunch of reflexes and servoing subsystems that provide the outputs – I’ll worry about how to implement these later. But somehow I need to develop a coherent scheme for recognizing and classifying inputs and associating these with each other, both freely (as in “X reminds me of Y”) and causally (as in “if this is the trajectory that events have been taking, this is what I think will happen next”). Somehow these predictions need to iterate over time, so that the system can see into the future and ask “what if?” questions.
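The iterate-into-the-future idea can be caricatured in a few lines: record which states have tended to follow which, then roll the most likely transitions forward to ask “what if?”. The states and the transition table here are a made-up toy, of course, not the proposed cortical mechanism:

```python
from collections import defaultdict, Counter

# Toy predictor: learn "what usually follows what" from experienced
# sequences, then iterate the likeliest transition to peer ahead in time.
transitions = defaultdict(Counter)

def observe(sequence):
    """Count each adjacent pair of states as an experienced transition."""
    for a, b in zip(sequence, sequence[1:]):
        transitions[a][b] += 1

def imagine(state, steps):
    """Roll the learned model forward: a crude 'what if?' simulation."""
    trajectory = [state]
    for _ in range(steps):
        if not transitions[state]:
            break
        state = transitions[state].most_common(1)[0][0]
        trajectory.append(state)
    return trajectory

observe(["dark", "footsteps", "door-opens", "friend"])
observe(["dark", "footsteps", "door-opens", "friend"])
observe(["dark", "footsteps", "silence"])

print(imagine("dark", 3))  # ['dark', 'footsteps', 'door-opens', 'friend']
```

A real system would of course predict over fuzzy patterns rather than discrete symbols, but the loop – state in, anticipated state out, fed back in again – is the same shape.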
Let’s think about classification first. The ability to classify the world is crucial. It isn’t sufficient for intelligence – despite the huge number of neural nets and the like that are nothing but classifier systems – but it is necessary.
Here’s an assertion: let’s assume that the cortical surface is a map, such that, for any given permutation of sensory inputs, there will be a set of points on the surface that come to best represent that permutation.
It’s a set of points – a pattern – because I’m assuming this is a hierarchical system. If you hear a particular voice, a set of points of activity will light up in primary auditory cortex and elsewhere, representing the frequency spectrum of the voice, the time signature, the location, etc. Some other parts of auditory cortex will contain the best point to represent whose voice it is, based on those earlier points, or which word they just said. Other association areas deeper in the system will contain the points that best represent the combination of that person’s voice with their face, etc. Perhaps way off in the front there will be a point that best represents the entire current context – what’s going on. Other points in motor cortex represent things you might do about it, and they in turn will activate points lower down representing the muscle dispositions needed to carry out this action. So the brain will have a complex pattern of activation, but it’s reasonable to assert (I think) that EACH POINT ON THE CORTICAL SURFACE MAY BEST REPRESENT SOME GIVEN PERMUTATION OF INPUTS (INCLUDING CORTICAL ACTIVITY ELSEWHERE).
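To make the “best-representing point” idea concrete, here’s a minimal sketch (grid size, feature count and values are all arbitrary assumptions of mine): each unit on a small patch of “cortex” has a preferred input vector, and any incoming permutation of features lights up whichever unit matches it best.

```python
import numpy as np

# Sketch: a grid of units, each with a preferred input vector; an input
# "permutation" is best represented by the closest-matching unit.
rng = np.random.default_rng(0)
GRID = 8                                 # an 8x8 patch of "cortex"
prefs = rng.random((GRID, GRID, 3))      # each unit's preferred 3-feature input

def best_point(x):
    """Return grid coordinates of the unit best representing input x."""
    d = np.linalg.norm(prefs - np.asarray(x), axis=2)  # mismatch at every unit
    return np.unravel_index(np.argmin(d), d.shape)

voice = [0.2, 0.9, 0.4]                  # some permutation of sensory features
print(best_point(voice))                 # one definite point "lights up"
```

In the hierarchical version, the coordinates of active points at one level would themselves be features feeding the next level up – which is all the capitalized assertion above requires.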
The cortex would therefore be a map of the state of the world. This is a neat assumption to work with, because it has several corollaries. For one thing, if the present state of the world is mapped out as such a pattern, then the future state, or the totally imagined state, or the intended state of the world can simultaneously be mapped out on the same real estate (perhaps using different cells in the same cortical columns). Having such a map allows the brain to specify world state in a variety of ways for a variety of reasons: sensation, perception, anticipation, intention, imagination and attention. Each is a kind of layer on the map, and they can be presumed to interact. So, for instance, the present state and recent past states give rise to the anticipated future state, via memories of probability derived from experience. Or attention can be guided by the sensory map and used to filter the perceptual or motor maps.
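The overlaid-maps idea is easy to sketch as parallel arrays sharing one piece of real estate (the layer names and the gating rule below are illustrative choices of mine, not claims about the real wiring):

```python
import numpy as np

# Sketch: several map layers over the same "real estate", one value per
# point per layer, free to interact - here attention gates sensation.
shape = (4, 4)
maps = {name: np.zeros(shape) for name in
        ("sensation", "perception", "anticipation", "attention")}

maps["sensation"][1, 2] = 1.0   # something is happening at point (1,2)...
maps["attention"][1, 2] = 1.0   # ...and we are attending there
maps["sensation"][3, 3] = 1.0   # something also happens at (3,3)...
maps["attention"][3, 3] = 0.0   # ...but attention is elsewhere

# Attention filters sensation into perception: same points, different layer.
maps["perception"] = maps["sensation"] * maps["attention"]

print(maps["perception"][1, 2], maps["perception"][3, 3])  # 1.0 0.0
```

The same multiplication, pointed at other layer pairs, could just as easily gate an intention map against an anticipation map – which is the attraction of keeping all the layers in register.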
A second corollary might be that SIMILAR PERMUTATIONS TEND TO BE BEST REPRESENTED BY CLOSE NEIGHBORS. If this is true, then the system can generalize, simply by having some fuzziness in the neural activity pattern. If we experience a novel situation, it will give rise to activity centered over a unique point, but this point is close to other points representing similar, perhaps previously experienced situations. If we know how to react to them, we can guess that this is the best response to the novel situation too, and we can make use of this knowledge simply by stimulating all the points around the novel one.
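Here’s a tiny sketch of that generalization trick (the map points, actions and Gaussian width are all inventions for the example): a novel point simply borrows the response of whichever known point falls inside its fuzzy dome of activity.

```python
import numpy as np

# Sketch: responses are known at a few points on the map; a novel point
# generalizes by weighting known points with a Gaussian "dome" of activity.
known = {(2, 2): "flee", (7, 7): "approach"}  # learned responses at map points

def respond(novel, sigma=2.0):
    """Pick the known response whose point is most strongly co-activated."""
    best, best_w = None, 0.0
    for point, action in known.items():
        d2 = (point[0] - novel[0]) ** 2 + (point[1] - novel[1]) ** 2
        w = np.exp(-d2 / (2 * sigma ** 2))   # fuzzy neighborhood activation
        if w > best_w:
            best, best_w = action, w
    return best

print(respond((3, 1)))  # never seen, but near (2,2) -> "flee"
```

If neighbors really do encode similar situations, this is generalization for free: no extra mechanism beyond the blur itself.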
When I say these are points on the cortical surface, I mean there will be an optimum point for each permutation, but the actual activity will be much more broad. I have a strong feeling that the brain works in a very convolved way – any given input pattern will activate huge swathes of neurons, but some more than others, such that the “center of gravity” of the activity is over the appropriate optimum point. I showed with Lucy that such large domes of activity can be used for both servoing and coordinate transforms (e.g. to orient the eyes and head towards a stimulus depending on where it is in the retinal field – a transform from retinal to head-centered coordinates). Smearing out the activity in this way also permits generalization, as above. But it’s a bummer to think about, because everything’s blurry and holographic!
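The center-of-gravity reading of a broad dome can be shown in one dimension (the layout, stimulus value and dome width below are assumptions of the sketch, not Lucy’s actual numbers): many units fire, yet the weighted mean of their positions recovers the encoded location almost exactly.

```python
import numpy as np

# Sketch: a wide Gaussian "dome" of activity over a line of units, decoded
# by its center of gravity - blurry activity, sharp answer.
positions = np.arange(100, dtype=float)   # units laid out along one axis
stimulus = 37.0                           # true location to encode

activity = np.exp(-((positions - stimulus) ** 2) / (2 * 8.0 ** 2))  # broad dome

decoded = np.sum(positions * activity) / np.sum(activity)  # center of gravity
print(round(decoded, 2))                  # ~37.0 despite the blur
```

Shifting every unit’s contribution by a fixed offset before taking the centroid gives the coordinate-transform trick (retinal to head-centered) essentially for free, which is part of what makes the blurry scheme attractive.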
I have some nagging issues about all this but for now I’ll run with it. It’s a neat mechanism, and if biology doesn’t work this way then it damn well ought! It’s a good starting point, anyway. Lots of things fall out of it.
And I already have a mechanism that works for the self-organization of primary visual cortex and may be more generally applicable to this “classification by mapping” scheme. But that, and some questions and observations about categories and the collapse of phase space, can wait for next time!
EDIT: Just a little footnote on veracity: I like to be inspired by biology but this doesn’t mean I follow it slavishly. So if I assert that perhaps the cortex acts like a series of overlaid maps, I’ll have done so because it’s plausible and there’s some supportive evidence. But please remember that this is an engineering project – I’m not saying the cortex DOES work like this; only that it’s reasonably consistent with the facts and provides a useful hunch for designing an artificial brain. It’s a way of inventing, not discovering. So sometimes I say cortex and mean the real thing, and sometimes I’m talking about my hypothetical engineered one. I ought to use inverted commas really, but I hope you’ll infer the distinction.