Brainstorm 5: joining up the dots

I promised myself I’d blog about my thoughts, even if I don’t really have any and keep going round in circles. Partly I just want to document the creative process honestly – so this includes the inevitable days when things aren’t coming together – and partly it helps me if I try to explain things to people. So permit me to ramble incoherently for a while.

I’m trying to think about associations. In one sense the stuff I’ve already talked about is associative: a line segment is an association between a certain set of pixels. A cortical map that recognizes faces probably does so by associating facial features and their relative positions. I’m assuming that each of these things is then denoted by a specific point in space on the real estate of the brain – oriented lines in V1 and faces in the FFA. In both these cases there are several features at one level, which are associated and brought together at a higher level. A bunch of dots maketh one line. Two dark blobs and a line in the right arrangement maketh a face. A common assumption (which may not be true) is that neurons do this explicitly: the dendritic field of a visual neuron might synapse onto a particular pattern of LGN fibres carrying retinal pixel data. When this pattern of pixels becomes active, the neuron fires. That specific neuron – that point on the self-organizing map – therefore means “I can see a line at 45 degrees in this part of the visual field.”

But the brain also supports many other kinds of associative link. Seeing a fir tree makes me think of Christmas, for instance. So does smelling cooked turkey. Is there a neuron that represents Christmas, which synapses onto neurons representing fir trees and turkeys? Perhaps, perhaps not. There isn’t an obvious shift in levels of representation here.

Not only do turkeys make me think of Christmas, but Christmas makes me think of turkeys. That implies a bidirectional link. Such a thing may actually be a general feature, despite the unidirectional implication of the “line-detector neuron” hypothesis. If I imagine a line at 45 degrees, this isn’t just an abstract concept or symbol in my mind. I can actually see the line. I can trace it with my finger. If I imagine a fir tree I can see that too. So in all likelihood, the entire abstraction process is bidirectional and thus features can be reconstructed top-down, as well as percepts being constructed/recognized bottom-up.

But even so, loose associations like “red reminds me of danger” don’t sound like the same sort of association as “these dots form a line”. A line has a name – it’s a 45-degree line at position x,y – but what would you call the concept that red reminds me of danger? It’s just an association, not a thing. There’s no higher-level concept for which “red” and “danger” are its characteristic features. It’s just a nameless fact.

How about a melody? I know hundreds of tunes, and the interesting thing is, they’re all made from the same set of notes. The features aren’t what define a melody, it’s the temporal sequence of those features; how they’re associated through time. Certainly we can’t imagine there being a neuron that represents “Auld Lang Syne”, whose dendrites synapse onto our auditory cortex’s representations of the different pitches that are contained in the tune. The melody is a set of associations with a distinct sequence and a set of time intervals. If someone starts playing the tune and then stops in the middle I’ll be troubled, because I’m anticipating the next note and it fails to arrive. Come to that, there’s a piano piece by Rick Wakeman that ends in a glissando, and Wakeman doesn’t quite hit the last note. It drives me nuts, and yet how do I even know there should be another note? I’m inferring it from the structure. Interestingly, someone could play a phrase from the middle of “Auld Lang Syne” and I’d still be able to recognize it. Perhaps the tune is represented by many overlapping short pitch sequences? But if so, then this cluster of representations is collectively associated with its title and acts as a unified whole.
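To make the "many overlapping short pitch sequences" idea a little more concrete, here is a minimal sketch using an n-gram index. Everything in it is an assumption made for illustration: the note spellings are rough placeholders rather than faithful transcriptions, the window length of four notes is arbitrary, the function names are invented, and timing is ignored completely.

```python
from collections import defaultdict

# Placeholder note lists; not meant as accurate transcriptions.
TUNES = {
    "Auld Lang Syne": ["C", "F", "E", "F", "A", "G", "F", "G", "A", "F", "F", "A", "C5", "D5"],
    "Tune B": ["E", "E", "F", "G", "G", "F", "E", "D", "C", "C", "D", "E", "E", "D", "D"],
}

def ngrams(notes, n=4):
    """Return every overlapping run of n consecutive notes."""
    return [tuple(notes[i:i + n]) for i in range(len(notes) - n + 1)]

def build_index(tunes, n=4):
    """Associate each short pitch sequence with the titles that contain it."""
    index = defaultdict(set)
    for title, notes in tunes.items():
        for gram in ngrams(notes, n):
            index[gram].add(title)
    return index

def recognize(index, phrase, n=4):
    """Let each fragment of the phrase vote for a title; None if nothing matches."""
    votes = defaultdict(int)
    for gram in ngrams(phrase, n):
        for title in index.get(gram, ()):
            votes[title] += 1
    return max(votes, key=votes.get) if votes else None

def expected_next(notes, phrase):
    """Return the note that follows the phrase in the tune, or None if unknown."""
    k = len(phrase)
    for i in range(len(notes) - k):
        if notes[i:i + k] == phrase:
            return notes[i + k]
    return None
```

A phrase lifted from the middle of a tune still votes for the right title, and `expected_next` gives a crude version of the "missing note" feeling. What the sketch doesn't explain is how the cluster of fragments comes to behave as a unified whole; here that is simply a dictionary lookup.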

Thinking about anticipating the next note in a tune reminds me of my primary goal: a representation that’s capable of simulating the world by assembling predictions. State A usually leads to state B, so if I imagine state A, state B will come to mind next and I’ll have a sense of personal narrative. I’ll be able to plan, speculate, tell myself stories, relive a past event, relive it as if I’d said something wittier at the time, etc. Predictions are a kind of association too, but between what? A moving 45-degree line at one spot on the retina tends to lead to the sensation of a 45-degree line at another spot, shortly afterwards. That’s a predictive association and it’s easy to imagine how such a thing can become encoded in the brain. But turkeys don’t lead to Christmas. More general predictions arise out of situations, not objects. If you see a turkey and a butcher, and catch a glint in the butcher’s eye, then you can probably make a prediction, but what are the rules that are encoded here? What kind of representation are we dealing with?
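The simplest possible version of "state A usually leads to state B" is a table of transition counts. The sketch below is only that: the class name, the state labels and the most-frequent-follower rule are all assumptions for illustration, and it deliberately dodges the hard part, which is what counts as a "state" when predictions depend on whole situations rather than single objects.

```python
from collections import Counter, defaultdict

class Predictor:
    """Learns which state tends to follow which, by counting observed pairs."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def observe(self, sequence):
        """Strengthen the association between each state and its successor."""
        for a, b in zip(sequence, sequence[1:]):
            self.counts[a][b] += 1

    def predict(self, state):
        """Return the most frequently seen successor, or None if there is none."""
        followers = self.counts.get(state)
        if not followers:
            return None
        return followers.most_common(1)[0][0]

    def imagine(self, start, steps=5):
        """Chain predictions together into a short imagined narrative."""
        story = [start]
        for _ in range(steps):
            nxt = self.predict(story[-1])
            if nxt is None:
                break
            story.append(nxt)
        return story
```

Feeding it `["line at left", "line at centre", "line at right"]` a few times and then calling `imagine("line at left")` replays the sweep. The turkey-and-butcher case would need the "state" to be a combination of features, which is exactly the representational question being asked.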

“Going to the dentist hurts” is another kind of association. “I love that woman” is of a similar kind. These are affective associations and all the evidence shows that they’re very important, not only for the formation of memories (which form more quickly and thoroughly when there’s some emotional content), but also for the creation of goal-directed behavior. We tend to seek pleasure and avoid pain (and by the time we’re grown up, most of us can even withstand a little pain in the expectation of a future reward).

A plan is the predictive association of events and situations, leading from a known starting point to a desired goal, taking into account the reward and punishment (as defined by affective associations) along the route. So now we have two kinds of association that interact!
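Here is one hedged way to picture those two kinds of association interacting: predictive links form a graph, affective links put a cost on each state, and a plan is the least unpleasant route through the graph. The search used is ordinary uniform-cost search, which is my choice for the sketch and not a claim about how brains do it; the state names and discomfort values are invented, and treating affect as a single non-negative number is a big simplification.

```python
import heapq

# Hypothetical predictive associations: situation -> situations it can lead to.
TRANSITIONS = {
    "toothache": ["ignore it", "dentist"],
    "ignore it": ["worse toothache"],
    "worse toothache": ["dentist"],
    "dentist": ["no toothache"],
}

# Hypothetical affective associations: how unpleasant each situation is (>= 0).
DISCOMFORT = {"ignore it": 1, "worse toothache": 8, "dentist": 5, "no toothache": 0}

def plan(transitions, discomfort, start, goal):
    """Return (total discomfort, route) for the least unpleasant route, or None."""
    frontier = [(0, start, [start])]
    best = {}  # lowest cost at which each state has already been expanded
    while frontier:
        cost, state, path = heapq.heappop(frontier)
        if state == goal:
            return cost, path
        if state in best and best[state] <= cost:
            continue
        best[state] = cost
        for nxt in transitions.get(state, ()):
            step = discomfort.get(nxt, 0)
            heapq.heappush(frontier, (cost + step, nxt, path + [nxt]))
    return None
```

With these made-up numbers the planner goes straight to the dentist, because putting it off only adds pain along the route. Withstanding a little pain for a later reward falls out of summing costs over the whole path rather than reacting to the next step alone.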

To some extent I can see that the meaning of an associative link is determined by what kind of thing it is linking. The links themselves may not be qualitatively different – it’s just the context. Affective associations link memories (often episodic ones) with the emotional centers of the brain (e.g. the amygdala). Objects can be linked to actions (a hammer is associated with a particular arm movement). Situations predict consequences. Cognitive maps link objects with their locations. Linguistic areas link objects, actions and emotions with nouns, verbs and adjectives/adverbs. But there do seem to be some questions about the nature of these links and to what extent they differ in terms of circuitry.

Then there’s the question of temporary associations. And deliberate associations. Remembering where I left my car keys is not the same as recording the fact that divorce is unpleasant. The latter is a semantic memory and the former is episodic, or at least declarative. Tomorrow I’ll put my car keys down somewhere else, and that will form a new association. The old one may still be there, in some vague sense, and I may one day develop a sense of where I usually leave my keys, but in general these associations are transient (and all too easily forgotten).

Binding is a form of temporary association. That ball is green; there’s a person to my right; the cup is on the table.

And attention is closely connected with the formation or heightening of associations. For instance, in Creatures I had a concept called “IT”. “IT” was the object currently being attended to, so if a norn shifted its attention, “IT” would change, and if the norn decided to “pick IT up”, the verb knew which noun to apply to. In a more sophisticated artificial brain, this idea has to be more comprehensive. We may need two or more ITs, to form the subject and object of an action. We need to remember where IT is, in various coordinate frames, so that we can reach out and grab IT or look towards IT or run away from IT. We need to know how big IT is, what color IT is, who IT belongs to, etc. These are all associations.
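As a rough illustration of "two or more ITs", the sketch below keeps a small set of named attention slots, each pointing at one thing with a position and some properties. This is not how Creatures implemented "IT"; the class names, the slot names "subject" and "object", and the 2D coordinates are all assumptions made up for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Thing:
    name: str
    position: tuple                                   # (x, y) in world coordinates
    properties: dict = field(default_factory=dict)    # e.g. color, size, owner

class Attention:
    """Named attention slots, each temporarily bound to one Thing."""

    def __init__(self):
        self.slots = {}

    def attend(self, role, thing):
        """Shift attention: rebind the given slot to a new thing."""
        self.slots[role] = thing

    def it(self, role="object"):
        """Return whatever the slot currently refers to, or None."""
        return self.slots.get(role)

    def offset_from(self, role, origin):
        """Where IT is relative to some other frame's origin (e.g. the hand)."""
        thing = self.slots[role]
        return (thing.position[0] - origin[0], thing.position[1] - origin[1])

def describe_action(attention, verb):
    """Apply a verb to whatever currently fills the subject and object slots."""
    subject, obj = attention.it("subject"), attention.it("object")
    return f"{subject.name} {verb} {obj.name}"
```

The verb never names its noun directly; it just takes whatever the slots hold at that moment, so shifting attention changes what "pick IT up" means.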

Perhaps there are large-scale functional associations, too. In other words, data from one space can be associated with another space temporarily to perform some function. What came to mind that made me think of this is the possibility that we have specialized cortical machinery for rotating images, perhaps developed for a specific purpose, and yet I can choose, any time I like, to rotate an image of a car, or a cat, or my apartment. If I imagine my apartment from above, I’m using some kind of machinery to manipulate a particular set of data points (after all, I’ve never seen my apartment from above, so this isn’t memory). Now I’m imagining my own body from above – I surely can’t have another machine for rotating bodies, so somehow I’m routing information about the layout of my apartment or the shape of my body through to a piece of machinery (which, incidentally, is likely to be cortical and hence will have self-organized using the same rules that created the representation of my apartment and the ability to type these words). Routing signals from one place to another is another kind of association.

Language is interesting (I realize that’s a bit of an understatement!). I don’t believe the Chomskyan idea that grammar is hard-wired into the brain. I think that’s missing the point. I prefer the perspective that the brain is wired to think, and grammar is a reflection of how the brain thinks. [noun][verb][noun] seems to be a fundamental component of thought. “Janet likes John.” “John is a boy.” “John pokes Janet with a stick.” Objects are associated with each other via actions, and both the objects and actions can be modulated (linguistically, adverbs modulate actions; adjectives modify or specify objects). At some level all thought has this structure, and language just reflects that (and allows us to transfer thoughts from one brain to another). But the level at which this happens can be very far removed from that of discrete symbols and simple associations. Many predictions can be couched in linguistic terms: IF [he] [is threatening] [me] AND [I][run away from][him] THEN [I][will be][safe]. IF [I][am approaching][an obstacle]AND NOT ([I][turn]) THEN [I][hurt]. But other predictions are much more fluid and continuous: In my head I’m imagining water flowing over a waterfall, turning a waterwheel, which turns a shaft, which grinds flour between two millstones. I can see this happening – it’s not just a symbolic statement. I can feel the forces; I can hear the sound; I can imagine what will happen if the water flow gets too strong and the shaft snaps. Symbolic representations and simple linear associations won’t cut it to encode such predictive power. I have a real model of the laws of physics in my head, and can apply it to objects I’ve never even seen before, then imagine consequences that are accurate, visual and dynamic. So at one level, grammar is a good model for many kinds of association, including predictive associations, but at another it’s not. 
Are these the same processes – the same basic mechanism – just operating at different levels of abstraction, or are they different mechanisms?

These predictions are conditional. In the linguistic examples above, there’s always an IF and a set of conditionals. In the more fluid example of the imaginary waterfall, there are mathematical functions being expressed, and since a function has dependent variables, this is a conditional concept too. High-level motor actions are also conditional: walking consists of a sequence of associations between primitive actions, modulated by feedback and linked by conditional constructs such as “do until” or “do while”.
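The linguistic kind of prediction is easy enough to sketch, which is partly why it's tempting. Below, each fact is a [noun][verb][noun] triple and each rule lists the triples that must be present, the triples that must be absent, and the triple it predicts. The `Rule` type, the use of `None` for a missing object and the matching scheme are assumptions for illustration only; nothing like the waterfall example could be captured this way.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    required: frozenset    # triples that must all hold (the IF ... AND ...)
    forbidden: frozenset   # triples that must not hold (the AND NOT ...)
    conclusion: tuple      # the predicted triple (the THEN ...)

RULES = [
    Rule(frozenset({("he", "is threatening", "me"), ("I", "run away from", "him")}),
         frozenset(),
         ("I", "will be", "safe")),
    Rule(frozenset({("I", "am approaching", "an obstacle")}),
         frozenset({("I", "turn", None)}),
         ("I", "hurt", None)),
]

def predict(rules, facts):
    """Return the conclusions of every rule whose conditions match the facts."""
    facts = set(facts)
    return [
        rule.conclusion
        for rule in rules
        if rule.required <= facts and not (rule.forbidden & facts)
    ]
```

Approaching an obstacle predicts getting hurt; adding the fact that I turn switches that prediction off again.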

So, associations can be formed and broken, switched on and off, made dependent on other associations, apply specifically or broadly, embody sequence and timing and probability, form categories and hierarchies or link things without implying a unifying concept. They can implement rules and laws as well as facts. They may or may not be commutative. They can be manipulated top-down or formed bottom-up… SOMEHOW all this needs to be incorporated into a coherent scheme. I don’t need to understand how the entire human brain works – I’m just trying to create a highly simplified animal-like brain for a computer game. But brains do some impressive things (nine-tenths of which most AI researchers and philosophers forget about when they’re coming up with new theories). I need to find a representation and a set of mechanisms for defining associations that have many of these properties, so that my creatures can imagine possible futures, plan their day, get from A to B and generalize from past experiences. So far I don’t have any great ideas for a coherent and elegant scheme, but at least I have a list of requirements, now.

I think the next thing to do is think more about the kinds of representation I need – how best to represent and compute things like where the creature is in space, what kind of situation it is in, what the properties of objects are, how actions are performed. Even though I’d like most of this to emerge spontaneously, I should at least second-guess it to see what we might be dealing with. If I lay out a map of the perceptual and motor world, maybe the links between points on this map (representing the various kinds of associations) will start to make sense.

Or I could go for a run. Yes, I like that thought better.

About stevegrand
I'm an independent AI and artificial life researcher, interested in oodles and oodles of things but especially the brain. And chocolate. I like chocolate too.

16 Responses to Brainstorm 5: joining up the dots

  1. Vegard says:


    Some of my thoughts while reading your post:

    Reciting the alphabet is a lot easier than reciting the alphabet in reverse. If it is true that each letter is associated with the next, then I believe we can infer that these associations are fundamentally one-way.

    Not sure if you’ll buy into this, but I believe that the same is true of songs — hearing the first 4 or 5 notes of a song should generally be enough to trigger the recognition of that song. Actually, it probably doesn’t have to be the very first notes either. Here’s the punchline: If you heard the _last_ 4 or 5 notes of a song in the reverse order (starting from the very last note), there’s probably no way you would recognise which song it was.

    Counting makes for a nice counter-example (no pun intended! For real!). Give me any (reasonably small) number and I’ll count up or down from that, no problem. But it could be that these are actual calculations and not so much associations.

    Speaking of numbers, I generally find it hard to think of a number without actually “pronouncing” it inside my head. In other words, I believe that my internal representation of numbers is tightly coupled to the sequence of muscle movements that occur when I think of/pronounce a number.

    I did a sort of informal experiment where I counted out loud in Norwegian (as fast as I could) from 1 to 100 and measured the time it took. It took approximately 39 seconds. Then I counted, again in Norwegian, but this time silently (no sound and no muscle movement). It took 39 seconds. This is perhaps not so surprising, I think we all know how to “speak” without moving or making a sound. Now comes the interesting part:

    I tried to force myself to not “say” the numbers silently, but “see” them with my eyes closed (so to picture their shape). This was a lot harder, and I spent an entire 3 minutes and 25 seconds to get through all the 100 numbers.

    For the record, counting out loud in English took me around 60 seconds, I assume it’s simply because I’m not used to speaking/counting in English (and probably not because Norwegian words for numbers are shorter). But the fascinating part here is that I can clearly count in two different languages, but counting efficiently for me really seems to require a language!

    Okay, that was a long detour. Now for something completely different: You talk about projections and rotations. I think that this is something that largely requires a conscious effort. Try reading upside-down writing, for example. I can’t do it unless I look at one letter at a time to decide what letter it really is, instead of recognising whole words (or maybe even several words) at a time.

    (Of course, you could probably train yourself to recognise words-at-a-time upside down, but the fact that such training is needed means that it is not something that we can do simply because we know how to read in the normal orientation).

    Another experiment. Oh, and this was probably inspired once upon a time by your left-handedness described so funnily in Creation: Life. I taught myself to write in reverse with the right hand (I’m right handed). I mean in reverse as if looking at the writing in a mirror. It’s surprisingly easy. Required almost no training at all for me at least. Try it. On the other hand (ugh, those pesky puns), trying to write either normal letters or reversed letters with the left hand was a total disaster and I still can’t do any of them.

    That’s probably enough for one post. Feel free to give a sign if you feel that my comments are adding more to the noise than to the signal, it can sometimes be hard to judge.

    • stevegrand says:

      Heh! No, that’s ALL signal. Thanks!

      I deduce from your sequencing experiments that bidirectional associations are easily possible but each direction has to be learned separately. That’s interesting. I wonder if that’s consistently true across all levels of association, not just symbols?

      Love the counting experiment! I’ve just tried and I can barely, if at all, count visually. I think I was still subliminally saying the words to myself. I guess the sequence is therefore a sequence of words, rather than a sequence of actual integers? But Ramachandran did some interesting work on number synesthesia that suggested a more visual representation of number. I’ll have to think about that. For the record, I can count from 1 to 100 in Norwegian in infinity seconds!

      Rotational invariance is an interesting thing. We have a lot more trouble recognizing letters and faces from unusual angles than we do other common objects. This may be that letters and faces are handled in a special way (as some think) or perhaps it’s just that we’re far more likely to have seen other objects from a wide range of angles. In support of this is some evidence that the part of our brain that recognizes body parts is not only adjacent to the part that recognizes faces, but is also more active when the body parts are in the bottom half of the visual field (which is where we more often see them, given that we tend to look at people’s faces). Evidence against would be the fact that I’m sure I’d have no difficulty recognizing, say, a lamp post from upside-down and I doubt I’ve ever seen one from that angle. There again, letters and faces don’t just need to be recognized as “a face” or “a letter” – we need to work out which letter or person. That requires more complex analysis – I don’t suppose I could differentiate between familiar and unfamiliar lamp posts when they’re upside-down either!

      It’s fascinating how easy it is to do mirror-writing. That’s a good point. It means the associations that make up the actions in writing can be reversed, almost without effort. How on earth does that work??? The flexibility the brain shows towards direction and scale is the biggest mystery of them all, to me.

      Thanks for the insights.

      • Vegard says:

        The original purpose of the counting experiment was to see if I could learn to count faster if I could recall a picture at a time rather than spelling out each number and spending as much time counting silently as I would if counting out loud.

        (Actually, the whole story is that I was learning how to juggle three balls, and counting the number of throws that I could make. Especially the numbers 30-39 are long and difficult in Norwegian: tretti-en, tretti-to, tretti-tre, the tongue twists on the t. It’s certainly the case in English too, for me at least. And if I couldn’t finish saying or thinking the number before the next throw, well, then the balls would just fall to the floor. Of course, now I just use a simpler method of just counting modulo 10 and saying 10, 20, 30, 40, etc. instead of 0.)

        You said you had problems counting visually too. You could try this: Say “This is a number” (or just “number”) for each number that you “see”. Of course this doesn’t help you count faster, but using this method I can actually see the numbers without hearing them.

        As for mirror writing, I have one theory as to why it’s rather easy to do. Printing letters in reverse, you need exactly the same vertical (up/down) movements, while the horizontal (left/right) movement is almost a constant speed, now just in the opposite direction to the usual. I suppose this also explains why, given that I can’t write in the normal direction with my left hand, I can’t write in the reverse direction with it either. Joined letters are harder to reverse because they require more variation in the vertical movements than printed letters.

      • Frederic says:

        Thanks for sharing your thoughts, Steve. It’s really inspiring!

        About the bidirectional associations…
        I think it depends on HOW we learn an association, whether we build more unidirectional associations or bidirectional ones.
        If I memorize a poem, I won’t be able to easily “rotate” it and speak the words in reversed order.
        But when something really bad happens in a certain situation, I will (probably forever) think about this bad thing, when I’m in the same situation again AND I will think about the situation when something similar happens again. For instance, I was in the hospital for quite some time at the age of 4. There I got a medicine with a distinctive smell. Years (about 15) later, my sister had to use the same medicine, and I instantly was reminded of my own time in the hospital.

        When I think about it…I also can remember the situation in which I had to recite a certain poem.

        When I really learn something, it’s almost always an association between words and abstract concepts. Associations between feelings, situations and consequences are learned on the fly. I don’t have to think about it…and these associations are probably vitally important – in contrast to a poem, at least 🙂

        Maybe, the process of conscious learning is completely different from unconscious learning?

        Just my thoughts…hope you can extract something from it.

      • Parmeisan says:

        I made a comment about this yesterday and figured it just wasn’t showing up yet, but I think now it got swallowed. I don’t remember all of what I said, except this:

        * I find it easier to count without thinking the number if I concentrate on drawing the symbol in my mind instead of picturing it whole.
        * It is much easier for me to count without thinking the number if I am saying something else, from “A B C D etc” to “H H H H” – like the “this is a number” but it’s interesting that it seems like it’s really just any distraction at all. Or perhaps it’s that we are so used to saying something as we count?

        I had also noted that when I’m counting in the modulo-ten style (“1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 20”) I will get caught up by seven but not twenty. Is it because our brains have been trained to split up numbers by tens anyway? Or is it because of the way that seven has two equally-hard syllables and twenty doesn’t?

      • stevegrand says:

        Hey Laura, sorry the previous comment didn’t stick – I didn’t get an alert about it so it must have just got lost in the ether.

        I feel much the same way about seven, or “sem” as I have to say it if I don’t want to get tripped up. It feels to me like I deal with twenty better because my mind is focusing at that point – I have to think what to put there, whereas the 1 through 9 comes out automatically. I can feel a sense of readiness washing over me for the few digits before 20, 30, etc. and maybe that’s why I don’t trip at those points?

        I think your distinction between seeing and drawing is really interesting. I’d be interested to know if people differ on that, although the people who come to this blog are probably quite self-selecting in terms of how their minds work, so it wouldn’t be a fair experiment. I have this nagging feeling that an important part of how we recognize and understand curvature comes from converting the pixels we see into an internal motor action and feeling the acceleration profile it generates. I think it explains our preference for certain kinds of curve – the ones you kind of swoooosh through, as opposed to the ones you have to skid around. I wish I could think of a mechanism for this or experiments to explore it. It’s not necessarily connected with what you’re saying about numbers, but definitely drawing is a motor memory rather than a visual pattern memory. Interesting.

    • Vegard says:

      I just saw a very interesting video of Richard Feynman telling the story of his own counting/timing experiments.

      • stevegrand says:

        Brilliant! What a guy! I agree with him – I’m quite sure we’re all very different inside our heads. Whatever kind of representation Feynman had in his, it turned out to be a damn good one!

        Thanks for the link.

  2. mszola says:

    Just some information…

    noun verb noun is not universal. In Latin, for example, the verb often came LAST, for the very pragmatic reason of kind of keeping the listener’s attention – it gave the sentence more “punch”, as it were.

    This was possible because Latin is a highly inflected language, meaning that the role each word played in the sentence was easily decoded from its ending!

    So while in English, “The boy bit the dog” and “The dog bit the boy” have very different meanings, in Latin, the word order was just as likely to be “The boy the dog bit” and the words’ cases let you know who was doing the biting and who was being bitten.

    There is some evocative information about the use of paired words, which apparently is darned near universal when it comes to naming things, including ourselves:

    Anyhow… what I was sort of musing about to myself was that maybe rotational invariance isn’t what we think it is but rather a combination of things, which would explain why it’s been so difficult to figure out.

    What if in the initial visual processing stage, the object’s orientation doesn’t matter?

    I’m struggling to find words for this, bear with me.

    Take a face as an example. Two dark dots and a line in simple linear form. If you take that image and rotate it, there are still two dark dots and a line, if you see what I mean. If there’s an area in the brain that “lights up” when two dark dots and a line are in a specific relationship to each other, then likely it’s going to light up the brain no matter which way it’s oriented because the relationship between the three parts remains the same and it’s still two dots and a line no matter how you rotate it.
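    The claim that the relationship between the parts survives rotation can be checked with a small sketch. It reduces each feature to a single point and compares the sorted distances between every pair of points; the function names, the point coordinates and the choice of pairwise distance as "the relationship" are all assumptions made for illustration.

```python
import math
from itertools import combinations

# Two "eyes" and the centre of a "mouth", as bare (x, y) points.
FACE = [(-1.0, 1.0), (1.0, 1.0), (0.0, -1.0)]

def rotate(points, degrees):
    """Rotate every point anticlockwise about the origin."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(x * c - y * s, x * s + y * c) for x, y in points]

def signature(points):
    """Sorted distances between every pair of points; unchanged by rotation."""
    return sorted(math.dist(p, q) for p, q in combinations(points, 2))

def same_arrangement(a, b, tol=1e-9):
    """True if two point sets have matching pairwise-distance signatures."""
    sa, sb = signature(a), signature(b)
    return len(sa) == len(sb) and all(
        math.isclose(x, y, abs_tol=tol) for x, y in zip(sa, sb)
    )
```

    `same_arrangement(FACE, rotate(FACE, 137))` holds for any angle, while stretching the mouth further from the eyes breaks the match. The same signature is also blind to mirror images and to which way up the pattern is, so on its own it could not account for a preference for top-heavy images.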

    Maybe that sort of “primes” the mammalian brain to then apply other criteria to narrow down the identification?

    Now that I’ve said that, I found another cool (and short!) article that says that apparently newborns don’t prefer faces, they prefer top-heavy images.

    Turati2004.pdf

    I’m not saying that rotational invariance doesn’t exist or come into play, mind, I’m just saying that maybe on the base level, it’s just registering the shapes and only their relationship to each other matters, then that data gets further sifted.

    Sort of like the frog with its “big moving shape, hop away, small moving shape, stick out your tongue”. The rotation doesn’t matter, just the fact that there’s something moving does.

    I don’t know if that makes any sense, it’s late and I’m tired, but I hope you find the article interesting.

  3. Tym says:


    Invariance with scale and rotation.

    I’ve been modelling the cortex for donkey’s years and here’s a simple digital workaround for how I think the visual cortex achieves this at the junction between V1 and V2.

    This is gonna be difficult without diagrams but I’ll give it a shot. Firstly it’s how you map the machine’s view of the world. Don’t use a square matrix. Use a circular polar plot. So rather than, say, a 50×50 grid, start with a small circle with 50 points around its circumference. Outside that a larger circle of 50 points and so on up to, say, 50 circles, all aligned with point 0 on a common axis. Use these vector points to gather your edge and RGB data. The same vector/neurons can also be used for priming servo maps etc.

    Open the polar plot up to a 50×50 grid. This gives peripheral vision on one side and small fine detail from the fovea at the other. Now convert into temporal data. The easiest way is to slowly move all points toward the peripheral end of the view until the end of that frame. I replace the data coming across the plot in real time and do some pre-processing on it. Place your 50 neurons on the edge so they are receiving the constant stream of data. It’s just as easy for them to recognise temporal patterns at this location (and you need fewer. Bonus!).
    This helps negate both the scale and the rotational problem of the world view. An object-centered view at any scale or world rotation will arrive at these 50 neurons with a similar pattern every time.

    As you can imagine this gives several extra advantages, such as the time it takes for a recognised outline of known size to reach the peripheral border correlates to the distance from the camera, etc.
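    A minimal sketch of the first step only, the polar sampling, under several assumptions: the rings are evenly spaced (the spacing isn't specified above), the "image" is a smooth function of (x, y) instead of camera pixels, and the grid is much smaller than 50×50. The function names are invented. It shows one property of the mapping: turning the scene about the centre of view slides the sampled pattern along the spoke axis without otherwise changing it.

```python
import math

def polar_sample(image, rings=8, spokes=16, max_radius=1.0):
    """Sample image(x, y) on concentric circles; returns grid[ring][spoke]."""
    grid = []
    for r in range(1, rings + 1):
        radius = max_radius * r / rings
        row = []
        for s in range(spokes):
            angle = 2 * math.pi * s / spokes   # spoke 0 lies on the common axis
            row.append(image(radius * math.cos(angle), radius * math.sin(angle)))
        grid.append(row)
    return grid

def rotated(image, angle):
    """Return the same image turned anticlockwise by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return lambda x, y: image(x * c + y * s, -x * s + y * c)

def shift_spokes(grid, k):
    """Circularly shift every ring by k spoke positions."""
    return [row[-k:] + row[:-k] for row in grid]

def blob(x, y):
    """A hypothetical test image: one soft bright spot off-centre."""
    return math.exp(-((x - 0.5) ** 2 + (y - 0.2) ** 2) / 0.05)

def grids_match(a, b, tol=1e-9):
    """Compare two sampled grids value by value, allowing rounding error."""
    return all(
        math.isclose(u, v, abs_tol=tol)
        for row_a, row_b in zip(a, b)
        for u, v in zip(row_a, row_b)
    )
```

    Rotating the scene by three spoke steps gives the same grid as shifting the unrotated grid by three, so a detector that is indifferent to where along that axis a pattern starts would be indifferent to rotation. Whether streaming the data past a fixed row of neurons supplies that indifference is the part the sketch leaves out.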

    Hope that makes sense, been coding all day and my brain’s fried, lol.

    I’m thoroughly enjoying keeping up with your blog.

    • stevegrand says:

      Hi Tym,

      Ooh! That sounds quite similar to the method I devised for my Lucy robot – that was also hypothetically in V2 – except I had the data slide concentrically inwards. It produced a spatiotemporal pattern that was scale and rotation invariant (and I’d already reduced location invariance in “V1” by saccading to within an object’s boundary – as you seem to have done too). But there was a large amount of ambiguity because of the number of different images with the same resultant pattern, so I gave up. My robot could tell apples from bananas but that was about it. Your method sounds somewhat different (and biologically more authentic) and maybe you don’t have that problem. Interesting that we seem to have converged on the same approach, though! Do you have any longer description of it that I can read? I’ve not written mine up, apart from a loose description in my second book (“Growing up with Lucy”).

      – Steve

  4. John Harmon says:

    A very interesting discussion on associations, experience, and brain function — thanks Steve and everybody else!

    I have a few ideas on attention I would like to offer — I hope they are useful or at least interesting. (A lot of these ideas have already been put forth, but I would like to re-state them in order to provide a proper context for the discussion).

    A general observation on attention is that (1) there are different types of attention, and (2) each type of attention is goal-related in some way. One basic type of attention is when an agent purposefully focuses on an immediate perceptual experience. For example, if an agent (human, virtual human, animal, robot…) chooses to attend to a particular object in its visual field, it is because a goal of some kind is associated with the experience of that object. A moving object for example could be associated with a bodily threat, a food source, or a potential friend or mate. Therefore the appearance of an autonomous moving object would first trigger the goal “identify that object.” (The more general goal would be something like “identify unknown objects near one’s body that are capable of locomotion.”) As a result of the activation of this goal, the agent would orient itself toward the object, and the object’s visual characteristics would be emphasized within the agent’s overall experience.

    (Note that attention involves both an agent’s desired experience and the self actions, either motor action or thinking, that create that experience.)

    This moving object could also trigger goals that are associated not only with the agent’s current perceptual experience, but with its future planning. For example, the identification of this moving object as a “bodily threat” could trigger memories related to a “bodily threat” object, including effective hiding places or effective escape actions. In this case, the agent is focusing its attention not only on its current perceptual experience, but also on memories separate from immediate perceptual experience: specifically, predicted future scenarios (being chased) and desired responses (hiding, fleeing). The overall goal driving this hybrid perceptual/cognitive attention is something like “avoid bodily harm.”

    As has been pointed out, attention can also be completely disconnected from current experience, in which case thinking is occurring. Actually, I shouldn’t say completely disconnected. An agent capable of protecting itself would have a constant low level/low energy connection with its environment, in case an object appeared in its proximity that posed a threat. All animals have this connection: the sensory stream emanating from their sensory organs. (This is why if a sudden loud noise occurred, or an object fell from the sky and landed in front of you, it would grab your attention.) But as long as the analog signals from its environment said “everything is a-ok, no threats nearby” then the agent would be free to focus most of its attention on memories that are not associated with its current experience, and to combine those memories in new ways. The general goal driving cognitive attention is something like “combine memories in new ways, for the purpose of achieving one or more cognitive goals.”

    Finally, attention can be created bottom-up by an experience that triggers high-energy memories. For example, seeing a $100 bill on the ground would tend to create attention, because this perceptual object has associated with it the high-energy memories of being able to pay one’s bills, buy food, buy a new computer, etc. Attention in this case is triggered by an experience and not by a goal. However, without the agent’s prior goals that have driven its past attention and its acquisition of past experience, such as the experience of “paying my bills,” “eating,” and “working efficiently,” the $100 bill object would not have acquired those high-energy associations in the first place. Therefore even bottom-up attention is associated with goals, indirectly.

    Looking at the system of brain function more generally, there are 2 major circuits that are constantly active within an agent’s brain. The first is the top down circuit: motivation → goals → attention → selection of perceptual or cognitive experience to emphasize (and the motor actions that create this experience). The second is the bottom-up circuit: environmental stimuli (in the case of vision, electromagnetic radiation waves of various frequencies and intensities) → sensory organ signals → low level perceptual experience → higher level and more generalized memories/associations. Both circuits are operating continually, moment by moment, throughout a person’s day.

    In my view, whatever associations are activated with the highest level of energy (either bottom-up or top-down) will be the associations that are attended to, and experienced, most strongly. This relates to Steve’s concept of an agent selecting an “IT” to pay attention to. The more ITs that can be attended to simultaneously, the more multifaceted experience can be, the more learning (i.e. the formation of new associations) can occur, and the more goals can be achieved simultaneously as well.

    Just my two cents on attention, and on brain function in general… will try to be more brief in future posts 😉

  5. Rafael C.P. says:

    Hi, Steve! Just started to follow your RSS feeds and received this great post. I’m working with hierarchical perceptual representations, association and control through inverse models and reinforcement learning, and hope to apply it to a game in the future too! I knew about Creatures before knowing about you, but, unfortunately, I still haven’t played it.
    Did you read Jeff Hawkins’ “On Intelligence” book and/or his HTM works at Numenta? Maybe it can give you some more ideas about representation, associations and invariances.
    I believe that hierarchical temporal clustering algorithms can be very useful as an early step in perceptual processing in general, as well as using polar coordinates for vision as you and Tym already pointed out. Fourier-Mellin transform may be useful as well.
    Hope to have added at least 1 cent 🙂

    • stevegrand says:

      Hi Rafael! Thanks. Yes, Jeff sent me a copy of his book a few years ago (and I returned the favor) because he could see we were thinking along similar lines. I’m a bit out of touch with the HTM though – I got the feeling they’d decided to focus on this one aspect and ignore all the other aspects of intelligence, so we seemed to be going off in different directions again. I’d not heard of the Mellin Transform. I knew about 2D FFTs of course, but this extra step is very interesting. Whether it’s biologically plausible is another matter (that’s important to me) but I’ll take a closer look. Thanks – it’s all very helpful. Good luck with your own project!

      • Rafael C.P. says:

        Agreed, they’re focusing only on perception now (at least it’s a very complex perception processing, which is good).

        About the biological plausibility of FMT, I think it’s there, at least to some extent! You can see the log-polar transform as the very procedure you and Tym were discussing: polar coordinates with more detail near the center (fovea), which corresponds to the way our eyes are structured! And the FFT would be just what our neurons do when converting input signals into different spike frequencies! OK, but FMT uses 2 FFTs, one of which is applied before the log-polar transform… and that’s beyond my arguments hehe.
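
For readers who have not met the log-polar idea mentioned above, here is a minimal sketch of why it helps with invariance. It is not the full Fourier-Mellin transform, and it is not anyone's method from this thread. It only illustrates two standard facts: in (log r, theta) coordinates, scaling and rotating about the centre become plain shifts, and the magnitude of an FFT ignores circular shifts. The function names, sample points, and test signal are all made up for illustration.

```python
import numpy as np

def to_log_polar(points):
    """Map (x, y) points to (log r, theta) about the origin (the 'fovea')."""
    x, y = points[:, 0], points[:, 1]
    log_r = np.log(np.hypot(x, y))
    theta = np.arctan2(y, x)
    return log_r, theta

def scale_and_rotate(points, scale, angle):
    """Scale points by `scale` and rotate them by `angle` radians about the origin."""
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, -s], [s, c]])
    return scale * points @ rotation.T

# Hypothetical feature positions relative to the fixation point.
points = np.array([[1.0, 0.5], [2.0, -1.0], [-0.5, 1.5]])
scale, angle = 2.0, 0.3

log_r, theta = to_log_polar(points)
log_r2, theta2 = to_log_polar(scale_and_rotate(points, scale, angle))

# Scaling shifts every log-radius by the same amount, log(scale).
radial_shift = log_r2 - log_r
# Rotation shifts every angle by the same amount (wrapped into (-pi, pi]).
angular_shift = np.angle(np.exp(1j * (theta2 - theta)))

# The FFT magnitude of a signal is unchanged by a circular shift,
# which is what turns "shifted" into "invariant".
signal = np.array([0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0])
shifted = np.roll(signal, 3)
same_spectrum = np.allclose(np.abs(np.fft.fft(signal)),
                            np.abs(np.fft.fft(shifted)))
```

The angular axis really is circular, so the shift argument applies to it directly. The log-radius axis is not, so on a real image the invariance to scale is only approximate.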
