When we interact with our world – whether through a simple grasp, a smile, or the utterance of a sentence – we are typically quite confident that it was us who intended and executed the interaction. But what is this “us”? Is it something physical or something mental? Is it merely a deterministic program, or is there more to it? And how does it develop – that is, how does the mind come into being?
Clearly, there is no straightforward answer to these questions. Over the last twenty years or so, however, cognitive science has offered a pathway towards one by pursuing an “embodied” perspective. This perspective emphasizes that our mind develops in our brain in close interaction with our body and the outside environment. Moreover, it emphasizes that we do not passively observe our environment – like the prisoners in Plato’s “Allegory of the Cave” – but actively interact with it in a goal-directed manner.
Even with such a perspective, though, challenging questions remain to be answered. How do we learn to control our body? How do we learn to reason and plan? How do we abstract from and generalize over our continuously incoming sensorimotor experiences and develop conceptual thoughts?
It appears that key computational and functional principles are involved in mastering these immense cognitive challenges. First of all, bodily features can strongly ease the cognitive burden by offering embodied attractors, such as when sitting, walking, holding something, or uttering a certain sound. Reward-oriented learning then helps to reach and maintain the stability of these attractors. Think of the joy expressed by a baby who has just attained a novel stable state for the first time – such as successfully taking its first steps!
However, our minds can do even more than express reactive, reward-oriented behavior. Clearly, we can think ahead and act in anticipation of the expected action effects. Indeed, research evidence suggests that an “image” of the action effects is present while we interact with the world, and that this image focuses on the final effect of the action – such as holding an object in a certain way after grasping it. Predictions and anticipations are thus key components of our minds. Combined with the principle of stability, desired future stable states can be imagined, their realization can be accomplished by goal-oriented motor control, and the actual accomplishment can trigger reward.
How are these principles spelt out in our brains? When considering the brain’s neural structures and their sensory and motor connectivity, it becomes possible to sketch a pathway towards abstract thought. Hierarchical, probabilistic, predictive encodings have been identified, which can be mimicked by computational processes. When multiple partially redundant, partially complementary sensory information sources are allowed to interact, progressively more abstract structures can develop. These include:
- Spatial encodings – such as the space around us as well as cognitive maps of the environment;
- Gestalt encodings of entities – including objects, tools, animals, and other humans;
- Temporal encodings – estimating how things typically change over time.
These encodings are temporarily or permanently associated with particular reward encodings, depending on the experiences.
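The interplay of top-down predictions and bottom-up prediction errors underlying such hierarchical encodings can be sketched computationally. The following is a deliberately minimal, hypothetical illustration – a chain of latent units in which each level predicts the activity of the level below and is nudged to reduce the resulting prediction error; it is a toy sketch, not a model of any specific brain circuit:

```python
def predictive_coding_step(x, latents, weights, lr=0.1):
    """One inference step in a tiny hierarchical predictive-coding chain.

    Level 0 predicts the sensory input x; each higher level predicts the
    latent activity of the level below it.  Latents are adjusted by a small
    step (lr) in the direction that reduces their prediction error.
    """
    errors = []
    below = x  # the signal that the current level must predict
    for k in range(len(latents)):
        pred = weights[k] * latents[k]       # top-down prediction of the level below
        err = below - pred                   # bottom-up prediction error
        errors.append(err)
        latents[k] += lr * weights[k] * err  # nudge the latent to reduce the error
        below = latents[k]                   # the next level predicts this latent
    return errors
```

Iterating this step drives the prediction errors toward zero, so that the latent states at every level come to mirror – and thereby encode – the structure of the incoming signal.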
When reconsidering behavior in this scenario, it becomes very clear that hierarchical structures are also necessary for realizing adaptive and flexible behavior. To be able to effectively plan behavioral sequences in a goal-directed manner, such as drinking from a glass, it is necessary to abstract away from the continuous sensorimotor experiences. Luckily, our continuous world can be segmented in an event-oriented manner. An event-boundary, such as the touch of an object after approaching it, or the perception of a sound wave after a silence, indicates interaction onsets and offsets. Computational measures of surprise can be good indicators to detect such boundaries and segment our continuous experiences.
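To make surprise-driven segmentation concrete, here is a minimal sketch. It treats “surprise” simply as the z-score of a new observation under a running Gaussian model of the signal so far, maintained with Welford’s online algorithm – one of many possible surprise measures, chosen here purely for illustration:

```python
import math

def detect_event_boundaries(signal, threshold=3.0, eps=1e-6):
    """Flag indices where the incoming value deviates from a running
    Gaussian model of the signal by more than `threshold` standard deviations."""
    boundaries = []
    n, mean, m2 = 0, 0.0, 0.0  # Welford accumulators for mean and variance
    for i, x in enumerate(signal):
        if n >= 2:
            std = math.sqrt(m2 / n)
            surprise = abs(x - mean) / (std + eps)  # z-score as a surprise proxy
            if surprise > threshold:
                boundaries.append(i)
        # online update of the running mean and variance (Welford's algorithm)
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return boundaries
```

Applied to, say, a touch sensor whose reading jumps when a hand contacts an object, the jump registers as a large z-score and is flagged as an event boundary – an interaction onset segmenting the continuous stream of experience.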
But what about language? From the perspective of a linguist, the Holy Grail lies in the semantics. Although, apart from our brains, we do not yet have a system able to build a model of the world’s semantics, computational principles and functional considerations suggest that world semantics may be learned from sensorimotor experiences. Moreover, language seems to fit perfectly onto the learned semantic structures. All this learning happens in a social and cultural context, within which we experience ourselves and others – particularly during social interactions, including conversations. In addition, tool use opens up yet another perspective: we can view ourselves as tools, serving a purpose.
Putting the pieces together, we are advancing toward artificial cognitive systems that learn to truly understand the world we live in – essentially demystifying the puzzle of our “selves” by computational means. Conversely, a functional and computational perspective on our “selves” suggests that we are intentional, anticipatory beings, embodied in a socially predisposed, understanding-oriented brain-body complex, choosing goals and behaviors in our own best interest – in search of our place in the social and cultural realities we live in.
Featured image credit: Barefoot child by ZaydaC. CC0 public domain via Pixabay.