Are we too “smart” to understand how we see?

Understanding Vision: Theory, Models, and Data

Buy Now

By Li Zhaoping
August 21^st 2014

About half a century ago, an MIT professor set up a summer project for students to write a computer programme that can “see” or interpret objects in photographs. Why not! After all, seeing must be some smart manipulation of image data that can be implemented in an algorithm, and so should be a good practice for smart students. Decades passed, we still have not fully reached the aim of that summer student project, and a worldwide computer vision community has been born.

We think of being “smart” as including the intellectual ability to do advanced mathematics, complex computer programming, and similar feats. It was shocking to realise that this is often insufficient for recognising objects such as those in the following image.

Fig 5.51 from Li Zhaoping, Understanding Vision: Theory Models, and Data — Fig 5.51 from Li Zhaoping, *Understanding Vision: Theory Models, and Data*

Can you devise a computer code to “see” the apple from the black-and-white pixel values? A pre-school child could of course see the apple easily with her brain (using her eyes as cameras), despite lacking advanced maths or programming skills. It turns out that one of the most difficult issues is a chicken-and-egg problem: to see the apple it helps to first pick out the image pixels for this apple, and to pick out these pixels it helps to see the apple first.

A more recent shocking discovery about vision in our brain is that we are blind to almost everything in front of us. “What? I see things crystal-clearly in front of my eyes!” you may protest. However, can you quickly tell the difference between the following two images?

Inattentional blindness — Alyssa Dayan, 2013 Fig. 1.6 from Li Zhaoping
*Understanding Vision: Theory Models, and Data. Used with permission*

It takes most people more than several seconds to see the (big) difference – but why so long? Our brain gives us the impression that we “have seen everything clearly”, and this impression is consistent with our ignorance of what we do not see. This makes us blind to our own blindness! How we survive in our world given our near-blindness is a long, and as yet incomplete, story, with a cast including powerful mechanisms of attention.

Being “smart” also includes the ability to use our conscious brain to reason and make logical deductions, using familiar rules and past experience. But what if most brain mechanisms for vision are subconscious and do not follow the rules or conform to the experience known to our conscious parts of the brain? Indeed, in humans, most of the brain areas responsible for visual processing are among the furthest from the frontal brain areas most responsible for our conscious thoughts and reasoning. No wonder the two examples above are so counter-intuitive! This explains why the most obvious near-blindness was discovered only a decade ago despite centuries of scientific investigation of vision.

cularSingleton_GazeAttraction — Fig 5.9 from Li Zhaoping, *Understanding Vision: Theory Models, and Data*

Another counter-intuitive finding, discovered only six years ago, is that our attention or gaze can be attracted by something we are blind to. In our experience, only objects that appear highly distinctive from their surroundings attract our gaze automatically. For example, a lone-red flower in a field of green leaves does so, except if we are colour-blind. Our impression that gaze capture occurs only to highly distinctive features turns out to be wrong. In the following figure, a viewer perceives an image which is a superposition of two images, one shown to each of the two eyes using the equivalent of spectacles for watching 3D movies.

To the viewer, it is as if the perceived image (containing only the bars but not the arrows) is shown simultaneously to both eyes. The uniquely tilted bar appears most distinctive from the background. In contrast, the ocular singleton appears identical to all the other background bars, i.e. we are blind to its distinctiveness. Nevertheless, the ocular singleton often attracts attention more strongly than the orientation singleton (so that the first gaze shift is more frequently directed to the ocular rather than the orientation singleton) even when the viewer is told to find the latter as soon as possible and ignore all distractions. This is as if this ocular singleton is uniquely coloured and distracting like the lone-red flower in a green field, except that we are “colour-blind” to it. Many vision scientists find this hard to believe without experiencing it themselves.

Are these counter-intuitive visual phenomena too alien to our “smart”, intuitive, and conscious brain to comprehend? In studying vision, are we like Earthlings trying to comprehend Martians? Landing on Mars rather than glimpsing it from afar can help the Earthlings. However, are the conscious parts of our brain too “smart” and too partial to “dumb” down suitably to the less conscious parts of our brain? Are we ill-equipped to understand vision because we are such “smart” visual animals possessing too many conscious pre-conceptions about vision? (At least we will be impartial in studying, say, electric sensing in electric fish.) Being aware of our difficulties is the first step to overcoming them – then we can truly be smart rather than smarting at our incompetence.

Li Zhaoping obtained her Ph.D. in physics from California Institute of Technology in 1989. She is currently a professor in computational neuroscience in the department of computer science in University College London, where she and her colleagues founded the Gatsby Computational Neuroscience Unit in 1998. Li Zhaoping is author of Understanding Vision: Theory, Models, and Data

Understanding Vision: Theory, Models, and Data

Related posts:

Recent Comments