Animal Vision vs Machine Vision: Understanding How Nature Sees the World
Author
Ganesh Venkatraman

Animal View Through Machine Vision

At its core, machine vision uses vast amounts of data to distinguish objects within images, identify their defining attributes, and extract the features that uniquely characterize each object. During training, attributes captured from many perspectives are associated with each object class. Features extracted from objects in new images are then matched against this set of classified images and identified by their degree of fit, expressed as a percentage match. Whether the target is facial expressions, tumor characteristics, material defects, or zebra crossings, feature extraction and object classification are guided by what needs to be identified, and the identified objects then determine the subsequent course of action.
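The match-by-degree-of-fit step described above can be sketched as a toy classifier: one stored feature vector per known class, with a new image's extracted features compared by cosine similarity and reported as a percentage match. The class names and feature values below are invented for illustration; real systems learn such vectors from many labeled images.

```python
import math

# Toy "trained" feature vectors: one reference vector per object class.
# All class names and values here are invented placeholders.
class_features = {
    "zebra_crossing": [0.9, 0.1, 0.8, 0.2],
    "tumor_region":   [0.2, 0.9, 0.3, 0.7],
    "surface_defect": [0.5, 0.4, 0.9, 0.1],
}

def cosine_similarity(a, b):
    """Degree of fit between two feature vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def classify(features):
    """Return (best_matching_class, percentage_match) for an extracted feature vector."""
    scores = {name: cosine_similarity(features, ref)
              for name, ref in class_features.items()}
    best = max(scores, key=scores.get)
    return best, round(100 * scores[best], 1)

label, match = classify([0.85, 0.15, 0.75, 0.25])
print(f"{label}: {match}% match")
```

In a production pipeline the feature vectors would come from a learned model rather than hand-set numbers, but the matching logic follows the same pattern.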

Natural Vision vs. Machine Vision

Natural vision operates differently from machine vision. The human brain, for instance, is constantly bombarded with a vast amount of sensory information, including images, yet we typically remember very little from this continuous stream of visual impressions. The brain retains a visual impression for just 1/30th of a second before moving to the next, creating a seamless, coherent image of our surroundings. In specific situations, however, the brain retains objects in vivid detail, as when a witness provides a precise description to a sketch artist after a crime. These images are processed and analyzed in a complex hierarchy of cortical and subcortical regions. The way the brain forms visual memories from incidents, whether traumatic or pleasurable, and separates them with boundaries and timelines, has implications for memory disorders such as Alzheimer's disease.

Various aspects of an image are processed in distinct parts of the brain in bits and pieces: the object's outline, orientation, color, depth, motion, shape, and recognition each pass through multiple layers of processing. Despite this complexity, the brain retains a series of images in a continuum with no perceptible latency, forming a coherent picture. The intriguing question is how the brain extracts details of the image, integrates these bits and pieces of information, and collates them with the overall background to form a perfectly coherent, continuous, and extraordinarily stable picture. It is akin to reassembling the fragments of a broken designer vase back into its original, seamless state.

Understanding the Binding Problem

There are two schools of thought regarding how the brain processes visual information. One theory suggests that the brain extracts basic features such as orientation, depth, and color, passes them through mid-level processing, and then into higher-level processing for category and object identification. This process raises the question: how does the brain synthesize attributes like orientation, color, motion, and shape, belonging to numerous objects in the line of vision? This synthesis is akin to a convolutional network in reverse. Instead of feature extraction and segmentation, the brain performs feature aggregation and integration to recreate the observed image.
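One way to picture this "feature aggregation" idea is binding separately computed attribute maps back into whole objects by shared location. The attribute maps, locations, and values below are invented placeholders for illustration, not a model of actual cortical processing.

```python
# Separately processed attribute maps, keyed by image location, standing in
# for the distinct brain regions that handle orientation, color, and motion.
# All locations and attribute values are invented for illustration.
orientation_map = {(2, 3): "vertical", (7, 1): "horizontal"}
color_map       = {(2, 3): "red",      (7, 1): "yellow"}
motion_map      = {(2, 3): "static",   (7, 1): "moving-left"}

def bind_features(*attribute_maps):
    """Aggregate per-attribute maps into unified per-location object descriptions."""
    bound = {}
    for attr_map in attribute_maps:
        for location, value in attr_map.items():
            bound.setdefault(location, []).append(value)
    return bound

objects = bind_features(orientation_map, color_map, motion_map)
for location, attributes in sorted(objects.items()):
    print(location, attributes)
```

The sketch inverts the usual machine-vision direction: instead of decomposing an image into features, it reassembles distributed features into unified objects, which is the essence of the integration step the theory describes.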

Another school of thought postulates that the brain treats the entire image as a single object composed of numerous sub-objects. This approach suggests that the brain can observe an image holistically without any feature extraction, segmentation, or identification, until something triggers it to focus on a specific area, making the image properties relevant. This addresses the binding problem, which is how the brain combines various sensory inputs to form a unified perception of an object. Despite the distributed processing of different attributes in separate brain regions, the brain seamlessly integrates these features into an extraordinary visual experience.

Animal Vision Insights

When discussing vision, we typically refer to the visual spectrum of the electromagnetic wave. However, different species have unique adaptations:

  • Bees and reindeer can perceive the ultraviolet spectrum.
  • Nocturnal animals have adaptations for night vision.
  • Bats form images determining size, shape, and texture through ultrasound echolocation.
  • Dogs, bears, and mice possess an extraordinary sense of smell.
  • Sharks detect distinct electrical signals from their surroundings.

Overall, the species surrounding us have extraordinary senses for extracting features and isolating and detecting objects, which the brain processes to direct its motor functions.

The brain's focus mechanisms, driven by desires and avoidances, and its ability to process detailed image information through layers of memory, play crucial roles in behavior and perception.

Conclusion

Computer vision is just a sliver of what nature has evolved over billions of years. For example, the vision of jumping spiders offers fascinating insights into how these creatures perceive the world. What Jumping Spiders Teach Us About Color illustrates the complexity and sophistication of natural vision.

While computers can only process the visual aspects of an image, living beings perceive not only the image but also the object in its entirety, incorporating context and multiple sensory inputs.

Our understanding of nature’s vision, complemented by advancements in AI and machine vision, can propel us toward a better future, enhancing both technology and our appreciation of the natural world.
