Encoding of 3D physical dimensions by face-selective cortical neurons
Authors:Amit P. Khandhadia, Aidan P. Murphy, Kenji W. Koyano, Elena M. Esch, David A. Leopold
Affiliation:(a) Section on Cognitive Neurophysiology and Imaging, Laboratory of Neuropsychology, National Institute of Mental Health, NIH, Bethesda, MD 20892; (b) Neurophysiology Imaging Facility, National Institute of Mental Health, National Institute of Neurological Disorders and Stroke, National Eye Institute, NIH, Bethesda, MD 20892; (c) The University of Colorado Anschutz Medical Campus, Medical Scientist Training Program, Aurora, CO 80045
Abstract:Neurons throughout the primate inferior temporal (IT) cortex respond selectively to visual images of faces and other complex objects. The response magnitude of neurons to a given image often depends on the size at which the image is presented, usually on a flat display at a fixed distance. While such size sensitivity might simply reflect the angular subtense of retinal image stimulation in degrees, one unexplored possibility is that it tracks the real-world geometry of physical objects, such as their size and distance to the observer in centimeters. This distinction bears fundamentally on the nature of object representation in IT and on the scope of visual operations supported by the ventral visual pathway. To address this question, we assessed the response dependency of neurons in the macaque anterior fundus (AF) face patch to the angular versus physical size of faces. We employed a macaque avatar to stereoscopically render three-dimensional (3D) photorealistic faces at multiple sizes and distances, including a subset of size/distance combinations designed to cast the same size retinal image projection. We found that most AF neurons were modulated principally by the 3D physical size of the face rather than its two-dimensional (2D) angular size on the retina. Further, most neurons responded strongest to extremely large and small faces, rather than to those of normal size. Together, these findings reveal a graded encoding of physical size among face patch neurons, providing evidence that category-selective regions of the primate ventral visual pathway participate in a geometric analysis of real-world objects.

We experience the world in three-dimensional (3D) space, perceiving and interacting with objects and individuals in a scene. For humans and other primates, much of this experience is served by vision, with broad stretches of the cerebral cortex ostensibly devoted to making visual sense of the world. For example, individual neurons throughout the inferior temporal (IT) cortex of the macaque respond selectively to meaningful objects, with neurons of similar response properties often aggregated in functional clusters (1–3). One striking finding about the object selectivity of IT neurons is its tolerance to natural image transformations, such as scaling and translation (4–12). Namely, if stimuli are ranked based on the responses they elicit from a given neuron, this ranking often remains unchanged when stimuli are translated on the screen or scaled up or down several-fold in size. Scale tolerance in object selectivity is thought to reflect the capacity of the brain to compute a conceptual or abstracted representation of the retinal image separate from its metric details. While the mechanism underlying this apparently intrinsic feature of ventral stream visual processing is poorly understood, it is thought to be critical for image-based object recognition (4, 13–17).

At the same time, scaling an image up or down can greatly change the responses of IT neurons to stimuli, even as the rank-order selectivity to stimuli is preserved (18–20). This size-dependent rate modulation is poorly understood and seldom considered explicitly. One relatively unexplored possibility is that some IT neurons encode the physical dimensions of objects, in addition to their shape and their featural and semantic properties. The explicit coding of parameters such as absolute object size and distance from the observer might facilitate visual operations concerned with the perception of scene geometry and interaction with the local environment.
Additionally, the brain may benefit by retaining internal metric information about the typical sizes of objects (21, 22), as this information could be applied to subsequent perceptual judgments about objects and individuals in the context of natural visual behaviors (23).

The visual encoding of 3D space is usually associated with parietal cortex in the dorsal visual pathway, where coordinate transformations are thought to convert retinal signals to 3D information about objects and the environment that can be used to guide effector actions (24). However, a few studies have demonstrated that neurons in the ventral pathway also exhibit signals related to 3D spatial perception. For example, in area V4, neural responses to a given retinal image are modulated by the physical distance at which that image is presented (25) as well as by volumetric 3D shape parameters (26). At later ventral pathway processing stages, the superior temporal sulcus (STS) is marked by selectivity to 3D object shape, potentially reflecting its interplay with intraparietal areas concerned with 3D visual geometry (27–32). While these findings demonstrate that 3D information influences responses across the ventral visual pathway, little is known about whether these areas explicitly encode the physical dimensions of objects, such as their size or distance from the observer.

Here we explicitly investigate how a population of category-selective neurons in macaque IT encodes the physical dimensions of objects. We recorded from the anterior fundus (AF) face patch (33), a well-studied face-selective region of the STS where neurons are known to be both selective for faces and sensitive to their spatial scale (34). We asked whether such scale sensitivity primarily reflects the 2D image of a face on the retina or the 3D physical geometry of the face and head in the real world.
In most visual electrophysiology experiments, the retinal and physical geometry of an image are yoked: scaling an image on a display alters both its physical size and its retinal subtense. Moreover, absent other explicit depth cues, an image has ambiguous depth and thus cannot be uniquely mapped to the 3D world. In the present study, we used a recently developed macaque avatar model (35) to stereoscopically render photorealistic 3D faces of unambiguous physical size and distance. We found that the size sensitivity of most AF neurons was dictated primarily by the physical dimensions of a face rather than by its angular subtense on the retina. We further discovered that neural responses were strongest to extreme-sized faces rather than normal-sized faces, a result that runs contrary to intuition but is consistent with ideas of predictive coding. We discuss how object-selective IT neurons might contribute to important and conserved elements of natural visual behavior through their encoding of real-world geometric parameters.
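
The distinction between angular and physical size can be made concrete with the standard visual-angle relation, in which an object of size s viewed frontally at distance d subtends 2·atan(s / 2d). The following is a minimal sketch of that geometry only, not the authors' stimulus code; the function names and the numerical values are hypothetical and chosen purely for illustration.

```python
import math


def visual_angle_deg(physical_size_cm: float, distance_cm: float) -> float:
    """Angular subtense in degrees of an object viewed frontally.

    Uses the standard relation: angle = 2 * atan(size / (2 * distance)).
    """
    return math.degrees(2.0 * math.atan(physical_size_cm / (2.0 * distance_cm)))


def size_for_angle_cm(angle_deg: float, distance_cm: float) -> float:
    """Physical size in cm that subtends a given angle at a given distance."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


# Hypothetical values (NOT the study's stimulus parameters):
# a small face close to the observer and a face twice as large, twice as far.
near = visual_angle_deg(physical_size_cm=10.0, distance_cm=50.0)
far = visual_angle_deg(physical_size_cm=20.0, distance_cm=100.0)

# Both cast the same retinal angle even though their physical sizes differ.
print(f"near: {near:.2f} deg, far: {far:.2f} deg")

# On a flat display at a fixed distance, size and angle are yoked:
# changing the angle necessarily changes the physical size as well.
print(f"size at 100 cm for the same angle: {size_for_angle_cm(near, 100.0):.1f} cm")
```

Size/distance pairs with the same ratio give identical angular subtense, so a neuron driven only by retinal image size should respond to them equally, whereas a neuron tracking physical size should not.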
Keywords:visual objects; inferior temporal cortex; macaque; naturalistic behavior; face processing