Unifying account of visual motion and position perception
Authors: Oh-Sang Kwon, Duje Tadin, and David C. Knill
Affiliations: (a) Center for Visual Science and Department of Brain and Cognitive Sciences, University of Rochester, Rochester, NY 14627; (b) Department of Ophthalmology, University of Rochester School of Medicine, Rochester, NY 14642
Abstract: Despite growing evidence for perceptual interactions between motion and position, no unifying framework exists to account for these two key features of our visual experience. We show that percepts of both object position and motion derive from a common object-tracking system—a system that optimally integrates sensory signals with a realistic model of motion dynamics, effectively inferring their generative causes. The object-tracking model provides an excellent fit to both position and motion judgments in simple stimuli. With no changes in model parameters, the same model also accounts for subjects’ novel illusory percepts in more complex moving stimuli. The resulting framework is characterized by a strong bidirectional coupling between position and motion estimates and provides a rational, unifying account of a number of motion and position phenomena that are currently thought to arise from independent mechanisms. These include motion-induced shifts in perceived position, perceptual slow-speed biases, the slowing of motion viewed in the visual periphery, and the well-known curveball illusion. These results reveal that motion perception cannot be isolated from position signals. Even in the simplest displays with no changes in object position, our perception is driven by the output of an object-tracking system that rationally infers different generative causes of motion signals. Taken together, we show that object tracking plays a fundamental role in perception of visual motion and position.

Research into the basic mechanisms of visual motion processing has largely focused on simple cases in which motion signals are fixed in space and constant over time (e.g., moving patterns presented in static windows) (1). Although this approach has resulted in considerable advances in our understanding of low-level motion mechanisms, it leaves open the question of how the brain integrates changing motion and position signals; when objects move in the world, motion generally co-occurs with changes in object position. The process of generating coherent estimates of object motion and position is known in the engineering and computer vision literature as “tracking” (e.g., as used by the Global Positioning System) (2). Conceptualizing motion and position perception in the broader context of object tracking suggests an alternative conceptual framework—one that we show provides a unifying account for a number of perceptual phenomena.

An optimal tracking system would integrate incoming position and motion signals with predictive information from the recent past to continuously update perceptual estimates of both an object’s position and its motion. Were such a system to underlie perception, position and motion should be perceptually coupled in predictable ways. Signatures of such a coupling appear in a number of known phenomena. On one hand, local motion signals can predictively bias position percepts (3–8). On the other hand, we can perceive motion solely from changes in object position (9–12). For example, motion can be perceived in stimuli with no directional motion signal by tracking position changes along a specific direction (10). These phenomena, however, are currently regarded as arising from independent mechanisms (11–14).

Given the interdependency of motion and position and the inherent noisiness of sensory signals, it is advantageous for vision to exploit the redundancy between motion and position signals and integrate them into coupled perceptual estimates.
This is complicated by the fact that local motion signals can result from a combination of motions (of which object translations are only one) (15, 16). A flying, rotating soccer ball provides a prototypical example of this problem (Fig. 1A). Because the ball rotates as it flies through the air, the local retinal motion signals created by the ball’s texture are sums of two world motions: translation and rotation of the ball. Relating local motion signals to object motion requires solving the “source attribution” problem (17, 18)—determining what part of a local retinal motion pattern is due to object translation and what part is due to object-relative motion of the texture pattern. To solve this attribution problem, the brain can exploit the redundant information provided by the changing stimulus position. Moreover, integrating motion and position information over time with an internal model of motion dynamics can mitigate both the uncertainty created by ubiquitous sensory noise (19) and that created by the motion source attribution problem. Although object-relative pattern motion is not a property of all moving objects, understanding how pattern motion interacts with object motion and position can help elucidate how the brain integrates motion and position signals into coherent perceptual estimates—a problem associated with all moving objects.

Fig. 1. Schematic illustration of the object-tracking model and its behavior. (A) An example of an object with both object boundary motion and pattern motion. (B) A generative model of the Bayesian observer. White nodes indicate hidden variables, and gray nodes indicate observable variables that are noisy measurements of the connected hidden variables. Arrows indicate causal links. (C) Model behavior for a typical motion-induced position shift (MIPS) stimulus containing a moving pattern within a static envelope. The steady-state estimates of the three object states (position, object velocity, and pattern velocity) are plotted for different positional uncertainties. At low positional uncertainty, most of the retinal texture motion is correctly attributed to the pattern motion; consequently, illusory object motion and MIPS are negligible. At high positional uncertainty, much of the texture motion is attributed to object motion (reflecting a prior that object motion is more likely than pattern motion), resulting in relatively low estimated pattern velocity and a large MIPS.

Here, we propose and test a computational framework in which motion and position perception derive from a common mechanism that integrates sensory signals over time to track objects and infer their generative causes. The consequence of this process is a strong, bidirectional coupling between motion and position perception that provides a unifying account for a range of perceptual phenomena. These include motion-induced shifts in perceived position (3–6), perceptual speed biases (20), the slowing of motion viewed in the visual periphery (21, 22), and the curveball illusion (16). The presented model also makes novel predictions about interactions between position and motion perception—predictions confirmed here. Importantly, we do not fit the model separately to each experiment; rather, we fit the parameters to data from experiment 1 and show that the resulting model accurately predicts subjects’ performance in qualitatively different and more complex tasks (experiments 2 and 3).
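To make the structure of the model concrete, the generative model sketched in Fig. 1B can be written as a linear-Gaussian state space and estimated with a Kalman filter. The equations below are a minimal sketch under that assumption: the hidden states are object position $x_t$, object velocity $v_t$, and pattern velocity $u_t$, and the decay constants $\alpha_v, \alpha_u$, process noise variances $q_v, q_u$, measurement noise variances $\sigma_x^2, \sigma_r^2$, and time step $\Delta t$ are illustrative placeholders rather than the parameter values fitted to experiment 1.

\begin{align}
x_t &= x_{t-1} + v_{t-1}\,\Delta t && \text{(position integrates object velocity)}\\
v_t &= (1-\alpha_v)\,v_{t-1} + \varepsilon_{v,t}, \quad \varepsilon_{v,t} \sim \mathcal{N}(0, q_v) && \text{(object velocity decays toward zero: slow-speed prior)}\\
u_t &= (1-\alpha_u)\,u_{t-1} + \varepsilon_{u,t}, \quad \varepsilon_{u,t} \sim \mathcal{N}(0, q_u) && \text{(pattern velocity decays toward zero)}\\
m_t^{x} &= x_t + \eta_{x,t}, \quad \eta_{x,t} \sim \mathcal{N}(0, \sigma_x^2) && \text{(noisy position measurement)}\\
m_t^{r} &= v_t + u_t + \eta_{r,t}, \quad \eta_{r,t} \sim \mathcal{N}(0, \sigma_r^2) && \text{(retinal velocity confounds object and pattern motion)}
\end{align}

In a formulation of this kind, the behavior illustrated in Fig. 1C follows from the Kalman gains. When $\sigma_x^2$ is small, the position measurements tightly constrain $x_t$, and hence $v_t$, so the shared retinal-velocity measurement $m_t^{r}$ is attributed mostly to the pattern velocity $u_t$. When $\sigma_x^2$ is large, the position signal no longer disambiguates the two sources; the prior favoring object motion (in this sketch, a broader stationary distribution for $v_t$ than for $u_t$) dominates, a larger share of $m_t^{r}$ is attributed to the object velocity $v_t$, and the result is illusory object motion and a correspondingly larger MIPS.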
Keywords: visual motion perception, Kalman filter, object tracking, causal inference, motion-induced position shift