Similar articles

20 similar articles found.
1.
Lee D, Quessy S. Vision Research, 2003, 43(13): 1455-1463.
Whether searching for targets in a familiar scene leads to improved performance was tested in monkeys. We found that search performance improved for a familiar scene when target locations were always randomized. However, when target locations repeatedly followed a predictable sequence, performance improvement for a familiar scene was manifested only for targets presented in a familiar sequence, suggesting that scene memory might be masked by the learning of target sequences. These results suggest that information about a visual scene can facilitate the performance of visual search, and that this memory is coupled to the learned sequence of target locations.

2.
How do spatial constraints and meaningful scene regions interact to control overt attention during visual search for objects in real-world scenes? To answer this question, we combined novel surface maps of the likely locations of target objects with maps of the spatial distribution of scene semantic content. The surface maps captured likely target surfaces as continuous probabilities. Meaning was represented by meaning maps highlighting the distribution of semantic content in local scene regions. Attention was indexed by eye movements during the search for target objects that varied in the likelihood they would appear on specific surfaces. The interaction between surface maps and meaning maps was analyzed to test whether fixations were directed to meaningful scene regions on target-related surfaces. Overall, meaningful scene regions were more likely to be fixated if they appeared on target-related surfaces than if they appeared on target-unrelated surfaces. These findings suggest that the visual system prioritizes meaningful scene regions on target-related surfaces during visual search in scenes.
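The interaction analysis lends itself to a simple computation. Below is a minimal sketch, assuming the surface map and meaning map are 2-D NumPy arrays normalized to [0, 1] and fixations are pixel coordinates; the threshold and all names are illustrative, not the authors' actual pipeline.

```python
import numpy as np

def meaning_by_surface(meaning_map, surface_map, fixations_xy, thresh=0.5):
    """Mean meaning-map value at fixations, split by whether the fixated
    location falls on a likely target surface (surface map above threshold)."""
    on_surface, off_surface = [], []
    for x, y in fixations_xy:
        bucket = on_surface if surface_map[y, x] >= thresh else off_surface
        bucket.append(meaning_map[y, x])
    # Higher on-surface values indicate that fixations prioritized
    # meaningful regions on target-related surfaces.
    return np.mean(on_surface), np.mean(off_surface)
```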

3.
Modeling the influence of task on attention
We propose a computational model for the task-specific guidance of visual attention in real-world scenes. Our model emphasizes four aspects that are important in biological vision: determining the task-relevance of an entity, biasing attention toward the low-level visual features of desired targets, recognizing these targets using the same low-level features, and incrementally building a visual map of task-relevance at every scene location. Given a task definition in the form of keywords, the model first determines and stores the task-relevant entities in working memory, using prior knowledge stored in long-term memory. It attempts to detect the most relevant entity by biasing its visual attention system with the entity's learned low-level features. It attends to the most salient location in the scene, and attempts to recognize the attended object through hierarchical matching against object representations stored in long-term memory. It updates its working memory with the task-relevance of the recognized entity and updates a topographic task-relevance map with the location and relevance of the recognized entity. The model is tested on three types of tasks: single-target detection in 343 natural and synthetic images, where biasing for the target accelerates target detection more than twofold on average; sequential multiple-target detection in 28 natural images, where biasing, recognition, working memory and long-term memory contribute to rapidly finding all targets; and learning a map of likely locations of cars from a video clip filmed while driving on a highway. The model's performance on search for single features and feature conjunctions is consistent with existing psychophysical data. These results suggest that our biologically motivated architecture may provide a reasonable approximation to many brain processes involved in complex task-driven visual behaviors.
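The feature-biasing step can be illustrated with a short sketch: weight each bottom-up feature map by the target's learned response in that feature dimension, then combine the weighted maps into a single saliency map. The data structures and names below are assumptions for illustration, not the published implementation.

```python
import numpy as np

def biased_saliency(feature_maps, target_features):
    """Combine bottom-up feature maps, weighting each by how strongly
    the desired target expresses that feature.

    feature_maps    : {feature_name: 2-D array normalized to [0, 1]}
    target_features : {feature_name: learned mean response of the target}
    """
    h, w = next(iter(feature_maps.values())).shape
    saliency = np.zeros((h, w))
    for name, fmap in feature_maps.items():
        saliency += target_features.get(name, 1.0) * fmap  # 1.0 = no bias
    return saliency / len(feature_maps)

# The model would attend to the peak of the biased map:
# next_fixation = np.unravel_index(np.argmax(saliency), saliency.shape)
```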

4.
There is accumulating evidence that scene context can guide and facilitate visual search (e.g., A. Torralba, A. Oliva, M. S. Castelhano, & J. M. Henderson, 2006). Previous studies utilized stimuli of restricted size, a fixed head position, and context defined by the global spatial configuration of the scene. Thus, it is unknown whether similar effects generalize to natural viewing environments and to context defined by local object co-occurrence. Here, with a mobile eye tracker, we investigated the effects of object co-occurrence on search performance under naturalistic conditions. Observers searched for low-visibility target objects on tables cluttered with everyday objects. Targets were either located adjacent to larger, more visible "cue" objects with which they regularly co-occur in natural scenes (expected condition) or elsewhere in the display, surrounded by unrelated objects (unexpected condition). Mean search times were shorter for targets at expected locations than at unexpected locations. Additionally, context guided eye movements, as more fixations were directed toward cue objects than toward other non-target objects, particularly when the cue was contextually relevant to the current search target. These results could not be accounted for by image saliency models. Thus, we conclude that object co-occurrence can serve as a contextual cue that facilitates search and guides eye movements in natural environments.

5.
We assessed the role of saliency in driving observers to fixate the eyes in social scenes. Saliency maps (Itti & Koch, 2000) were computed for the scenes from three previous studies. Saliency provided a poor account of the data. The saliency values for the first-fixated locations were extremely low and no greater than what would be expected by chance. In addition, the saliency values for the eye regions were low. Furthermore, whereas saliency was no better at predicting early saccades than late saccades, the average latency to fixate social areas of the scene (e.g., the eyes) was very fast (within 200 ms). Thus, visual saliency does not account for observers’ bias to select the eyes within complex social scenes, nor does it account for fixation behavior in general. Instead, it appears that observers’ fixations are driven largely by their default interest in social information.
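The chance-level comparison reported here can be expressed compactly: sample the saliency map at the observed first-fixation coordinates and compare against saliency sampled at random locations. This is a sketch under the assumption that the Itti and Koch map is available as a 2-D array; the function name is hypothetical.

```python
import numpy as np

def saliency_at_fixations(saliency_map, fixations_xy, n_random=10000, seed=0):
    """Mean saliency at fixated pixels vs. a uniform-random baseline."""
    rng = np.random.default_rng(seed)
    h, w = saliency_map.shape
    fixated = np.array([saliency_map[y, x] for x, y in fixations_xy])
    chance = saliency_map[rng.integers(0, h, n_random),
                          rng.integers(0, w, n_random)]
    # Fixated values near the chance mean indicate that saliency
    # does not predict fixation placement.
    return fixated.mean(), chance.mean()
```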

6.
This study examines saccade strategy in a novel task where observers actively search a display to find multiple targets in a limited time. Theory predicts that the relative merit of different saccade strategies depends on the prior probability of the target at a location: when the target prior is low and multiple-target trials are rare, making a saccade to the most likely target location is close to the optimal strategy, but when the target prior is high and multiple-target trials are frequent, selecting uncertain locations is more informative. The prior probability of the target was varied from 0.17 to 0.67 to determine whether observers adjusted their saccade strategies to maximize information. Observers actively searched a noisy display with six potential target locations. Each location had an independent probability of containing a target, so the number of targets in a trial ranged from 0 to 6. For all target priors, from low to high, a trial-by-trial analysis of saccade strategy indicated that observers made saccades to the most likely target location more often than to the most uncertain location. Fixating likely locations is efficient only when multiple targets are rare, as in the case of a low target prior or the more standard single-target search task. Yet it was the preferred saccade strategy in all our conditions, even when multiple targets were frequent. These findings indicate that humans are far from ideal searchers in multiple-target search.
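The two strategies being contrasted can be made concrete with independent Bernoulli posteriors over the six locations: the "most likely" strategy fixates the maximum-posterior location, while the information-seeking strategy fixates the location with the highest posterior entropy (probability closest to 0.5). The numbers below are illustrative, not data from the study.

```python
import numpy as np

def bernoulli_entropy(p):
    """Entropy (bits) of an independent target-present probability."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

# Posterior target probability at each of six locations (illustrative).
posteriors = np.array([0.10, 0.40, 0.80, 0.30, 0.55, 0.20])

most_likely = int(np.argmax(posteriors))                        # "MAP" strategy
most_uncertain = int(np.argmax(bernoulli_entropy(posteriors)))  # info-seeking

print(most_likely, most_uncertain)  # 2 (p = 0.80) vs. 4 (p = 0.55)
```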

7.
Hwang AD, Wang HC, Pomplun M. Vision Research, 2011, 51(10): 1192-1205.
The perception of objects in our visual world is influenced not only by low-level visual features such as shape and color, but also by high-level features such as meaning and the semantic relations among objects. While it has been shown that low-level features in real-world scenes guide eye movements during scene inspection and search, the influence of semantic similarity among scene objects on eye movements in such situations has not been investigated. Here we study guidance of eye movements by semantic similarity among objects during real-world scene inspection and search. By selecting scenes from the LabelMe object-annotated image database and applying latent semantic analysis (LSA) to the object labels, we generated semantic saliency maps of real-world scenes based on the semantic similarity of scene objects to the currently fixated object or the search target. An ROC analysis of these maps as predictors of subjects’ gaze transitions between objects during scene inspection revealed a preference for transitions to objects that were semantically similar to the currently inspected one. Furthermore, during the course of a scene search, subjects’ eye movements were progressively guided toward objects that were semantically similar to the search target. These findings demonstrate substantial semantic guidance of eye movements in real-world scenes and show its importance for understanding real-world attentional control.
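The semantic-saliency construction reduces to scoring every labeled object by the cosine similarity between its LSA vector and that of the currently fixated object or search target. A minimal sketch, with stand-in vectors in place of a trained LSA space; all names are illustrative:

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def semantic_scores(label_vectors, reference_label):
    """Similarity of every scene object to the fixated object or target.

    label_vectors : {object_label: LSA vector from the trained model}
    """
    ref = label_vectors[reference_label]
    return {label: cosine(vec, ref)
            for label, vec in label_vectors.items() if label != reference_label}

# Painting each object's annotated region with its score yields the
# semantic saliency map evaluated against gaze transitions via ROC analysis.
```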

8.
Visual search in natural scenes is a complex task relying on peripheral vision to detect potential targets and central vision to verify them. This segregation of the visual fields has been established primarily in on-screen experiments. We conducted a gaze-contingent experiment in virtual reality in order to test how the established roles of central and peripheral vision translate to more natural settings. The use of everyday scenes in virtual reality allowed us to study visual attention with a fairly ecological protocol that cannot be implemented in the real world. Central or peripheral vision was masked during visual search, with target objects selected according to scene semantic rules. Analyzing the resulting search behavior, we found that target objects that were not spatially constrained to a probable location within the scene impacted search measures negatively. Our results diverge from on-screen studies in that search performance was only slightly affected by central vision loss. In particular, a central mask did not impact verification times when the target was grammatically constrained to an anchor object. Our findings demonstrate that the role of central vision (up to 6° of eccentricity) in identifying objects in natural scenes seems to be minor, while the role of peripheral preprocessing of targets in immersive real-world searches may have been underestimated by on-screen experiments.
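Gaze-contingent masking of this kind is conceptually simple: on every rendered frame, occlude either a disc around the current gaze position (central mask) or everything outside it (peripheral mask). A sketch in image coordinates, with a pixel radius standing in for the degree-based radii used in the study; the function name is hypothetical.

```python
import numpy as np

def apply_scotoma(frame, gaze_xy, radius_px, central=True):
    """Gray out the central disc around gaze (central=True) or the
    periphery outside it (central=False)."""
    h, w = frame.shape[:2]
    yy, xx = np.ogrid[:h, :w]
    inside = (xx - gaze_xy[0]) ** 2 + (yy - gaze_xy[1]) ** 2 <= radius_px ** 2
    out = frame.copy()
    out[inside if central else ~inside] = 128  # uniform gray occluder
    return out
```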

9.
Studies of scene perception have shown that the visual system is particularly sensitive to global properties such as the overall layout of a scene. Such global properties cannot be computed locally, but rather require relational analysis over multiple regions. To what extent is observers’ perception of scenes impaired in the far periphery? We examined the perception of global scene properties (Experiment 1) and basic-level categories (Experiment 2) presented in the periphery from 10° to 70°. Pairs of scene photographs were simultaneously presented left and right of fixation for 80 ms on a panoramic screen (5 m diameter) covering the whole visual field while central fixation was controlled. Observers were instructed to press a key corresponding to the spatial location (left/right) of a pre-defined target property or category. The results show that classification of global scene properties (e.g., naturalness, openness) as well as basic-level categorization (e.g., forests, highways), while better near the center, were accomplished well above chance (around 70% correct) in the far periphery, even at 70° eccentricity. The perception of some global properties (e.g., naturalness) was more robust in peripheral vision than that of others (e.g., indoor/outdoor) that required a more local analysis. The results are consistent with studies suggesting that scene gist recognition can be accomplished by the low resolution of peripheral vision.

10.
Inhibition of Return (IOR) is a difficulty in processing stimuli presented at recently attended locations. IOR is widely believed to facilitate foraging of a visual scene by decreasing the probability that gaze will return to previously fixated locations. However, there is a lack of clear evidence in support of this foraging facilitator hypothesis during scene search. The original R. M. Klein and W. J. MacInnes (1999) Where's Waldo study reported a forward bias in the distribution of fixations that was taken as evidence for the foraging facilitator hypothesis. The present study was designed to replicate R. M. Klein and W. J. MacInnes (1999) but included a detailed analysis of fixation distributions in order to test the precise predictions of the foraging facilitator hypothesis. The results indicate that latencies of saccades returning to 1-back (and possibly 2-back) locations during visual search are elevated. However, there is no evidence that the probability of returning to these locations is significantly lower than for control locations. Eye movement behavior during search of visual scenes does not support the view that IOR facilitates foraging.
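The 1-back and 2-back analysis can be sketched as follows: classify each fixation by whether the saccade that produced it returned (within some radius) to a recently fixated location, then compare the latencies of those saccades across groups. The radius criterion and the data layout are assumptions.

```python
import numpy as np

def return_latencies(fixations, radius=60.0):
    """fixations: list of (x, y, latency_ms), where latency_ms is the
    latency of the saccade that landed this fixation. A fixation is a
    1-back return if it revisits the location two entries earlier
    (i.e., the location fixated just before the previous fixation)."""
    groups = {"1-back": [], "2-back": [], "other": []}
    for i, (x, y, lat) in enumerate(fixations):
        kind = "other"
        for back, name in ((2, "1-back"), (3, "2-back")):
            if i >= back and np.hypot(x - fixations[i - back][0],
                                      y - fixations[i - back][1]) <= radius:
                kind = name
                break
        groups[kind].append(lat)
    # Elevated 1-back latency means return saccades are slowed,
    # the IOR signature reported above.
    return {k: (float(np.mean(v)) if v else None) for k, v in groups.items()}
```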

11.
The direction in which people tend to move their eyes when inspecting images can reveal the different influences on eye guidance in scene perception, and their time course. We investigated biases in saccade direction during a memory-encoding task with natural scenes and computer-generated fractals. Images were rotated to disentangle egocentric and image-based guidance. Saccades in fractals were more likely to be horizontal, regardless of orientation. In scenes, the first saccade often moved down and subsequent eye movements were predominantly vertical, relative to the scene. These biases were modulated by the distribution of visual features (saliency and clutter) in the scene. The results suggest that image orientation, visual features and the scene frame-of-reference have a rapid effect on eye guidance.
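The directional biases reported here come from a simple summary statistic: the angle of each saccade, binned into a direction histogram. A sketch, assuming fixations are screen coordinates (note that positive y points downward on screens); the function name is illustrative.

```python
import numpy as np

def saccade_direction_histogram(fixations_xy, n_bins=8):
    """Counts of saccade directions in n_bins equal angular bins (degrees)."""
    fx = np.asarray(fixations_xy, dtype=float)
    dx = np.diff(fx[:, 0])
    dy = np.diff(fx[:, 1])
    angles = np.degrees(np.arctan2(dy, dx)) % 360
    counts, _ = np.histogram(angles, bins=n_bins, range=(0, 360))
    # Rotating the image but not the bins separates image-based biases
    # from egocentric (head/body-centered) ones.
    return counts
```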

12.
Pomplun M. Vision Research, 2006, 46(12): 1886-1900.
Visual search is a fundamental and routine task of everyday life. Studying visual search promises to shed light on the basic attentional mechanisms that facilitate visual processing. To investigate visual attention during search processes, numerous studies measured the selectivity of observers' saccadic eye movements for local display features. These experiments almost entirely relied on simple, artificial displays with discrete search items and features. The present study employed complex search displays and targets to examine task-driven (top-down) visual guidance by low-level features under more natural conditions. Significant guidance by local intensity, contrast, spatial frequency, and orientation was found, and its properties such as magnitude and resolution were analyzed across dimensions. Moreover, feature-ratio effects were detected, which correspond to distractor-ratio effects in simple search displays. These results point out the limitations of current purely stimulus-driven (bottom-up) models of attention during scene perception.
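Feature guidance of this kind is typically quantified by comparing a local feature value at fixated patches against a display-wide baseline. A sketch for one dimension, using patch standard deviation as a stand-in contrast measure; the patch size and names are assumptions.

```python
import numpy as np

def local_contrast(img, x, y, half=16):
    patch = img[max(y - half, 0):y + half, max(x - half, 0):x + half]
    return float(patch.std())

def guidance_ratio(img, fixations_xy, n_baseline=1000, seed=0):
    """Ratio > 1 means fixations select higher-contrast regions than
    expected from uniform sampling of the display."""
    rng = np.random.default_rng(seed)
    h, w = img.shape
    fixated = np.mean([local_contrast(img, x, y) for x, y in fixations_xy])
    baseline = np.mean([local_contrast(img, x, y)
                        for x, y in zip(rng.integers(16, w - 16, n_baseline),
                                        rng.integers(16, h - 16, n_baseline))])
    return fixated / baseline
```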

13.
Are locations or colors more effective cues in biasing attention? We addressed this question with a visual search task that featured an associative priming manipulation. The observers indicated which target appeared in a search array. Unknown to them, one target appeared at the same location more often and a second target appeared in the same color more often. Both location and color biases facilitated performance, but location biases benefited the selection of all targets, whereas color biases only benefited the associated target letter. The generalized benefit of location biases suggests that locations are more effective cues to attention.

14.
The gist of a visual scene is perceived in a fraction of a second, but in change detection tasks subjects typically need several seconds to find the changing object in a visual scene. Here, we report influences of scene context on change detection performance. Scene context manipulations consisted of scene inversion, scene jumbling, in which the images were cut into 24 pieces and randomly recombined, and scene configuration scrambling, in which the arrangement of the objects in the scene was randomized. Reaction times were significantly lower in images with normal scene context. We conclude that scene context structures scene perception.
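The jumbling manipulation is straightforward to reproduce: cut the image into a grid of 24 tiles (here 4 x 6, an assumption about the grid shape) and reassemble them in random order.

```python
import numpy as np

def jumble(img, rows=4, cols=6, seed=0):
    """Cut img into rows x cols tiles and reassemble them in random order."""
    rng = np.random.default_rng(seed)
    th, tw = img.shape[0] // rows, img.shape[1] // cols
    tiles = [img[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
             for r in range(rows) for c in range(cols)]
    out = img.copy()
    for i, j in enumerate(rng.permutation(len(tiles))):
        r, c = divmod(i, cols)
        out[r * th:(r + 1) * th, c * tw:(c + 1) * tw] = tiles[j]
    return out
```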

15.
In seven experiments, observers searched for a scrambled object among normal objects. The critical comparison was between repeated search, in which the same set of stimuli remained present in fixed positions in the display for many (>100) trials, and unrepeated conditions, in which new stimuli were presented on each trial. In repeated search conditions, observers monitored an essentially stable display for the disruption of a clearly visible object. This is an extension of repeated search experiments in which subjects search a fixed set of items for different targets on each trial (Wolfe, Klempen, & Dahlen, 2000) and can be considered a form of "change blindness" task. The unrepeated search was very inefficient, showing that a scrambled object does not "pop out" among intact objects (or vice versa). Interestingly, the repeated search condition was just as inefficient, as if participants had to search for the scrambled target even after extensive experience with the specific change in the specific scene. The results suggest that the attentional processes involved in searching for a target in a novel scene may be very similar to those used to confirm the presence of a target in a familiar scene.

16.
We address two questions concerning eye guidance during visual search in naturalistic scenes. First, search has been described as a task in which visual salience is unimportant. Here, we revisit this question by using a letter-in-scene search task that minimizes any confounding effects that may arise from scene guidance. Second, we investigate how important the different regions of the visual field are for different subprocesses of search (target localization and verification). In Experiment 1, we manipulated both the salience (low vs. high) and the size (small vs. large) of the target letter (a “T”), and we implemented a foveal scotoma (radius: 1°) in half of the trials. In Experiment 2, observers searched for high- and low-salience targets either with full vision or with a central or peripheral scotoma (radius: 2.5°). In both experiments, we found main effects of salience, with better performance for high-salience targets. In Experiment 1, search was faster for large than for small targets, and high salience helped more for small targets. When searching with a foveal scotoma, performance was relatively unimpaired regardless of the target's salience and size. In Experiment 2, both visual-field manipulations led to search time costs, but the peripheral scotoma was much more detrimental than the central scotoma. Peripheral vision proved to be important for target localization, and central vision for target verification. Salience affected eye movement guidance to the target in both central and peripheral vision. Collectively, the results lend support to search models that incorporate salience for predicting eye-movement behavior.

17.
Mitchell JF, Zipser D. Vision Research, 2003, 43(25): 2669-2695.
We present a neural model of the frontal eye fields. It consists of several retinotopic arrays of neuron-like units that are recurrently connected. The network is trained to make memory-guided saccades to sequentially flashed targets that appear at arbitrary locations. This task is interesting because the large number of possible sequences does not permit a pre-learned response. Instead, locations and their priority must be maintained in active working memory. The network learns to perform the task. Surprisingly, after training it can also select targets in visual search tasks. When targets are shown in parallel, it chooses them according to their salience. Its search behavior is comparable to that of humans. It exhibits saccadic averaging, increased reaction times with more distractors, latency vs. accuracy trade-offs, and inhibition of return. Analysis of the network shows that it operates like a queue, storing the potential targets in sequence for later execution. A small number of unit types is sufficient to encode this information, but the manner of coding is non-obvious. Units respond to multiple targets in a manner similar to the quasi-visual cells recently studied [Exp. Brain Res. 130 (2000) 433]. Predictions are made that can be experimentally tested.

18.
Humans can remember many scenes for a long time after brief presentation. Do scene understanding and encoding processes require visual selective attention, or do they occur even when observers are engaged in other visual tasks? We showed observers scene or texture images while they performed a visual search task, an auditory detection task, or no concurrent task. Concurrent tasks interfered with memory for both image types. Visual search interfered more than auditory detection did, even when the two tasks were equally difficult. The same pattern of results was obtained with concurrent tasks presented during the encoding or consolidation phases. We conclude that visual attention modulates picture memory performance. We did not find any aspect of picture memory to be independent of attentional demands.

19.
Ground-planes have an important influence on the perception of 3D space (Gibson, 1950), and it has been shown that the assumption that a ground-plane is present in the scene plays a role in the perception of object distance (Bruno & Cutting, 1988). Here, we investigate whether this influence is exerted at an early stage of processing, affecting the rapid estimation of 3D size. Participants performed a visual search task in which they searched for a target object that was larger or smaller than distracter objects. Objects were presented against a background that contained either a frontoparallel or a slanted 3D surface, defined by texture gradient cues. We measured the effect on search performance of target location within the scene (near vs. far) and how this was influenced by scene orientation (which, e.g., might be consistent with a ground or ceiling plane). In addition, we investigated how scene orientation interacted with texture gradient information (indicating surface slant), to determine how these separate cues to scene layout were combined. We found that the difference in target detection performance between targets at the front and rear of the simulated scene was maximal when the scene was consistent with a ground-plane, in line with the use of an elevation cue to object distance. In addition, we found a significant increase in the size of this effect when texture gradient information (indicating surface slant) was present, but no interaction between texture gradient and scene orientation information. We conclude that scene orientation plays an important role in the estimation of 3D size at an early stage of processing, and suggest that elevation information is linearly combined with texture gradient information for the rapid estimation of 3D size.

20.
Visual cognition depends critically on the moment-to-moment orientation of gaze. To change the gaze to a new location in space, that location must be computed and used by the oculomotor system. One of the most common sources of information for this computation is the visual appearance of an object. A crucial question is: how is the appearance information contained in the photometric array converted into a target position? This paper proposes a model that accomplishes this computation. The model uses iconic scene representations derived from oriented spatiochromatic filters at multiple scales. Visual search for a target object proceeds in a coarse-to-fine fashion, with the target's largest-scale filter responses being compared first. Task-relevant target locations are represented as saliency maps which are used to program eye movements. A central feature of the model is that it separates the targeting process, which changes gaze, from the decision process, which extracts information at or near the new gaze point to guide behavior. The model provides a detailed explanation for the center-of-gravity saccades that have been observed in many previous experiments. In addition, the model's targeting performance has been compared with the eye movements of human subjects under identical conditions in natural visual search tasks. The results show good agreement both quantitatively (the search paths are strikingly similar) and qualitatively (the fixations of false targets are comparable).
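The coarse-to-fine matching strategy can be sketched with an image pyramid: search exhaustively at the coarsest scale, then refine the candidate location within a small window at each finer scale. Plain decimation and normalized cross-correlation stand in for the paper's oriented spatiochromatic filters; all names are illustrative.

```python
import numpy as np

def downsample(img, factor):
    return img[::factor, ::factor]

def best_match(scene, template):
    """Exhaustive normalized cross-correlation; returns the best (x, y)."""
    th, tw = template.shape
    t = (template - template.mean()) / (template.std() + 1e-9)
    best, pos = -np.inf, (0, 0)
    for y in range(scene.shape[0] - th + 1):
        for x in range(scene.shape[1] - tw + 1):
            p = scene[y:y + th, x:x + tw]
            score = float(np.sum(t * (p - p.mean()) / (p.std() + 1e-9)))
            if score > best:
                best, pos = score, (x, y)
    return pos

def coarse_to_fine(scene, template, levels=(4, 2, 1), window=8):
    """Full search at the coarsest level, local refinement at finer ones.
    Returns the match location in full-resolution coordinates."""
    cx = cy = None
    for f in levels:
        s, t = downsample(scene, f), downsample(template, f)
        if cx is None:
            x, y = best_match(s, t)    # exhaustive, but on a tiny image
        else:
            gx, gy = cx // f, cy // f  # previous estimate at this scale
            x0, y0 = max(gx - window, 0), max(gy - window, 0)
            sub = s[y0:gy + window + t.shape[0], x0:gx + window + t.shape[1]]
            dx, dy = best_match(sub, t)
            x, y = x0 + dx, y0 + dy
        cx, cy = x * f, y * f          # back to full-resolution coordinates
    return cx, cy
```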
