Similar Documents

20 similar documents found.
1.
Modeling the influence of task on attention
We propose a computational model for the task-specific guidance of visual attention in real-world scenes. Our model emphasizes four aspects that are important in biological vision: determining the task-relevance of an entity, biasing attention toward the low-level visual features of desired targets, recognizing these targets using the same low-level features, and incrementally building a visual map of task-relevance at every scene location. Given a task definition in the form of keywords, the model first determines and stores the task-relevant entities in working memory, using prior knowledge stored in long-term memory. It attempts to detect the most relevant entity by biasing its visual attention system with the entity's learned low-level features. It attends to the most salient location in the scene, and attempts to recognize the attended object through hierarchical matching against object representations stored in long-term memory. It updates its working memory with the task-relevance of the recognized entity, and updates a topographic task-relevance map with the location and relevance of that entity. The model is tested on three types of tasks: single-target detection in 343 natural and synthetic images, where biasing for the target accelerates detection more than twofold on average; sequential multiple-target detection in 28 natural images, where biasing, recognition, working memory, and long-term memory together support rapidly finding all targets; and learning a map of likely car locations from a video clip filmed while driving on a highway. The model's performance on search for single features and feature conjunctions is consistent with existing psychophysical data. Together, these results suggest that our biologically motivated architecture may provide a reasonable approximation to many brain processes involved in complex task-driven visual behavior.
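To make the biasing step concrete, here is a minimal NumPy sketch of feature-weighted saliency. This is not the authors' implementation: the feature channels, the normalization, and the weight values are all illustrative assumptions.

```python
import numpy as np

def feature_maps(image):
    """Toy low-level feature maps (intensity plus two color-opponency
    channels) from an RGB image array of shape (H, W, 3)."""
    r, g, b = image[..., 0], image[..., 1], image[..., 2]
    return {
        "intensity": (r + g + b) / 3.0,
        "rg": r - g,              # red-green opponency
        "by": b - (r + g) / 2.0,  # blue-yellow opponency
    }

def biased_saliency(image, target_weights):
    """Combine feature maps into one saliency map, weighting each
    channel by the learned relevance of that feature to the target."""
    saliency = np.zeros(image.shape[:2])
    for name, fmap in feature_maps(image.astype(float)).items():
        span = fmap.max() - fmap.min()
        norm = (fmap - fmap.min()) / span if span > 0 else np.zeros_like(fmap)
        saliency += target_weights.get(name, 1.0) * norm
    return saliency

def next_fixation(saliency):
    """Attend to the most salient location (winner-take-all)."""
    return np.unravel_index(np.argmax(saliency), saliency.shape)

# Example: bias attention toward reddish targets.
image = np.random.default_rng(0).random((64, 64, 3))
weights = {"intensity": 0.2, "rg": 2.0, "by": 0.2}  # hypothetical learned bias
print(next_fixation(biased_saliency(image, weights)))
```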

2.
Scene context guides eye movements during visual search
How does scene context guide search behavior to likely target locations? We had observers search for scene-constrained and scene-unconstrained targets, and found that scene-constrained targets were detected faster and with fewer eye movements. Observers also directed more initial saccades to target-consistent scene regions and devoted more time to searching these regions. However, final checking fixations on target-inconsistent regions were common in target-absent trials, suggesting that scene context does not strictly confine search to likely target locations. We interpret these data as evidence for a rapid top-down biasing of search behavior by scene context to the target-consistent regions of a scene.

3.
There is accumulating evidence that scene context can guide and facilitate visual search (e.g., A. Torralba, A. Oliva, M. S. Castelhano, & J. M. Henderson, 2006). Previous studies utilized stimuli of restricted size, a fixed head position, and context defined by the global spatial configuration of the scene. Thus, it is unknown whether similar effects generalize to natural viewing environments and to context defined by local object co-occurrence. Here, with a mobile eye tracker, we investigated the effects of object co-occurrence on search performance under naturalistic conditions. Observers searched for low-visibility target objects on tables cluttered with everyday objects. Targets were either located adjacent to larger, more visible "cue" objects with which they regularly co-occur in natural scenes (expected condition) or elsewhere in the display, surrounded by unrelated objects (unexpected condition). Mean search times were shorter for targets at expected locations than at unexpected locations. Additionally, context guided eye movements: more fixations were directed toward cue objects than toward other non-target objects, particularly when the cue was contextually relevant to the current search target. These results could not be accounted for by image-saliency models. We conclude that object co-occurrence can serve as a contextual cue to facilitate search and guide eye movements in natural environments.

4.
Humans can remember many scenes for a long time after brief presentation. Do scene understanding and encoding processes require visual selective attention, or do they occur even when observers are engaged in other visual tasks? We showed observers scene or texture images while they performed a visual search task, an auditory detection task, or no concurrent task. Concurrent tasks interfered with memory for both image types. Visual search interfered more than auditory detection, even when the two tasks were equally difficult. The same pattern of results was obtained whether the concurrent tasks were presented during the encoding or the consolidation phase. We conclude that visual attention modulates picture-memory performance; we did not find any aspect of picture memory to be independent of attentional demands.

5.

Background

This study investigated whether realistic immersive conditions with dynamic indoor scenes presented on a large, hemispheric panoramic screen covering 180° of the visual field improved the visual search abilities of participants with age‐related macular degeneration (AMD).

Method

Twenty‐one participants with AMD, 16 age‐matched controls and 16 young observers were included. Realistic indoor scenes were presented on a panoramic five metre diameter screen. Twelve different objects were used as targets. The participants were asked to search for a target object, shown on paper before each trial, within a room composed of various objects. A joystick was used for navigation within the scene views. A target object was present in 24 trials and absent in 24 trials. The percentage of correct detection of the target, the percentage of false alarms (that is, the detection of the target when it was absent), the number of scene views explored and the search time were measured.

Results

Search times were longer for participants with AMD than for the age‐matched controls, who in turn were slower than the young participants. The participants with AMD were able to accomplish the task with a performance of 75 per cent correct detections. This was slightly lower than the age‐matched controls (79.2 per cent), while the young controls were near ceiling (91.7 per cent). Errors were mainly due to false alarms resulting from confusion between the target object and another object present in the scene in the target‐absent trials.

Conclusion

The outcomes of the present study indicate that, under realistic conditions, participants with AMD were able to perform visual search for objects with high accuracy, although they were slower than age‐matched, normally sighted controls.

6.
How do spatial constraints and meaningful scene regions interact to control overt attention during visual search for objects in real-world scenes? To answer this question, we combined novel surface maps of the likely locations of target objects with maps of the spatial distribution of scene semantic content. The surface maps captured likely target surfaces as continuous probabilities. Meaning was represented by meaning maps highlighting the distribution of semantic content in local scene regions. Attention was indexed by eye movements during the search for target objects that varied in the likelihood they would appear on specific surfaces. The interaction between surface maps and meaning maps was analyzed to test whether fixations were directed to meaningful scene regions on target-related surfaces. Overall, meaningful scene regions were more likely to be fixated if they appeared on target-related surfaces than if they appeared on target-unrelated surfaces. These findings suggest that the visual system prioritizes meaningful scene regions on target-related surfaces during visual search in scenes.
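One simple way to model the reported interaction, offered as a sketch rather than the paper's analysis pipeline, is to take the elementwise product of the two normalized maps as a fixation-priority map:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy maps over an 8x8 grid of scene regions: the probability that a
# target-related surface occupies each region, and the rated semantic
# density ("meaning") of each region. Values here are random stand-ins.
surface_map = rng.random((8, 8))
meaning_map = rng.random((8, 8))

# A region is prioritized when it is both meaningful AND lies on a
# target-related surface, hence the elementwise product.
priority = surface_map * meaning_map
priority /= priority.sum()

# Score observed fixations against the predicted priority map.
fixations = [(2, 3), (5, 5), (7, 1)]  # (row, col) scene regions
print([round(priority[y, x], 4) for y, x in fixations])
```

Under this scheme, meaningful regions on target-unrelated surfaces receive low priority because the surface term suppresses them, which is the qualitative pattern the study reports.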

7.
Ground planes have an important influence on the perception of 3D space (Gibson, 1950), and the assumption that a ground plane is present in the scene has been shown to play a role in the perception of object distance (Bruno & Cutting, 1988). Here, we investigate whether this influence is exerted at an early stage of processing, affecting the rapid estimation of 3D size. Participants performed a visual search task in which they searched for a target object that was larger or smaller than distracter objects. Objects were presented against a background containing either a frontoparallel or a slanted 3D surface, defined by texture-gradient cues. We measured the effect on search performance of target location within the scene (near vs. far) and how this was influenced by scene orientation (e.g., consistent with a ground plane or a ceiling plane). In addition, we investigated how scene orientation interacted with texture-gradient information (indicating surface slant), to determine how these separate cues to scene layout were combined. We found that the difference in target-detection performance between targets at the front and rear of the simulated scene was maximal when the scene was consistent with a ground plane, in line with the use of an elevation cue to object distance. In addition, this effect grew significantly when texture-gradient information (indicating surface slant) was present, but there was no interaction between texture-gradient and scene-orientation information. We conclude that scene orientation plays an important role in the estimation of 3D size at an early stage of processing, and suggest that elevation information is linearly combined with texture-gradient information for the rapid estimation of 3D size.
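The reported additivity (two significant main effects with no interaction) is what a linear cue-combination rule predicts. A minimal formulation, with symbols chosen here rather than taken from the paper:

```latex
\hat{D} \;=\; w_{\mathrm{elev}}\, D_{\mathrm{elev}} \;+\; w_{\mathrm{tex}}\, D_{\mathrm{tex}},
\qquad w_{\mathrm{elev}} + w_{\mathrm{tex}} = 1
```

where \(D_{\mathrm{elev}}\) and \(D_{\mathrm{tex}}\) are the distance estimates implied by elevation in the scene and by the texture gradient, and \(\hat{D}\) is the combined estimate used to rescale retinal size. Because the rule has no interaction term, each cue's contribution is independent of the other, matching the absence of an interaction in the data.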

8.
Visual search in natural scenes is a complex task relying on peripheral vision to detect potential targets and central vision to verify them. This division of labor between the visual fields has been established largely through on-screen experiments. We conducted a gaze-contingent experiment in virtual reality to test how the presumed roles of central and peripheral vision translate to more natural settings. The use of everyday scenes in virtual reality allowed us to study visual attention with a fairly ecological protocol that cannot be implemented in the real world. Central or peripheral vision was masked during visual search, with target objects selected according to scene-semantic rules. Analyzing the resulting search behavior, we found that target objects that were not spatially constrained to a probable location within the scene impacted search measures negatively. Our results diverge from on-screen studies in that search performance was only slightly affected by central-vision loss. In particular, a central mask did not affect verification times when the target was grammatically constrained to an anchor object. Our findings demonstrate that the role of central vision (up to 6 degrees of eccentricity) in identifying objects in natural scenes seems to be minor, whereas the role of peripheral preprocessing of targets in immersive real-world searches may have been underestimated by on-screen experiments.

9.
Are locations or colors more effective cues in biasing attention? We addressed this question with a visual search task that featured an associative priming manipulation. The observers indicated which target appeared in a search array. Unknown to them, one target appeared at the same location more often and a second target appeared in the same color more often. Both location and color biases facilitated performance, but location biases benefited the selection of all targets, whereas color biases only benefited the associated target letter. The generalized benefit of location biases suggests that locations are more effective cues to attention.

10.
In two samples, we demonstrate that visual search performance is influenced by memory for the locations of specific search items across trials. We monitored eye movements as observers searched for a target letter in displays containing 16 or 24 letters. From trial to trial the configuration of the search items was either Random, fully Repeated or similar but not identical (i.e., Intermediate). We found a graded pattern of response times across conditions with slowest times in the Random condition and fastest responses in the Repeated condition. We also found that search was comparably efficient in the Intermediate and Random conditions but more efficient in the Repeated condition. Importantly, the target on a given trial was fixated more accurately in the Repeated and Intermediate conditions relative to the Random condition. We suggest a tradeoff between memory and perception in search as a function of the physical scale of the search space.

11.
In seven experiments, observers searched for a scrambled object among normal objects. The critical comparison was between repeated search in which the same set of stimuli remained present in fixed positions in the display for many (>100) trials and unrepeated conditions in which new stimuli were presented on each trial. In repeated search conditions, observers monitored an essentially stable display for the disruption of a clearly visible object. This is an extension of repeated search experiments in which subjects search a fixed set of items for different targets on each trial (Wolfe, Klempen, & Dahlen, 2000) and can be considered as a form of a "change blindness" task. The unrepeated search was very inefficient, showing that a scrambled object does not "pop-out" among intact objects (or vice versa). Interestingly, the repeated search condition was just as inefficient, as if participants had to search for the scrambled target even after extensive experience with the specific change in the specific scene. The results suggest that the attentional processes involved in searching for a target in a novel scene may be very similar to those used to confirm the presence of a target in a familiar scene.

12.
Champion RA, Warren PA. Vision Research, 2008, 48(17): 1820-1830.
In order to compute a representation of an object's size within a 3D scene, the visual system must scale retinal size by an estimate of the distance to the object. Evidence from size discrimination and visual search studies suggests that we have no access to the representation of retinal size when performing such tasks. In this study we investigate whether observers have early access to retinal size prior to scene size. Observer performance was assessed in a visual search task (requiring search within a 3D scene) in which processing was interrupted at a range of short presentation times. If observers have access to retinal size then we might expect to find a presentation time before which observers behave as if using retinal size and after which they behave as if using scene size. Observers searched for a larger or smaller target object within a group of objects viewed against a textured plane slanted at 0 or 60 degrees. Stimuli were presented for 100, 200, 400 or 800 ms and immediately followed by a mask. We measured the effect of target location within a stimulus (near vs. far) on task performance and how this was influenced by the background slant. The results of Experiments 1 and 2 suggest that background slant had a significant influence on performance at all presentation times, consistent with the use of scene size and not retinal size. Experiment 3 shows that this finding cannot be explained by a 2D texture contrast effect. Experiment 4 indicates that contextual information learned across a block of trials could be an important factor in such visual search experiments. In spite of this finding, our results suggest that distance scaling may occur prior to 100 ms, and we find no clear evidence for explicit access to a retinal representation of size.

13.
Visual scene memory and the guidance of saccadic eye movements.
Melcher D, Kowler E. Vision Research, 2001, 41(25-26): 3597-3611.
An unresolved question is how much information can be remembered from visual scenes when they are inspected with saccadic eye movements. Subjects used saccadic eye movements to scan a computer-generated scene and afterwards recalled as many objects as they could. Scene memory was quite good: it improved with display duration, it persisted long after the display was removed, and it continued to accumulate with additional viewings of the same display (Melcher, D. (2001). The persistence of memory for scenes. Nature, 412, 401). The occurrence of saccadic eye movements was important for good recall performance, even though subjects often recalled non-fixated objects. Inter-saccadic intervals increased with display duration, showing an influence of duration on global scanning strategy. The choice of saccadic target was predicted by a Random Selection with Distance Weighting (RSDW) model, in which the target for each saccade is selected at random from all available objects, weighted according to distance from fixation, regardless of which objects had previously been fixated. The results show that the visual memory reflected in the recall reports was not utilized for the immediate decision about where to look in the scene. Visual memory can be excellent, but it is not always reflected in oculomotor measures, perhaps because the cost of rapid on-line memory retrieval is too great.
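The RSDW rule is easy to state as code. The sketch below follows the abstract's description; the exponential form of the distance weighting is an assumption of mine, since the abstract specifies only that weight falls off with distance from fixation.

```python
import numpy as np

rng = np.random.default_rng(0)

def rsdw_next_target(objects, fixation, beta=0.05):
    """Random Selection with Distance Weighting: choose the next saccade
    target at random from ALL objects (fixated before or not), with
    probability decreasing with distance from the current fixation."""
    objects = np.asarray(objects, dtype=float)
    dists = np.linalg.norm(objects - np.asarray(fixation, dtype=float), axis=1)
    weights = np.exp(-beta * dists)  # assumed decay function
    idx = rng.choice(len(objects), p=weights / weights.sum())
    return idx, tuple(objects[idx])

# Toy scene: 10 objects scattered in a 100x100 display.
objs = rng.uniform(0, 100, size=(10, 2))
fix = (50.0, 50.0)
for _ in range(5):
    idx, fix = rsdw_next_target(objs, fix)
    print(f"saccade to object {idx} at {np.round(fix, 1)}")
```

Note that nothing in the rule consults memory for previously fixated objects, which is exactly the point of the model.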

14.
Visual cognition depends critically on the moment-to-moment orientation of gaze. To change gaze to a new location in space, that location must be computed and used by the oculomotor system. One of the most common sources of information for this computation is the visual appearance of an object. A crucial question is: how is the appearance information contained in the photometric array converted into a target position? This paper proposes a model that accomplishes this calculation. The model uses iconic scene representations derived from oriented spatiochromatic filters at multiple scales. Visual search for a target object proceeds in a coarse-to-fine fashion, with the target's largest-scale filter responses being compared first. Task-relevant target locations are represented as saliency maps, which are used to program eye movements. A central feature of the model is that it separates the targeting process, which changes gaze, from the decision process, which extracts information at or near the new gaze point to guide behavior. The model provides a detailed explanation for the center-of-gravity saccades observed in many previous experiments. In addition, the model's targeting performance has been compared with the eye movements of human subjects under identical conditions in natural visual search tasks. The results show good agreement both quantitatively (the search paths are strikingly similar) and qualitatively (the fixations of false targets are comparable).
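A compact sketch of the coarse-to-fine matching step, assuming (as a simplification of the multi-scale filter bank) that the coarse representation is just a blurred copy of target and scene at the same resolution:

```python
import numpy as np

def ncc_map(image, template):
    """Normalized cross-correlation of `template` at every valid
    position of a 2-D grayscale `image`."""
    th, tw = template.shape
    t = template - template.mean()
    tnorm = np.sqrt((t ** 2).sum())
    out = np.zeros((image.shape[0] - th + 1, image.shape[1] - tw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            p = image[y:y + th, x:x + tw]
            p = p - p.mean()
            d = tnorm * np.sqrt((p ** 2).sum())
            out[y, x] = (t * p).sum() / d if d > 0 else 0.0
    return out

def blur(a, n=2):
    """Crude box blur, repeated n times: a stand-in for the
    coarse-scale (low-pass) filter responses."""
    for _ in range(n):
        a = (a + np.roll(a, 1, 0) + np.roll(a, -1, 0)
               + np.roll(a, 1, 1) + np.roll(a, -1, 1)) / 5.0
    return a

def coarse_to_fine_target(image, target, shortlist=5):
    """Compare coarse responses everywhere, shortlist the best peaks
    (a saliency map of candidate gaze targets), then rescore only the
    shortlisted peaks with the fine template."""
    coarse = ncc_map(blur(image.copy()), blur(target.copy()))
    order = np.argsort(coarse, axis=None)[::-1][:shortlist]
    peaks = [np.unravel_index(i, coarse.shape) for i in order]
    th, tw = target.shape
    t = target - target.mean()
    def fine(yx):
        y, x = yx
        p = image[y:y + th, x:x + tw]
        p = p - p.mean()
        d = np.sqrt((t ** 2).sum() * (p ** 2).sum())
        return (t * p).sum() / d if d > 0 else 0.0
    return max(peaks, key=fine)  # next gaze target (top-left of match)

rng = np.random.default_rng(1)
scene = rng.random((60, 60))
target = scene[20:28, 30:38].copy()          # plant a known target
print(coarse_to_fine_target(scene, target))  # expected near (20, 30)
```

The separation between choosing where to look (the shortlist) and verifying what is there (the fine rescoring) mirrors the model's split between targeting and decision processes.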

15.
Daily activities require the constant searching and tracking of visual targets in dynamic and complex scenes. Classic work assessing visual search performance has been dominated by the use of simple geometric shapes, patterns, and static backgrounds. Recently, there has been a shift toward investigating visual search in more naturalistic dynamic scenes using virtual reality (VR)-based paradigms. In this direction, we have developed a first-person perspective VR environment combined with eye tracking for the capture of a variety of objective measures. Participants were instructed to search for a preselected human target walking in a crowded hallway setting. Performance was quantified based on saccade and smooth pursuit ocular motor behavior. To assess the effect of task difficulty, we manipulated factors of the visual scene, including crowd density (i.e., number of surrounding distractors) and the presence of environmental clutter. In general, results showed a pattern of worsening performance with increasing crowd density. In contrast, the presence of visual clutter had no effect. These results demonstrate how visual search performance can be investigated using VR-based naturalistic dynamic scenes and with high behavioral relevance. This engaging platform may also have utility in assessing visual search in a variety of clinical populations of interest.

16.
Planning sequences of saccades
Zingale CM, Kowler E. Vision Research, 1987, 27(8): 1327-1341.
Subjects used saccades to fixate a sequence of 1-5 stationary targets (separation = 90') located at the vertices of an imaginary pentagon. The latency of the first saccade in a sequence and the duration of intervals between subsequent saccades increased with sequence length at a rate of about 20 msec/target. Latency also varied with ordinal position in the sequence. These results were due neither to directional differences in saccadic latency nor to latency-accuracy or latency-precision trade-offs. Results were similar when targets were removed and saccades were directed to remembered locations. These effects may be best accounted for by models that have been proposed to explain similar effects of sequence length and ordinal position in other voluntary motor tasks, such as typing, speech and finger-tapping. In these models, motor programs for a sequence of responses are planned before execution and then retrieved from memory during execution. Such models are fundamentally different from traditional saccadic models in which visual error signals evoke saccades. Instead, we propose that saccades are controlled by an organized plan for an entire sequence of saccades; visual error signals may modify or elaborate the plan during execution of the sequence. Our proposal is consistent with ideas developed by Lashley [Cerebral Mechanisms in Behavior: The Hixon Symposium. Wiley, New York (1951)] in his general treatment of the central organization that determines voluntary motor performance.

17.
Human performance during visual search typically improves when spatial cues indicate the possible target locations. In many instances, the performance improvement is quantitatively predicted by a Bayesian or quasi-Bayesian observer in which visual attention simply selects the information at the cued locations, without changing the quality of processing or sensitivity, and ignores the information at the uncued locations. Aside from the generally good agreement between the effect of the cue on model and human performance, there has been little independent confirmation that humans are effectively selecting the relevant information. In this study, we used the classification-image technique to assess the effectiveness of spatial cues in the attentional selection of relevant locations and the suppression of irrelevant locations. Observers searched for a bright target among dimmer distractors; the target appeared (with 50% probability) in one of eight locations in visual white noise. The possible target location was indicated either by a single 100% valid box cue or by seven 100% invalid box cues, in which case the only potential target location was the uncued one. In both conditions, we found statistically significant perceptual templates, shaped as differences of Gaussians, at the relevant locations, and no perceptual templates at the irrelevant locations. We did not find statistically significant differences between the shapes of the inferred perceptual templates in the 100% valid and 100% invalid cue conditions. The results confirm that during search visual attention allows the observer to effectively select relevant information and ignore irrelevant information. The results for the 100% invalid cue condition suggest that the selection process is not drawn automatically to the cue but can be under the observer's voluntary control.
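The classification-image computation itself is simple: average the noise fields by response and subtract. Below is a one-dimensional toy, assuming a linear observer; the study's displays were two-dimensional search arrays, so this is a sketch of the technique, not of their experiment.

```python
import numpy as np

rng = np.random.default_rng(2)

n_trials, size = 5000, 16
# The simulated observer's true perceptual template (a Gaussian bump).
template = np.exp(-((np.arange(size) - size / 2) ** 2) / 8.0)
signal = np.zeros(size)
signal[size // 2] = 1.0  # target profile

noises = rng.normal(0.0, 1.0, (n_trials, size))
present = rng.random(n_trials) < 0.5
stimuli = noises + np.outer(present, signal)

# Linear-observer decision: "yes" when the template response exceeds
# a criterion (0.5 here is an arbitrary choice).
said_yes = stimuli @ template > 0.5

# Classification image: mean noise on "yes" trials minus mean noise on
# "no" trials; it recovers the shape of the template the observer used.
ci = noises[said_yes].mean(axis=0) - noises[~said_yes].mean(axis=0)
print(np.round(ci, 2))
```

Applied per location in a cueing experiment like this one, a significant template at relevant locations and a flat image at irrelevant ones is the signature of effective attentional selection.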

18.
Participants’ eye movements were monitored while they searched for a target among a varying number of distractors, either with or without a concurrent memory load. Consistent with previous findings, adding a memory load slowed response times without affecting search slopes; a finding normally taken to imply that memory load affects pre- and/or post-search processes but not the search process itself. However, when overall response times were decomposed, using eye-movement data, into pre-search (e.g., initial encoding), search, and post-search (e.g., response selection) phases, the analysis revealed that adding a memory load affected all phases, including the search phase. In addition, fixations selected under load were more likely to be distant from search items and more likely to be close to previously inspected locations. Thus, memory load affects the search process without affecting search slopes. These results challenge standard interpretations of search slopes and main effects in visual search.

19.
Researchers have investigated whether attentional capture during visual search is driven by top-down processes, i.e., experimental goals and directives, or by bottom-up processes, i.e., the properties of the items within a search display. Some research has demonstrated that subjects cannot avoid attending to a task-irrelevant salient item, such as a singleton distractor, even when the identity of the target item is known. Research has also shown that repeating the target feature across successive search displays primes the visual pop-out effect for a unique target (priming of pop-out). However, other research has shown that subjects can strategically guide their attention and may locate a target based on its uniqueness (a singleton search mode) or based on knowing and searching for the target feature (a feature search mode). When using the feature search mode, subjects are attuned to the specific target feature and are therefore less susceptible to singleton-distractor interference than when using the singleton search mode. Recent research has compared singleton-distractor interference for targets that are variable and uncertain across search displays to targets that are constant and certain. When the target is constant, subjects can use a feature search mode and should theoretically show less singleton-distractor interference than when targets are variable and a singleton search mode must be used. Indeed, variable targets have historically produced greater singleton-distractor interference than constant targets, even when the target feature was repeated. However, the current experiments found that singleton-distractor interference was no greater for variable targets than for constant targets when targets and nontargets did not share shapes across search displays.

20.
Mitchell JF, Zipser D. Vision Research, 2003, 43(25): 2669-2695.
We present a neural model of the frontal eye fields. It consists of several retinotopic arrays of neuron-like units that are recurrently connected. The network is trained to make memory-guided saccades to sequentially flashed targets that appear at arbitrary locations. This task is interesting because the large number of possible sequences does not permit a pre-learned response. Instead locations and their priority must be maintained in active working memory. The network learns to perform the task. Surprisingly, after training it can also select targets in visual search tasks. When targets are shown in parallel it chooses them according to their salience. Its search behavior is comparable to that of humans. It exhibits saccadic averaging, increased reaction times with more distractors, latency vs accuracy trade-offs, and inhibition of return. Analysis of the network shows that it operates like a queue, storing the potential targets in sequence for later execution. A small number of unit types are sufficient to encode this information, but the manner of coding is non-obvious. Units respond to multiple targets similar to quasi-visual cells recently studied [Exp. Brain Res. 130 (2000) 433]. Predictions are made that can be experimentally tested.
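The queue-like behavior the analysis uncovered can be abstracted into a few lines. This toy is my abstraction of the described behavior, not the recurrent network itself: it selects stored targets in order of salience while inhibition of return blocks re-selection.

```python
import heapq

def plan_saccades(targets):
    """Pop stored targets in priority (salience) order; inhibition of
    return prevents any location from being selected twice."""
    queue = [(-salience, loc) for loc, salience in targets.items()]
    heapq.heapify(queue)  # max-heap via negated salience
    visited, plan = set(), []
    while queue:
        _, loc = heapq.heappop(queue)
        if loc not in visited:  # inhibition of return
            visited.add(loc)
            plan.append(loc)
    return plan

# Targets flashed in parallel, keyed by location, valued by salience.
print(plan_saccades({(3, 4): 0.9, (7, 1): 0.4, (2, 8): 0.7}))
# -> [(3, 4), (2, 8), (7, 1)]
```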
