Similar Articles
20 similar articles found (search time: 62 ms)
1.
Interesting objects are visually salient
How do we decide which objects in a visual scene are more interesting? While intuition may point toward high-level object recognition and cognitive processes, here we investigate the contributions of a much simpler process, low-level visual saliency. We used the LabelMe database (24,863 photographs with 74,454 manually outlined objects) to evaluate how often interesting objects were among the few most salient locations predicted by a computational model of bottom-up attention. In 43% of all images (chance 21%), the model's most salient predicted location fell within a labeled region. Furthermore, in 76% of the images (chance 43%), one or more of the top three salient locations fell on an outlined object, with performance leveling off after six predicted locations. The bottom-up attention model has no notion of object or semantic relevance. Hence, our results indicate that selecting interesting objects in a scene is largely constrained by low-level visual properties rather than solely determined by higher cognitive processes.
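The evaluation described above, checking whether any of a model's top-k salient locations lands inside a manually outlined object, can be sketched as follows. This is a minimal illustration with hypothetical saliency maps and binary object masks; the function names and array shapes are our own assumptions, not the authors' code:

```python
import numpy as np

def topk_hits(saliency_map, object_mask, k=3):
    """True if any of the k most salient pixels falls inside
    a labeled object region (binary mask)."""
    top_idx = np.argsort(saliency_map.ravel())[::-1][:k]
    return bool(object_mask.ravel()[top_idx].any())

def hit_rate(saliency_maps, object_masks, k=3):
    """Fraction of images for which a top-k salient location
    lands on an outlined object."""
    hits = [topk_hits(s, m, k) for s, m in zip(saliency_maps, object_masks)]
    return float(np.mean(hits))
```

The chance baselines (21% and 43%) would come from scoring the same masks against randomly sampled locations rather than model predictions.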

2.
Saliency models of eye guidance during scene perception suggest that attention is drawn to visually conspicuous areas of high visual salience. Although such low-level visual processes influence the allocation of attention, higher-level information gained from scene knowledge may also control eye movements. This is supported by eye-tracking studies demonstrating that scene-inconsistent objects are often fixated earlier than their consistent counterparts. Using a change blindness paradigm, changes were made to objects that were either consistent or inconsistent with the scene and that had been objectively measured as having high or low visual salience. Change detection speed and accuracy for objects with high visual salience did not differ from those with low visual salience. However, changes to scene-inconsistent objects were detected faster and more accurately than changes to scene-consistent objects, for both high and low visually salient objects. We conclude that the scene-inconsistent change detection advantage is a true top-down effect: it is not confounded by low-level visual factors, and it may indeed override such factors when viewing complex naturalistic scenes.

3.
An influential theory suggests that integrated objects, rather than individual features, are the fundamental units that limit our capacity to temporarily store visual information (S. J. Luck & E. K. Vogel, 1997). Using a paradigm that independently estimates the number and precision of items stored in working memory (W. Zhang & S. J. Luck, 2008), here we show that the storage of features is not cost-free. The precision and number of objects held in working memory were estimated when observers had to remember the color, the orientation, or both the color and orientation of simple objects. We found that while the quantity of stored objects was largely unaffected by increasing the number of features, the precision of these representations dramatically decreased. Moreover, this selective deterioration in object precision depended on the multiple features being contained within the same objects. Such fidelity costs were observed even with change detection paradigms when those paradigms placed demands on the precision of the stored visual representations. Taken together, these findings not only demonstrate that the maintenance of integrated features is costly; they also suggest that objects and features affect visual working memory capacity differently.
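The Zhang & Luck (2008) paradigm cited above models the distribution of report errors on a continuous-report task as a mixture of a von Mises component (items in memory, whose concentration kappa indexes precision) and a uniform component (guesses). A rough sketch of that mixture likelihood, with a deliberately crude grid-search fit standing in for the original maximum-likelihood optimization (the grid bounds and step sizes are our own arbitrary choices):

```python
import numpy as np

def mixture_loglik(errors, p_mem, kappa):
    """Log-likelihood of circular response errors (radians) under the
    mixture model: with probability p_mem the item is in memory and
    errors follow a von Mises centered on the true value (concentration
    kappa = precision); otherwise the response is a uniform guess."""
    vm = np.exp(kappa * np.cos(errors)) / (2 * np.pi * np.i0(kappa))
    guess = 1.0 / (2 * np.pi)
    return float(np.sum(np.log(p_mem * vm + (1 - p_mem) * guess)))

def fit_grid(errors):
    """Deliberately crude grid search for (p_mem, kappa)."""
    best = (-np.inf, None, None)
    for p in np.linspace(0.05, 1.0, 20):
        for k in np.linspace(0.5, 30.0, 60):
            ll = mixture_loglik(errors, p, k)
            if ll > best[0]:
                best = (ll, p, k)
    return best[1], best[2]
```

Under this model, the paper's result corresponds to kappa dropping (precision falling) while p_mem stays roughly constant as more features per object must be stored.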

4.
Pomplun M. Vision Research, 2006, 46(12):1886-1900
Visual search is a fundamental and routine task of everyday life. Studying visual search promises to shed light on the basic attentional mechanisms that facilitate visual processing. To investigate visual attention during search processes, numerous studies measured the selectivity of observers' saccadic eye movements for local display features. These experiments almost entirely relied on simple, artificial displays with discrete search items and features. The present study employed complex search displays and targets to examine task-driven (top-down) visual guidance by low-level features under more natural conditions. Significant guidance by local intensity, contrast, spatial frequency, and orientation was found, and its properties such as magnitude and resolution were analyzed across dimensions. Moreover, feature-ratio effects were detected, which correspond to distractor-ratio effects in simple search displays. These results point out the limitations of current purely stimulus-driven (bottom-up) models of attention during scene perception.

5.
Humans can remember many scenes long after brief presentation. Do scene understanding and encoding require visual selective attention, or do they occur even when observers are engaged in other visual tasks? We showed observers scene or texture images while they performed a visual search task, an auditory detection task, or no concurrent task. Concurrent tasks interfered with memory for both image types, and visual search interfered more than auditory detection did, even when the two tasks were equally difficult. The same pattern of results was obtained whether the concurrent task was presented during the encoding or the consolidation phase. We conclude that visual attention modulates picture memory performance; we found no aspect of picture memory to be independent of attentional demands.

6.
Are locations or colors more effective cues in biasing attention? We addressed this question with a visual search task that featured an associative priming manipulation. The observers indicated which target appeared in a search array. Unknown to them, one target appeared at the same location more often and a second target appeared in the same color more often. Both location and color biases facilitated performance, but location biases benefited the selection of all targets, whereas color biases only benefited the associated target letter. The generalized benefit of location biases suggests that locations are more effective cues to attention.

7.
Trans-saccadic memory consists of keeping track of objects' locations and features across saccades; pre-saccadic information is remembered and compared with post-saccadic information. It has been shown to have limited resources and to involve attention in the selection of objects and features. In support, a previous study showed that recognition of distinct post-saccadic objects in the visual scene is impaired when pre-saccadic objects are relevant and thus already encoded in memory (Poth, Herwig, & Schneider, 2015). Here, we investigated the inverse, i.e. how the memory of pre-saccadic objects is affected by abrupt but irrelevant changes in the post-saccadic visual scene. We also modulated the amount of attention to the relevant pre-saccadic object by having participants make a saccade either to it or elsewhere, and observed that pre-saccadic attentional facilitation affected how much post-saccadic changes disrupted trans-saccadic memory of pre-saccadic objects.
Participants identified a flashed symbol (d, b, p, or q, among distracters) at one of six placeholders (figure "8"s) arranged in a circle around fixation while planning a saccade to one of them. They reported the identity of the symbol after the saccade. In Experiment 1, we changed the post-saccadic scene by removing the entire scene, only the placeholder where the pre-saccadic symbol was presented, or all other placeholders except this one. Identification performance was reduced when only the saccade-target placeholder disappeared after the saccade. In Experiment 2, we changed one placeholder location (an inward/outward shift or a rotation relative to the saccade vector) after the saccade and observed that identification performance decreased with increasing shift/rotation of the saccade-target placeholder.
We conclude that pre-saccadic memory is disrupted by abrupt, attention-capturing post-saccadic changes of the visual scene, particularly when these changes involve the object prioritized as the goal of a saccade. These findings support the notion that limited trans-saccadic memory resources are disrupted when object correspondence at the saccade goal is broken through removal or location change.

8.
Saccadic eye movements and perceptual attention work in a coordinated fashion to allow selection of the objects, features or regions with the greatest momentary need for limited visual processing resources. This study investigates perceptual characteristics of pre-saccadic shifts of attention during a sequence of saccades using the visual manipulations employed to study mechanisms of attention during maintained fixation. The first part of this paper reviews studies of the connections between saccades and attention, and their significance for both saccadic control and perception. The second part presents three experiments that examine the effects of pre-saccadic shifts of attention on vision during sequences of saccades. Perceptual enhancements at the saccadic goal location relative to non-goal locations were found across a range of stimulus contrasts, with either perceptual discrimination or detection tasks, with either single or multiple perceptual targets, and regardless of the presence of external noise. The results show that the preparation of saccades can evoke a variety of attentional effects, including attentionally-mediated changes in the strength of perceptual representations, selection of targets for encoding in visual memory, exclusion of external noise, or changes in the levels of internal visual noise. The visual changes evoked by saccadic planning make it possible for the visual system to effectively use saccadic eye movements to explore the visual environment.

9.
Tanaka Y, Sagi D. Vision Research, 2000, 40(9):1089-1100
Low-contrast visual stimuli have been found to produce a memory trace, enhancing subsequent target detection for as long as 16 s. Here we show that the memory trace depends on dynamic interactions between low-level stimulus properties and a higher-level gating process. Detection of vertical targets (Gabor signals) was enhanced by preceding vertical Gabor primes but suppressed by preceding tilted primes, pointing to a competitive process of dynamic resource allocation. The priming effect also depended on a temporal cue activating a sensory gating process, with maximal effect at delays of 300-500 ms. The results suggest a two-step process in which attention affects the transition between perception and memory: a non-selective gating process followed by competition between overlapping representations.

10.
Traditional memory research has focused on identifying separate memory systems and exploring different stages of memory processing. This approach has been valuable for establishing a taxonomy of memory systems and characterizing their function, but it has been less informative about the nature of stored memory representations. Recent research on visual memory has shifted toward a representation-based emphasis, focusing on the contents of memory and attempting to determine the format and structure of remembered information. The main thesis of this review is that one cannot fully understand memory systems or memory processes without also determining the nature of memory representations. Nowhere is this connection more obvious than in research that attempts to measure the capacity of visual memory. We review research on the capacity of visual working memory and visual long-term memory, highlighting recent work that emphasizes the contents of memory. This focus impacts not only how we estimate the capacity of the system (going beyond counting how many items can be remembered, toward structured representations) but also how we model memory systems and memory processes.

11.
To understand the neural mechanisms underlying humans' exquisite ability to process briefly flashed visual scenes, we present a computer model that predicts human performance in a Rapid Serial Visual Presentation (RSVP) task. The model processes streams of natural scene images presented to human observers at a rate of 20 Hz, and attempts to predict when subjects will correctly detect whether one of the presented images contains an animal (target). We find that metrics of Bayesian surprise, which model both spatial and temporal aspects of human attention, differ significantly between RSVP sequences on which subjects detect the target (easy) and those on which subjects miss the target (hard). Extending beyond previous studies, we here assess the contribution of individual image features, including color opponencies and Gabor edges, and investigate the effects of the spatial location of surprise in the visual field rather than using only a single aggregate measure. A physiologically plausible feed-forward system, which optimally combines spatial and temporal surprise metrics for all features, correctly predicts performance on 79.5% of human trials, significantly better than a baseline maximum-likelihood Bayesian model (71.7%). Attention, as measured by surprise, thus accounts for a large proportion of observer performance in RSVP. The time course of surprise in different feature types (channels) provides additional quantitative insight into rapid bottom-up processes of human visual attention and recognition, and illuminates the phenomena of attentional blink and lag-1 sparing. Surprise also reveals classical Type-B-like masking effects intrinsic to natural-image RSVP sequences. We summarize these findings in a discussion of a multistage model of visual attention.
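Bayesian surprise (Itti & Baldi) quantifies how much a new observation changes the observer's belief, as the KL divergence between the posterior and the prior. The full model runs such updates over spatial feature maps with Poisson/Gamma beliefs; the one-dimensional Gaussian sketch below is a simplification of ours, intended only to show the mechanism:

```python
import numpy as np

def kl_gauss(mu_post, var_post, mu_prior, var_prior):
    """Bayesian surprise as KL(posterior || prior) for 1-D Gaussians (nats)."""
    return 0.5 * (np.log(var_prior / var_post)
                  + (var_post + (mu_post - mu_prior) ** 2) / var_prior
                  - 1.0)

def surprise_over_time(samples, var_obs=1.0, var0=10.0):
    """Sequentially update a Gaussian belief about a feature value and
    record the surprise each new observation elicits."""
    mu, var = 0.0, var0
    surprises = []
    for x in samples:
        # conjugate Gaussian update with known observation noise var_obs
        var_new = 1.0 / (1.0 / var + 1.0 / var_obs)
        mu_new = var_new * (mu / var + x / var_obs)
        surprises.append(kl_gauss(mu_new, var_new, mu, var))
        mu, var = mu_new, var_new
    return surprises
```

Repeated identical inputs yield decaying surprise, while an unexpected input produces a spike: the temporal signature that distinguishes easy from hard RSVP sequences in the study.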

12.
Lee D, Quessy S. Vision Research, 2003, 43(13):1455-1463
We tested in monkeys whether searching for targets in a familiar scene leads to improved performance. Search performance improved for a familiar scene when target locations were always randomized. However, when target locations repeatedly followed a predictable sequence, the improvement for a familiar scene was manifested only for targets presented in a familiar sequence, suggesting that scene memory can be masked by the learning of target sequences. These results suggest that information about a visual scene can facilitate visual search performance, and that this memory is coupled to the learned sequence of target locations.

13.
When storing multiple objects in visual working memory, observers sometimes misattribute perceived features to incorrect locations or objects. These misattributions, called binding errors (or swaps), have previously been demonstrated mostly with simple objects whose features are easy to encode independently and arbitrarily paired, such as colors and orientations. Here, we tested whether similar swaps can occur with real-world objects, where the connection between features is meaningful rather than arbitrary. In Experiments 1 and 2, observers were simultaneously shown four items from two object categories. Within a category, the two exemplars could be presented in either the same or different states (e.g., open/closed; full/empty). After a delay, both exemplars from one of the categories were probed, and participants had to recognize which exemplar went with which state. We found good memory for state information and exemplar information on their own, but a significant memory decrement for exemplar-state combinations, suggesting that binding was difficult for observers and that swap errors occurred even for meaningful real-world objects. In Experiment 3, we used the same task, but in half of the trials the locations of the exemplars were swapped at test. Errors were more frequent overall when the locations of exemplars were swapped. We conclude that the internal features of real-world objects are not perfectly bound in working memory, and that location updates impair object and feature representations. Overall, we provide evidence that even real-world objects are not stored in an entirely unitized format in working memory.

14.
This work examines how context may influence the detection of changes in flickering scenes. Each scene contained two changes that were matched for low-level visual salience. One of the changes was of high interest to the meaning of the scene, and the other was of lower interest. High-interest changes were more readily detected. To further examine the effects of contextual significance, we inverted the scene orientation to disrupt top-down effects of global context while controlling for contributions of visual salience. In other studies, inverting scene orientation has had inconsistent effects on detection of high-interest changes. However, this experiment demonstrated that inverting scene orientation significantly reduced the advantage for high-interest changes in comparison to lower-interest changes. Thus, scene context influences the deployment of attention and change-detection performance, and this top-down influence may be disrupted by scene inversion.

15.
Gobell JL, Tseng CH, Sperling G. Vision Research, 2004, 44(12):1273-1296
We use a novel search task to investigate the spatial distribution of visual attention, developing a general model from the data. Observers distribute attention to locations defined by stripes with a high penalty for attention to intervening areas. Attended areas are defined by a square-wave grating. A target is in one of the even stripes, and ten false targets (identical to the real target) are in the odd stripes; the observer must attend the even stripes and strongly ignore the odd, reporting the location of the target. As the spatial frequency of the grating increases, performance declines. Variations on this task inform a model that incorporates stimulus input, a "low pass" attentional modulation transfer function, and an acuity function to produce a strength map from which the location with the highest strength is selected. A feature-strength map that adds to the attention map enables the model to predict the results of attention-cued conjunction search experiments, and internal noise enables it to predict the outcome of double-pass experiments and of variations in the number of false targets. The model predicted performance on a trial-by-trial basis for three observers, accounting for approximately 70% of the trials. Actual trial-to-trial variation for an observer, using the double-pass method, is about 76%. For any requested distribution of spatial attention, this general model makes a prediction of the actually achieved distribution.
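The model's core pipeline, a requested square-wave attention distribution passed through a low-pass attentional MTF and multiplied with the stimulus to give a strength map whose maximum is reported, can be caricatured in a few lines. This is a 1-D toy with a Gaussian blur standing in for the MTF and no acuity function or internal noise; every parameter value is illustrative, not a fitted value from the paper:

```python
import numpy as np

def lowpass(profile, sigma=2.0):
    """Gaussian blur of the requested attention profile: a discrete
    stand-in for the model's low-pass attentional modulation transfer
    function (MTF)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2 * sigma ** 2))
    kernel /= kernel.sum()
    return np.convolve(profile, kernel, mode='same')

def select_location(stimulus, requested_attention, sigma=2.0):
    """Strength map = stimulus input x achieved (low-passed) attention;
    the model reports the location of maximum strength."""
    achieved = lowpass(requested_attention, sigma)
    return int(np.argmax(stimulus * achieved))
```

As the grating's spatial frequency rises, the blurred attention profile flattens, so false targets in the to-be-ignored stripes begin to win the maximum, mirroring the performance decline the authors report.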

16.
Previous change blindness studies have failed to address the importance of balancing low-level visual salience when producing experimental stimuli for a change detection task. Prior results suggesting that top-down processes influence change detection may therefore be contaminated by low-level saliency differences in the stimuli used. Here we present a novel technique for generating semi-automated, saliency-balanced modifications to a scene, handled by a genetic algorithm coupled with a computational model of bottom-up saliency. The saliency model obtains global saliency values for input images by analysing peaks in feature contrast maps. This quantification approach facilitates the generation of experimental stimuli using natural images and extends a recently investigated approach that used only low-level stimuli (Verma & McOwan, 2009). In this exemplar study, subjects were asked to detect changes in a flicker task containing the original scene image (A) and a synthesised modified version (A'). We find, under conditions where global saliency is balanced between A and A' as well as between all modifications (all instantiations of A'), that low-level saliency is indeed a reasonable estimator of change detection performance in comparison with high-level measures such as mouse-click densities. When the saliency of the changes is similar, addition/removal changes are detected more readily than colour changes to the scene.
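The pairing of a genetic algorithm with a saliency model, evolving a modification A' whose global saliency matches the original A, can be sketched as below. The real fitness function would render the modified scene and score it with the feature-contrast saliency model; `global_saliency` here is a toy stand-in we invented, and the population sizes and mutation rate are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)

def global_saliency(params):
    """Toy stand-in for the bottom-up saliency model's global score of
    a candidate modification (a vector of modification parameters)."""
    return float(np.sum(params ** 2))

def evolve_balanced_change(target_saliency, n_genes=3, pop=30, gens=60):
    """Genetic algorithm: evolve modification parameters whose global
    saliency matches the original scene's (fitness = -|difference|)."""
    population = rng.normal(0.0, 1.0, (pop, n_genes))
    for _ in range(gens):
        fitness = -np.abs([global_saliency(p) - target_saliency
                           for p in population])
        parents = population[np.argsort(fitness)[-pop // 2:]]     # selection
        children = parents + rng.normal(0.0, 0.1, parents.shape)  # mutation
        population = np.vstack([parents, children])
    diffs = np.abs([global_saliency(p) - target_saliency for p in population])
    return population[np.argmin(diffs)]
```

Keeping the parents alongside their mutated children (elitism) guarantees the best saliency match never regresses between generations.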

17.
For decades, working memory (WM) has been a heated research topic in cognitive psychology. However, most studies of WM have presented visual stimuli on a two-dimensional plane, rarely involving depth perception. Several previous studies have investigated how depth information is stored in WM and found that WM for depth is even more limited in capacity, and its performance poorer, than planar visual WM. In the present study, we used a change detection task to investigate whether dissociating memory items by different visual features, thereby increasing their perceptual separateness, can improve WM performance for depth. Memory items presented at various depth planes were bound with different colors (Experiments 1 and 3) or sizes (Experiment 2). Memory performance for the depth locations of visual stimuli with homogeneous and heterogeneous appearances was tested and compared. The results showed a consistent pattern: although separating items by feature value did not affect overall memory performance, the manipulation significantly improved memory for the middle depth locations but impaired it for the boundary locations when observers fixated the center of the depth volume. The memory benefits of feature separation can be attributed to enhanced individuation of memory items, facilitating a more balanced allocation of attention and memory resources.

18.
The representation of individual planar locations and features stored in working memory can be affected by the average representation. However, less is known about how the average representation affects the short-term storage of depth information. To evaluate the possible different roles of the ensemble average in working memory for planar and depth information, we used mathematical models to fit the data collected from one study on working memory for depth and 12 studies on working memory for planar information. The pattern of recalled depth was well captured by models assuming that there was a probability of reporting the average depth instead of the individual depth, compressing the recalled front-back distance of the stimulus ensemble compared to the perceived distance. However, when modeling the recalled planar information, we found that participants tended to report individual nontarget features when the target was not memorized, and the assumption of reporting average information improves the data fitting only in very few studies. These results provide evidence for our hypothesis that average depth information can be used as a substitution for individual depth information stored in working memory, but for planar visual features, the substitution of target with the average works under a constraint that the average of to-be-remembered features is readily accessible.
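The key modeling claim, that reports of individual depths are sometimes replaced by the ensemble average, which compresses the recalled front-back range, can be illustrated with a small simulation. The substitution probability, noise level, and depth values below are arbitrary choices of ours, not fitted estimates from the studies:

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_recall(depths, p_avg=0.3, noise_sd=0.05, n_trials=4000):
    """Simulate recall when, with probability p_avg, the ensemble-average
    depth is reported in place of an item's own depth (plus Gaussian
    memory noise); returns the mean recalled depth per item."""
    mean_depth = float(np.mean(depths))
    recalled = np.empty((n_trials, len(depths)))
    for t in range(n_trials):
        use_avg = rng.random(len(depths)) < p_avg
        base = np.where(use_avg, mean_depth, depths)
        recalled[t] = base + rng.normal(0.0, noise_sd, len(depths))
    return recalled.mean(axis=0)

# Average-substitution compresses the recalled front-back range:
depths = np.array([0.2, 0.5, 0.8])   # front, middle, back (arbitrary units)
recalled = simulate_recall(depths)
```

With substitution probability p_avg, each item's mean report is pulled toward the ensemble mean by a factor of (1 - p_avg), shrinking the recalled span relative to the presented one.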

19.
During viewing of natural scenes, do low-level features guide attention, and if so, does this depend on higher-level features? To answer these questions, we studied how the effects of low-level feature modifications depend on image category. Subjects often fixated contrast-modified regions in natural scene images, while smaller but significant effects were observed for urban scenes and faces. Surprisingly, modifications in fractal images did not influence fixations. Further analysis revealed an inverse relationship between modification effects and higher-level, phase-dependent image features. We suggest that high- and mid-level features, such as edges, symmetries, and recursive patterns, guide attention when present. However, if the scene lacks such diagnostic properties, low-level features prevail. We posit a hierarchical framework that combines aspects of bottom-up and top-down theories and is compatible with our data.

20.
Liu K, Jiang Y. Journal of Vision, 2005, 5(7):650-658
Previous studies have painted a conflicting picture of the amount of visual information humans can extract from briefly viewing a natural scene. Although some studies suggest that a single glimpse is sufficient to put about five visual objects in memory, others find that not much is retained in visual memory even after prolonged viewing. Here we tested subjects' visual working memory (VWM) for a briefly viewed scene image. A sample scene was presented for 250 ms and masked, followed 1000 ms later by a comparison display. We found that subjects remembered fewer than one sample object. Increasing the viewing duration to about 15 s significantly enhanced performance, with approximately five visual objects remembered. We suggest that adequate encoding of a scene into VWM requires a long duration, and that visual details can accumulate in memory provided that the viewing duration is sufficiently long.
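Capacity figures like "fewer than one object" or "approximately five" are conventionally derived from change-detection accuracy with Cowan's K formula, K = N * (hit rate - false-alarm rate). The abstract does not state which estimator the authors used, so treat the following as the standard convention rather than their exact method:

```python
def cowan_k(n_items, hit_rate, false_alarm_rate):
    """Cowan's K: estimated number of items held in working memory,
    from single-probe change-detection hits and false alarms with
    n_items in the sample display."""
    return n_items * (hit_rate - false_alarm_rate)
```

For example, with six items in the display, 90% hits and 10% false alarms yield K = 4.8 items; performance at chance (hits equal to false alarms) yields K = 0.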


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号