Similar Articles
20 similar articles found
1.
Hwang AD, Wang HC, Pomplun M. Vision Research, 2011, 51(10): 1192-1205
The perception of objects in our visual world is influenced not only by low-level visual features such as shape and color, but also by high-level features such as meaning and the semantic relations among objects. While low-level features in real-world scenes have been shown to guide eye movements during scene inspection and search, the influence of semantic similarity among scene objects on eye movements in such situations has not been investigated. Here we study the guidance of eye movements by semantic similarity among objects during real-world scene inspection and search. By selecting scenes from the LabelMe object-annotated image database and applying latent semantic analysis (LSA) to the object labels, we generated semantic saliency maps of real-world scenes based on the semantic similarity of scene objects to the currently fixated object or the search target. An ROC analysis of these maps as predictors of subjects’ gaze transitions between objects during scene inspection revealed a preference for transitions to objects that were semantically similar to the currently inspected one. Furthermore, during the course of a scene search, subjects’ eye movements were progressively guided toward objects that were semantically similar to the search target. These findings demonstrate substantial semantic guidance of eye movements in real-world scenes and show its importance for understanding real-world attentional control.
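
A minimal sketch of the LSA step described above, assuming a generic corpus of per-scene label lists rather than the authors' full LabelMe pipeline (the corpus, the component count, and the helper names are illustrative assumptions):

    # Sketch: embed object labels in a latent semantic space, then score
    # every scene object by its cosine similarity to the fixated object.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.metrics.pairwise import cosine_similarity

    # Hypothetical corpus: one "document" per scene, listing its object labels.
    label_corpus = [
        "table chair lamp book shelf",
        "sink faucet towel mirror soap",
        "car road sign tree pedestrian",
    ]

    tfidf = TfidfVectorizer()
    X = tfidf.fit_transform(label_corpus)       # label weights per scene
    lsa = TruncatedSVD(n_components=2).fit(X)   # latent semantic space
    term_vecs = lsa.components_.T               # one vector per object label
    vocab = tfidf.vocabulary_                   # label -> vector row

    def semantic_saliency(fixated_label, scene_labels):
        """Cosine similarity of each scene object to the fixated object."""
        f = term_vecs[vocab[fixated_label]].reshape(1, -1)
        return {lab: float(cosine_similarity(f, term_vecs[vocab[lab]].reshape(1, -1))[0, 0])
                for lab in scene_labels}

    print(semantic_saliency("table", ["chair", "lamp", "car"]))

Painting each object's image region with its similarity score would then yield the semantic saliency map used in the ROC analysis.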

2.
Semantic information is important in eye movement control. An important semantic influence on gaze guidance relates to object-scene relationships: objects that are semantically inconsistent with the scene attract more fixations than consistent objects. One interpretation of this effect is that fixations are driven toward inconsistent objects because they are semantically more informative. We tested this explanation using contextualized meaning maps, a method based on crowd-sourced ratings that quantifies the spatial distribution of context-sensitive “meaning” in images. In Experiment 1, we compared gaze data and contextualized meaning maps for images in which object-scene consistency was manipulated. Observers fixated more on inconsistent than on consistent objects. However, contextualized meaning maps did not assign higher meaning to image regions that contained semantic inconsistencies. In Experiment 2, a large number of raters evaluated image regions that were deliberately selected for their content and expected meaningfulness. The results suggest that the same scene locations were experienced as slightly less meaningful when they contained inconsistent rather than consistent objects. In summary, we demonstrated that, in the context of our rating task, semantically inconsistent objects are experienced as less meaningful than their consistent counterparts, and that contextualized meaning maps do not capture prototypical influences of image meaning on gaze guidance.
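
The general mechanics of a rating-based meaning map can be sketched as follows; this illustrates the idea of aggregating crowd-sourced patch ratings into a spatial map, not the published procedure, and the image size, patch spacing, and smoothing width are assumptions:

    # Sketch: place each patch's mean "meaningfulness" rating at the patch
    # centre, then smooth to obtain a continuous spatial meaning map.
    import numpy as np
    from scipy.ndimage import gaussian_filter

    H, W = 600, 800      # image size in pixels (assumed)
    step = 50            # spacing of rated patch centres (assumed)
    rng = np.random.default_rng(0)
    ratings = rng.uniform(1, 6, size=(H // step, W // step))  # stand-in ratings

    meaning_map = np.zeros((H, W))
    for i in range(ratings.shape[0]):
        for j in range(ratings.shape[1]):
            meaning_map[i * step + step // 2, j * step + step // 2] = ratings[i, j]

    meaning_map = gaussian_filter(meaning_map, sigma=step)  # interpolate between centres
    meaning_map /= meaning_map.max()                        # normalise to [0, 1]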

3.
Visual search in natural scenes is a complex task relying on peripheral vision to detect potential targets and central vision to verify them. The segregation of these roles across the visual field has been established largely through on-screen experiments. We conducted a gaze-contingent experiment in virtual reality to test how the perceived roles of central and peripheral vision translate to more natural settings. The use of everyday scenes in virtual reality allowed us to study visual attention with a fairly ecological protocol that cannot be implemented in the real world. Central or peripheral vision was masked during visual search, with target objects selected according to scene semantic rules. Analyzing the resulting search behavior, we found that target objects that were not spatially constrained to a probable location within the scene impacted search measures negatively. Our results diverge from on-screen studies in that search performance was only slightly affected by central vision loss. In particular, a central mask did not impact verification times when the target was grammatically constrained to an anchor object. Our findings demonstrate that the role of central vision (up to 6 degrees of eccentricity) in identifying objects in natural scenes seems to be minor, while the role of peripheral preprocessing of targets in immersive real-world searches may have been underestimated by on-screen experiments.
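
The central-vision mask can be illustrated with a simplified 2D analogue of the gaze-contingent manipulation (the pixels-per-degree factor and the grey occluder are assumptions about display geometry, not the VR implementation):

    # Sketch: each frame, occlude a disc of fixed angular radius
    # around the current gaze position.
    import numpy as np

    PX_PER_DEG = 40        # assumed display resolution per visual degree
    MASK_RADIUS_DEG = 6    # central-vision mask radius, as in the study

    def apply_central_mask(frame, gaze_xy):
        """Grey out everything within MASK_RADIUS_DEG of the gaze point."""
        h, w = frame.shape[:2]
        ys, xs = np.ogrid[:h, :w]
        r_px = MASK_RADIUS_DEG * PX_PER_DEG
        inside = (xs - gaze_xy[0]) ** 2 + (ys - gaze_xy[1]) ** 2 <= r_px ** 2
        masked = frame.copy()
        masked[inside] = 128   # mid-grey occluder
        return masked

    frame = np.zeros((600, 800, 3), dtype=np.uint8)
    out = apply_central_mask(frame, gaze_xy=(400, 300))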

4.
How direction of illumination affects visually perceived surface roughness
We examined visual estimation of surface roughness using random, computer-generated, three-dimensional (3D) surfaces rendered under a mixture of diffuse lighting and a punctate source. The angle between the tangent to the plane containing the surface texture and the direction to the punctate source was varied from 50 to 70 deg across lighting conditions. Observers were presented with pairs of surfaces under different lighting conditions and indicated which 3D surface appeared rougher. Surfaces were viewed either in isolation or in scenes with added objects whose shading, cast shadows, and specular highlights provided information about the spatial distribution of illumination. All observers perceived surfaces to be markedly rougher with decreasing illuminant angle. Performance in scenes with added objects was no closer to constant than that in scenes without added objects. We identified four novel cues that are valid cues to roughness under any single lighting condition but that are not invariant under changes in lighting condition. We modeled observers' deviations from roughness constancy as a weighted linear combination of these "pseudocues" and found that they account for a substantial amount of observers' systematic deviations from roughness constancy with changes in lighting condition.
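
The pseudocue model amounts to an ordinary least-squares regression of constancy deviations on cue values; a minimal sketch with synthetic data (the cue matrix and weights are illustrative stand-ins for the four cues measured from the rendered surfaces):

    # Sketch: model deviations from roughness constancy as a weighted
    # linear combination of pseudocue values, and recover the weights.
    import numpy as np

    rng = np.random.default_rng(1)
    n_surfaces = 200
    cues = rng.normal(size=(n_surfaces, 4))      # four pseudocues per surface
    true_w = np.array([0.8, -0.3, 0.5, 0.1])     # illustrative weights
    deviation = cues @ true_w + rng.normal(scale=0.2, size=n_surfaces)

    # Ordinary least squares recovers the cue weights.
    w_hat, *_ = np.linalg.lstsq(cues, deviation, rcond=None)
    print(np.round(w_hat, 2))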

5.
There is accumulating evidence that scene context can guide and facilitate visual search (e.g., A. Torralba, A. Oliva, M. S. Castelhano, & J. M. Henderson, 2006). Previous studies utilized stimuli of restricted size, a fixed head position, and context defined by the global spatial configuration of the scene. Thus, it is unknown whether similar effects generalize to natural viewing environments and to context defined by local object co-occurrence. Here, with a mobile eye tracker, we investigated the effects of object co-occurrence on search performance under naturalistic conditions. Observers searched for low-visibility target objects on tables cluttered with everyday objects. Targets were located either adjacent to larger, more visible "cue" objects with which they regularly co-occur in natural scenes (expected condition) or elsewhere in the display, surrounded by unrelated objects (unexpected condition). Mean search times were shorter for targets at expected locations than at unexpected locations. Additionally, context guided eye movements: more fixations were directed toward cue objects than toward other non-target objects, particularly when the cue was contextually relevant to the current search target. These results could not be accounted for by image saliency models. We conclude that object co-occurrence can serve as a contextual cue that facilitates search and guides eye movements in natural environments.

6.
Culture shapes how people gather information from the visual world. We recently showed that Western observers focus on the eye region during face recognition, whereas Eastern observers predominantly fixate the center of faces, suggesting a more effective use of extrafoveal information by Easterners than by Westerners. However, cultural variation in eye movements during scene perception is a highly debated topic. Additionally, the extent to which these perceptual differences across observers from different cultures rely on modulations of extrafoveal information use remains to be clarified. We used a gaze-contingent technique designed to dynamically mask central vision, the Blindspot, during a visual search task for animals in natural scenes. We parametrically controlled the Blindspot and target animal sizes (0°, 2°, 5°, or 8°). We processed the eye-tracking data using an unbiased, data-driven approach based on fixation maps and introduced novel spatiotemporal analyses to finely characterize the dynamics of scene exploration. Both groups of observers, Eastern and Western, showed comparable animal identification performance, which decreased as a function of Blindspot size. Importantly, dynamic analysis of the exploration pathways revealed identical oculomotor strategies for both groups of observers during animal search in scenes. Culture does not impact extrafoveal information use during the ecologically valid visual search of animals in natural scenes.

7.
When we view a scene, we generally feel that we have a rich representation of it. Recent research has shown, however, that we are unable to detect relatively large changes in scenes, which suggests an inability to retain visual details from one scene view to the next. In the present study, we investigated whether we can retain and make use of global and semantic information from a scene in order to efficiently detect changes from one view to the next. Results indicated that change detection was practically independent of scene disruption, with one exception: better performance in the meaningful scenes was observed only in the whole-scene presentation condition, where participants knew that the stimulus was extracted from the meaningful scene.

8.
How we find what we are looking for in complex visual scenes is a seemingly simple ability that has taken half a century to unravel. The first study to use the term visual search showed that as the number of objects in a complex scene increases, observers’ reaction times increase proportionally (Green & Anderson, 1956). This observation suggests that our ability to process the objects in scenes is limited in capacity. However, if it is known that the target will have a certain feature attribute, for example, that it will be red, then only an increase in the number of red items increases reaction time. This observation suggests that we can control which visual inputs receive the benefit of our limited capacity to recognize objects, such as those defined by the color red, as the items we seek. The nature of the mechanisms that underlie these basic phenomena in the visual search literature has been more difficult to determine definitively. In this paper, I discuss how electrophysiological methods have provided the tools needed to understand the mechanisms that give rise to the effects observed in that first visual search paper. I begin by describing how recordings of event-related potentials from humans and nonhuman primates have shown how attention is deployed to possible target items in complex visual scenes. I then discuss how event-related potential experiments have allowed us to directly measure the memory representations that are used to guide these deployments of attention to items with target-defining features.

9.

Background

This study investigated whether realistic immersive conditions with dynamic indoor scenes presented on a large, hemispheric panoramic screen covering 180° of the visual field improved the visual search abilities of participants with age‐related macular degeneration (AMD).

Method

Twenty‐one participants with AMD, 16 age‐matched controls and 16 young observers were included. Realistic indoor scenes were presented on a panoramic five metre diameter screen. Twelve different objects were used as targets. The participants were asked to search for a target object, shown on paper before each trial, within a room composed of various objects. A joystick was used for navigation within the scene views. A target object was present in 24 trials and absent in 24 trials. The percentage of correct detection of the target, the percentage of false alarms (that is, the detection of the target when it was absent), the number of scene views explored and the search time were measured.

Results

Search times were longer for participants with AMD than for the age‐matched controls, who in turn were slower than the young participants. The participants with AMD were able to accomplish the task with 75 per cent correct detections, slightly lower than the age‐matched controls (79.2 per cent), while the young controls were at ceiling (91.7 per cent). Errors were mainly false alarms resulting from confusion between the target object and another object present in the scene in the target‐absent trials.

Conclusion

The outcomes of the present study indicate that, under realistic conditions, participants with AMD were able to accomplish visual search for objects with high accuracy, although more slowly than age‐matched, normally sighted controls.

10.
The direction in which people tend to move their eyes when inspecting images can reveal the different influences on eye guidance in scene perception, and their time course. We investigated biases in saccade direction during a memory-encoding task with natural scenes and computer-generated fractals. Images were rotated to disentangle egocentric and image-based guidance. Saccades in fractals were more likely to be horizontal, regardless of orientation. In scenes, the first saccade often moved down and subsequent eye movements were predominantly vertical, relative to the scene. These biases were modulated by the distribution of visual features (saliency and clutter) in the scene. The results suggest that image orientation, visual features and the scene frame-of-reference have a rapid effect on eye guidance.
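
The frame-of-reference logic, in which rotating the image separates egocentric from image-based direction biases, reduces to a change of coordinates; a small sketch with made-up fixation data:

    # Sketch: express saccade directions on the screen (egocentric frame)
    # and relative to the upright image (image frame) by subtracting the
    # rotation applied to the image.
    import numpy as np

    def saccade_angles(fix_x, fix_y):
        """Direction of each saccade, in degrees, from consecutive fixations."""
        dx, dy = np.diff(fix_x), np.diff(fix_y)
        return np.degrees(np.arctan2(dy, dx)) % 360

    fix_x = np.array([100.0, 220.0, 230.0, 180.0])
    fix_y = np.array([300.0, 305.0, 150.0, 148.0])
    image_rotation = 90.0                  # image shown rotated by 90 deg

    screen_frame = saccade_angles(fix_x, fix_y)            # egocentric bias
    image_frame = (screen_frame - image_rotation) % 360    # image-based bias
    print(screen_frame, image_frame)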

11.
The present study used change detection tasks to examine whether there is an advantage of a ground surface in representing visual scenes. In 6 experiments, a flicker paradigm (Experiments 1 through 4) or a one-shot paradigm (Experiments 5 and 6) was used to examine whether changes on a ground surface were easier to detect than changes on a ceiling surface. Overall, we found that: (1) there was an advantage in detecting changes on a ground surface or changes to objects on a ground surface; (2) this advantage was dependent on the presence of a coherent ground surface; (3) this advantage could propagate to objects connected to the ground surface through "nested" contact relations; (4) this advantage was mainly due to improved encoding rather than improved retrieval and comparison of the ground surface; and (5) this advantage was dependent on the presentation duration of the scene but not the number of objects presented in the scene. Together, these results suggest a unique role of the ground surface in organizing visual scenes.

12.
Visual scene memory and the guidance of saccadic eye movements
Melcher D, Kowler E. Vision Research, 2001, 41(25-26): 3597-3611
An unresolved question is how much information can be remembered from visual scenes when they are inspected by saccadic eye movements. Subjects used saccadic eye movements to scan a computer-generated scene and afterwards recalled as many objects as they could. Scene memory was quite good: it improved with display duration, it persisted long after the display was removed, and it continued to accumulate with additional viewings of the same display (Melcher, D. (2001). The persistence of memory for scenes. Nature, 412, 401). The occurrence of saccadic eye movements was important for ensuring good recall performance, even though subjects often recalled non-fixated objects. Inter-saccadic intervals increased with display duration, showing an influence of duration on global scanning strategy. The choice of saccadic target was predicted by a Random Selection with Distance Weighting (RSDW) model, in which the target for each saccade is selected at random from all available objects, weighted according to distance from fixation, regardless of which objects had previously been fixated. The results show that the visual memory reflected in the recall reports was not utilized for the immediate decision about where to look in the scene. Visual memory can be excellent, but it is not always reflected in oculomotor measures, perhaps because the cost of rapid on-line memory retrieval is too great.
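
A minimal sketch of the RSDW selection rule described in the abstract; the inverse-distance weighting below is an assumed fall-off, chosen only to make the distance dependence concrete:

    # Sketch: the next saccade target is drawn at random from *all*
    # objects, weighted by distance from the current fixation;
    # previously fixated objects are not excluded.
    import numpy as np

    rng = np.random.default_rng(2)

    def rsdw_next_target(objects_xy, fixation_xy):
        d = np.linalg.norm(objects_xy - fixation_xy, axis=1)
        w = 1.0 / np.maximum(d, 1e-6)    # nearer objects are more likely
        return rng.choice(len(objects_xy), p=w / w.sum())

    objects = np.array([[50.0, 40.0], [300.0, 200.0], [120.0, 90.0]])
    idx = rsdw_next_target(objects, fixation_xy=np.array([100.0, 80.0]))
    print("next saccade target:", idx)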

13.
Champion RA, Warren PA. Vision Research, 2008, 48(17): 1820-1830
In order to compute a representation of an object's size within a 3D scene, the visual system must scale retinal size by an estimate of the distance to the object. Evidence from size discrimination and visual search studies suggests that we have no access to the representation of retinal size when performing such tasks. In this study we investigate whether observers have early access to retinal size prior to scene size. Observer performance was assessed in a visual search task (requiring search within a 3D scene) in which processing was interrupted at a range of short presentation times. If observers have access to retinal size, then we might expect to find a presentation time before which observers behave as if using retinal size and after which they behave as if using scene size. Observers searched for a larger or smaller target object within a group of objects viewed against a textured plane slanted at 0 or 60 degrees. Stimuli were presented for 100, 200, 400 or 800 ms and immediately followed by a mask. We measured the effect of target location within a stimulus (near vs. far) on task performance and how this was influenced by the background slant. The results of Experiments 1 and 2 suggest that background slant had a significant influence on performance at all presentation times, consistent with the use of scene size and not retinal size. Experiment 3 shows that this finding cannot be explained by a 2D texture contrast effect. Experiment 4 indicates that contextual information learned across a block of trials could be an important factor in such visual search experiments. In spite of this finding, our results suggest that distance scaling may occur within 100 ms, and we find no clear evidence for explicit access to a retinal representation of size.
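
The size-scaling step that the study probes can be made concrete with a worked example: the same retinal (angular) size corresponds to different scene sizes at different distances, via s = 2 d tan(θ/2). The numbers are illustrative:

    # Sketch: scale a retinal size (visual angle) by distance to get
    # scene size. The same 2-deg target is three times larger in scene
    # units when it is three times farther away.
    import math

    def scene_size(retinal_deg, distance_m):
        return 2 * distance_m * math.tan(math.radians(retinal_deg) / 2)

    theta = 2.0                        # retinal size in degrees
    print(scene_size(theta, 1.0))      # near object: ~0.035 m
    print(scene_size(theta, 3.0))      # far object:  ~0.105 m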

14.
Scene context guides eye movements during visual search
How does scene context guide search behavior to likely target locations? We had observers search for scene-constrained and scene-unconstrained targets, and found that scene-constrained targets were detected faster and with fewer eye movements. Observers also directed more initial saccades to target-consistent scene regions and devoted more time to searching these regions. However, final checking fixations on target-inconsistent regions were common in target-absent trials, suggesting that scene context does not strictly confine search to likely target locations. We interpret these data as evidence for a rapid top-down biasing of search behavior by scene context to the target-consistent regions of a scene.

15.
Humans can remember many scenes for a long time after brief presentation. Do scene understanding and encoding processes require visual selective attention, or do they occur even when observers are engaged in other visual tasks? We showed observers scene or texture images while they performed a visual search task, an auditory detection task, or no concurrent task. Concurrent tasks interfered with memory for both image types. Visual search interfered more than auditory detection, even when the two tasks were equally difficult. The same pattern of results was obtained whether the concurrent tasks were presented during the encoding or the consolidation phase. We conclude that visual attention modulates picture memory performance. We did not find any aspect of picture memory to be independent of attentional demands.

16.
Background: People with abnormal colour vision often report difficulty seeing coloured berries and flowers in foliage, which suggests they will have a diminished capacity for visual search when target objects are marked out by colour. There is very little experimental evidence of the effect of abnormal colour vision on visual search, and none relating to search for objects in natural foliage.
Method: We showed 79 subjects with abnormal colour vision (seven protanopes, 10 deuteranopes, 16 protanomalous and 46 deuteranomalous trichromats) and 20 subjects with normal colour vision photographs of natural scenes, and asked them to locate clumps of red berries, to trace the length of a red string on grass, and to name the season depicted in a photograph taken in autumn and in the same scene photographed in summer. Colour vision was assessed using the Ishihara, the Medmont C100, the Farnsworth D15, the Richmond HRR and the Nagel anomaloscope.
Results: All the subjects with abnormal colour vision located fewer clumps of red berries than those with normal colour vision. The subjects who failed the Farnsworth D15 performed significantly worse than those who passed, but the distributions of scores in the two groups overlap. The majority of subjects with abnormal colour vision could not trace the full length of the string: only 38 per cent of anomalous trichromats who passed the Farnsworth D15 test, and three per cent of those who failed it, were able to trace the full length of the string, while 55 per cent of those classed as having a mild deficiency by the HRR test could trace the whole string. Most dichromats were unable to identify the autumn season, and those who did may have been assisted by guessing. Most (94 per cent) of those who passed the Farnsworth D15 test, and all those classified as having a 'mild' deficiency by the HRR test, could identify the season.
Conclusions: All people with abnormal colour vision, even those with a very mild deficiency, have some degree of impairment in their ability to see coloured objects in natural surroundings. A pass on the Farnsworth D15 test or a 'mild' classification on the Richmond HRR test identifies those likely to have the fewest problems with visual search and identification tasks. The results have practical implications for the selection of personnel in occupations that involve visual search in natural terrain.

17.
Top-down knowledge about the target is essential in visual search. It biases visual attention toward information that matches the target-defining criteria. Extensive past research has examined visual search when the target is defined by fixed criteria throughout the experiment, with few studies investigating how subjects set up the target template. To address this issue, we conducted five experiments using random polygons and real-world objects, allowing the target criteria to change from trial to trial. On each trial, subjects first see a cue informing them about the target, followed 200-1000 ms later by the search array. We find that when the cue matches the target exactly, search speed increases and the slope of the response time × set size function decreases. Deviations from an exact match in size or orientation slow search, although they still yield faster search than a neutral cue or a semantic cue. We conclude that the template set-up process uses detailed visual information, rather than schematic or semantic information, to find the target.

18.
Eye position was recorded under different viewing conditions to assess whether the temporal and spatial characteristics of saccadic eye movements are idiosyncratic to individuals. Our aim was to determine the degree to which oculomotor control is based on endogenous factors. A total of 15 naive subjects viewed five visual environments: (1) the absence of visual stimulation (i.e., a dark room); (2) a repetitive visual environment (simple textured patterns); (3) a complex natural scene; (4) a visual search task; and (5) reading text. Although differences in visual environment had significant effects on eye movements, idiosyncrasies were also apparent. For example, the mean fixation duration and size of an individual's saccadic eye movements when passively viewing a complex natural scene covaried significantly with those same parameters in the absence of visual stimulation and in a repetitive visual environment. In contrast, an individual's spatio-temporal characteristics of eye movements during active tasks such as reading or visual search covaried together, but did not correlate with the pattern of eye movements observed when viewing a natural scene, simple patterns, or in the dark. These idiosyncratic patterns of eye movements in normal viewing reveal an endogenous influence on oculomotor control. The independent covariation of eye movements across different visual tasks shows that saccadic eye movements during active tasks like reading or visual search differ from those engaged during the passive inspection of visual scenes.

19.
Experimental data on the accuracy and frequency of saccades are incorporated into a model of the visual world and eye movements to determine the spatial distribution of visual objects on the retina. Visual scenes are represented as sequences of discrete small objects whose positions are initially uniformly distributed and then moved toward the center of the retina by eye movements. We then use this model to investigate whether the distribution of cones in the retina maximizes the information transferred about object position. Assuming for simplicity that a single cone is activated by the object, the rate of information transfer is maximized at the receptor stage if the probability that a target lies at a position on the retina is proportional to the local cone density. Although qualitatively it is easy to understand why the cone density is higher at the fovea, by linking the cone density with eye movements through information sampling theory, we provide an explanation for its quantitative variation across the retina. The human cone distribution and the object distribution in our model visual world are shown to have the same general form and are in close agreement between 5- and 30-deg eccentricity.
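
The optimality claim can be reconstructed with a short entropy argument (a sketch consistent with the abstract's single-active-cone simplification, not necessarily the paper's exact derivation):

    % If a target at position x activates only the nearest cone, cone i
    % captures probability mass
    \[
      q_i = \int_{A_i} p(x)\,dx \approx \frac{p(x_i)}{\rho(x_i)},
    \]
    % where \rho(x) is the local cone density and A_i, with area roughly
    % 1/\rho(x_i), is cone i's catchment region. The information transmitted
    % about target position is the entropy of the cone identity,
    \[
      I = -\sum_i q_i \log q_i,
    \]
    % which is maximal when all q_i are equal, i.e. precisely when
    % p(x) \propto \rho(x).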

20.
Visual cognition
Cavanagh P. Vision Research, 2011, 51(13): 1538-1551
Visual cognition, high-level vision, mid-level vision and top-down processing all refer to decision-based scene analyses that combine prior knowledge with retinal input to generate representations. The label “visual cognition” is little used at present, but research and experiments on mid- and high-level, inference-based vision have flourished, becoming in the 21st century a significant, if often understated, part of current vision research. How does visual cognition work? What are its moving parts? This paper reviews the origins and architecture of visual cognition and briefly describes some work in the areas of routines, attention, surfaces, objects, and events (motion, causality, and agency). Most vision scientists avoid being too explicit when presenting concepts about visual cognition, having learned that explicit models invite easy criticism. What we see in the literature is ample evidence for visual cognition, but few or only cautious attempts to detail how it might work. This is the great unfinished business of vision research: at some point we will be done with characterizing how the visual system measures the world, and we will have to return to the question of how vision constructs models of objects, surfaces, scenes, and events.
