Similar articles
20 similar articles found
1.
R Wang 《Neural networks》2001,14(8):1061-1073
A neural network and the associated learning algorithm are presented as a generic approach for invariant recognition of visual patterns independent of their geometric attributes, such as spatial location, orientation and scale. The network is a multi-layer hierarchy with each layer composed of a set of groups of nodes. The groups of the input layer represent local areas spatially arranged in the visual field according to the geometric variations. Each node in the subsequent higher layers receives input laterally from other groups of the same layer as well as vertically from the layer below. The learning that takes place in the vertical feedforward paths between layers is based on an unsupervised hybrid algorithm combining both competitive learning and Hebbian learning. As a result of the architecture and the hybrid learning, the desired invariant recognition emerges at the output layer of the network. The network can serve as a simple and biologically plausible computational model to account for invariant object recognition in the biological visual system. Also, as the algorithm is generic and robust, it can be applied to solve various practical recognition problems.

2.
CONFIGR (CONtour FIgure GRound) is a computational model based on principles of biological vision that completes sparse and noisy image figures. Within an integrated vision/recognition system, CONFIGR posits an initial recognition stage which identifies figure pixels from spatially local input information. The resulting, and typically incomplete, figure is fed back to the “early vision” stage for long-range completion via filling-in. The reconstructed image is then re-presented to the recognition system for global functions such as object recognition. In the CONFIGR algorithm, the smallest independent image unit is the visible pixel, whose size defines a computational spatial scale. Once the pixel size is fixed, the entire algorithm is fully determined, with no additional parameter choices. Multi-scale simulations illustrate the vision/recognition system. Open-source CONFIGR code is available online, but all examples can be derived analytically, and the design principles applied at each step are transparent. The model balances filling-in as figure against complementary filling-in as ground, which blocks spurious figure completions. Lobe computations occur on a subpixel spatial scale. Originally designed to fill in missing contours in an incomplete image such as a dashed line, the same CONFIGR system connects and segments sparse dots, and unifies occluded objects from pieces locally identified as figure in the initial recognition stage. The model self-scales its completion distances, filling-in across gaps of any length, where unimpeded, while limiting connections among dense image-figure pixel groups that already have intrinsic form. Long-range image completion promises to play an important role in adaptive processors that reconstruct images from highly compressed video and still camera images.

3.
A dissociation between the ability to recognize misoriented objects and to determine their orientation has been reported in a small number of patients with vascular lesions. In this article, we describe a 57-year-old man with probable Alzheimer's disease who shows the same dissociation. Neuroimaging findings indicated marked hypometabolism in the posterior cortical regions, particularly the postero-superior parietal lobes. Clinically, the patient had good object recognition accompanied by severely impaired spatial abilities. The experimental investigations comprised a variety of tasks in which he identified misoriented objects, evaluated the orientation of single objects, or discriminated the orientation of simultaneously presented items. Results revealed that his object recognition was independent of orientation and was largely mediated by salient features. With respect to orientation judgements, the patient displayed a profound inability to judge the orientation of nonupright objects, but remarkably intact (though largely implicit) knowledge of the upright orientation. Strikingly, his orientation judgements were also more accurate for upside-down objects than for other orientations (i.e., 90 degrees). We interpret these results as evidence that judgements about object orientation are facilitated when the orientation of the principal axis of the object matches that of an internal representation. We propose that the inability to determine other orientations may be due to the failure of an "axis-finding" mechanism implemented in the posterior parietal lobes that translates between object-centered and eye-centered coordinates appropriate for guiding visual scanning.

4.
Many thousands of cortical neurons are activated by any single sensory stimulus, but the organization of these populations is poorly understood. For example, are neurons in mouse visual cortex, whose preferred orientations are arranged randomly, organized with respect to other response properties? Using high-speed in vivo two-photon calcium imaging, we characterized the receptive fields of up to 100 excitatory and inhibitory neurons in a 200 μm imaged plane. Inhibitory neurons had nonlinearly summating, complex-like receptive fields and were weakly tuned for orientation. Excitatory neurons had linear, simple receptive fields that can be studied with noise stimuli and system identification methods. We developed a wavelet stimulus that evoked rich population responses and yielded the detailed spatial receptive fields of most excitatory neurons in a plane. Receptive fields and visual responses were locally highly diverse, with nearby neurons having largely dissimilar receptive fields and response time courses. Receptive-field diversity was consistent with a nearly random sampling of orientation, spatial phase, and retinotopic position. Retinotopic positions varied locally on average by approximately half the receptive-field size. Nonetheless, the retinotopic progression across the cortex could be demonstrated at the scale of 100 μm, with a magnification of ≈ 10 μm/°. Receptive-field and response similarity were in register, decreasing by 50% over a distance of 200 μm. Together, the results indicate considerable randomness in local populations of mouse visual cortical neurons, with retinotopy as the principal source of organization at the scale of hundreds of micrometers.

5.
Object recognition is one of the most important functions of the human visual system, yet one of the least understood, despite the fact that vision is certainly the most studied function of the brain. We understand relatively well how several processes in the cortical visual areas that support recognition capabilities take place, such as orientation discrimination and color constancy. This paper proposes a model of the development of object recognition capability, based on two main theoretical principles. The first is that recognition does not imply any sort of geometrical reconstruction; it is instead fully driven by the two-dimensional view captured by the retina. The second is that the processing functions involved in recognition are not genetically determined or hardwired in neural circuits, but are the result of interactions between epigenetic influences and basic neural plasticity mechanisms. The model is organized in modules roughly related to the main visual biological areas, and is implemented mainly using the LISSOM architecture, a recent neural self-organizing map model that simulates the effects of intercortical lateral connections. This paper shows how recognition capabilities, similar to those found in brain ventral visual areas, can develop spontaneously by exposure to natural images in an artificial cortical model.

6.
Given the high relevance of visual input to human behavior, it is often important to precisely monitor the spatial orientation of the visual axis. One popular and accurate technique for measuring gaze orientation is based on the dual search coil. This technique does not allow for very large displacements of the subject, however, and is not robust with respect to translations of the head. More recently, less invasive procedures have been developed that record eye movements with camera-based systems attached to a helmet worn by the subject. Computational algorithms have also been developed that can calibrate eye orientation when the head's position is fixed. Given that camera-based systems measure the eye's position in its orbit, however, the reconstruction of gaze orientation is not as straightforward when the head is allowed to move. In this paper, we propose a new algorithm and calibration method to compute gaze orientation under unrestrained head conditions. Our method requires only the accurate measurement of orbital eye position (for instance, with a camera-based system), and the position of three points on the head. The calculations are expressed in terms of linear algebra, so they can easily be interpreted and related to the geometry of the human body. Our calibration method has been tested experimentally and validated against independent data, proving that it is robust even under large translations, rotations, and torsions of the head.

7.
The natural preference for novel objects which is displayed by rats has been used as a behavioural index to test object recognition. In this series of experiments the standard spontaneous recognition task was extended to look at other types of recognition memory: memory for place (recognition that an object is in a location where previously there had been no object), memory for object in place (recognition that a specific object has changed position with another object) and memory for context (recognition that a familiar object is in a context different to that in which it was previously encountered). We also included a standard test of object recognition in which successful discrimination relied primarily on visual cues. In addition, we looked at how the differential exploration of objects varied within the 3 min of the test phase. The results showed that rats were sensitive to the changes made in all of the test conditions and that the level of discrimination varied within the 3 min test phase. In the standard condition and the context condition, the first 2 min were found to be the most sensitive period. In the two conditions involving a position change, discrimination was only evident in the first minute.

8.
《Neural networks》1999,12(7-8):1021-1036
A fundamental capacity of the perceptual systems and the brain in general is to deal with the novel and the unexpected. In vision, we can effortlessly recognize a familiar object under novel viewing conditions, or recognize a new object as a member of a familiar class, such as a house, a face, or a car. This ability to generalize and deal efficiently with novel stimuli has long been considered a challenging example of brain-like computation that proved extremely difficult to replicate in artificial systems. In this paper we present an approach to generalization and invariant recognition. We focus our discussion on the problem of invariance to position in the visual field, but also sketch how similar principles could apply to other domains. The approach is based on the use of a large repertoire of partial generalizations that are built upon past experience. In the case of shift invariance, visual patterns are described as the conjunction of multiple overlapping image fragments. The invariance to the more primitive fragments is built into the system by past experience. Shift invariance of complex shapes is obtained from the invariance of their constituent fragments. We study by simulations aspects of this shift invariance method and then consider its extensions to invariant perception and classification by brain-like structures.

9.
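Optical imaging of intrinsic brain signals: in vivo visualization of orientation columns in cat visual cortex   Total citations: 2, self-citations: 0, citations by others: 2
Optical imaging of intrinsic brain signals is, to date, the in vivo brain imaging technique with the highest spatial resolution, and it provides a powerful tool for studying the functional architecture of large areas of cortex. This paper describes how the technique is used to visualize orientation columns in the visual cortex of living cats; the method is also largely applicable to studying the functional architecture of other cortical areas.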

10.
Hopfield/constraint satisfaction type networks can be used to learn (autoassociate) patterns. Random inputs to the network will sometimes converge on states which are learned patterns, and sometimes converge on states which are unlearned/spurious. It would be useful for many reasons to be able to tell whether a given state is learned or spurious. In this paper we present a robust and general method, based on 'energy profiles', which allows us to make this distinction. We briefly describe related research, and note links with the study of recall, recognition and familiarity in the psychological literature.
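
For readers unfamiliar with the setting, the following is a minimal sketch of a textbook Hopfield network: Hebbian storage, asynchronous recall, and the standard energy function. It is not the 'energy profile' method of the abstract above, which is not specified here; the function names, the six-unit patterns, and the noisy probe are all illustrative assumptions.

```python
def train_hebbian(patterns):
    """Build a symmetric weight matrix from +1/-1 patterns (Hebbian outer-product rule)."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j] / n
    return w


def energy(w, state):
    """Standard Hopfield energy E = -1/2 * sum_ij w_ij * s_i * s_j (no bias terms)."""
    n = len(state)
    return -0.5 * sum(w[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n))


def recall(w, state, max_sweeps=20):
    """Update units one at a time until no unit changes or max_sweeps is reached."""
    s = list(state)
    n = len(s)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = sum(w[i][j] * s[j] for j in range(n))
            new_value = 1 if field >= 0 else -1
            if new_value != s[i]:
                s[i] = new_value
                changed = True
        if not changed:
            break
    return s


# Hypothetical example data: two stored patterns and a probe with one flipped unit.
stored = [[1, -1, 1, -1, 1, -1], [1, 1, 1, -1, -1, -1]]
w = train_hebbian(stored)
noisy = [1, -1, 1, -1, 1, 1]
settled = recall(w, noisy)
print(settled, energy(w, noisy), energy(w, settled))
```

In this toy case the probe settles on the first stored pattern and the energy decreases along the way. A spurious state can also be a local energy minimum, which is presumably why a single energy value is not enough to separate learned from spurious states.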

11.
12.
This study investigates fractional Fourier transform pre-processing of input signals to neural networks. The fractional Fourier transform is a generalization of the ordinary Fourier transform with an order parameter a. Judicious choice of this parameter can lead to overall improvement of the neural network performance. As an illustrative example, we consider recognition and position estimation of different types of objects based on their sonar returns. Raw amplitude and time-of-flight patterns acquired from a real sonar system are processed, demonstrating reduced error in both recognition and position estimation of objects.

13.

Spatial orientation and memory deficits are an often overlooked and potentially powerful early marker for pathological cognitive decline. Pen-and-paper tests for spatial abilities often do not coincide with actual navigational performance due to differences in spatial perspective and scale. Mobile devices are becoming increasingly useful in a clinical setting, for patient monitoring, clinical decision-making, and information management. The same devices have positional information that may be useful for a scale-appropriate point-of-care test for spatial ability. We created a test for spatial orientation and memory based on pointing within a single room using the sensors in a mobile phone. The test consisted of a baseline pointing condition to which all other conditions were compared, a spatial memory condition with eyes closed, and two body rotation conditions (real or mental) where spatial updating was assessed. We examined the effectiveness of the sensors from a mobile phone for measuring pointing errors in these conditions in a sample of healthy young individuals. We found that the sensors reliably produced appropriate azimuth and elevation pointing angles for all of the 15 targets presented across multiple participants and days. Within-subject variability was below 6° elevation and 10° azimuth for the control condition. The pointing error and variability increased with task difficulty and correlated with self-report tests of spatial ability. The lessons learned from the first tests are discussed as well as the outlook of this application as a scientific and clinical bedside device. Finally, the next version of the application is introduced as an open source application for further development.


14.
Previous research has demonstrated sex differences in face processing at both neural and behavioural levels. The present study examined the role of handedness and sexual orientation as mediators of this effect. We compared the performance of LH (left-handed) and RH (right-handed) heterosexual and homosexual male and female participants on a face recognition memory task. Our main findings were that homosexual males have better face recognition memory than both heterosexual males and homosexual women. We also demonstrate better face processing in women than in men. Finally, LH heterosexual participants had better face recognition than LH homosexual participants and also tended to be better than RH heterosexual participants. These findings are consistent with differences in the organisation and laterality of face-processing mechanisms as a function of sex, handedness, and sexual orientation.

15.
To investigate the processing of linear perspective and binocular information for action and for the perceptual judgment of depth, we presented viewers with an actual Ames trapezoidal window. The display, when presented perpendicular to the line of sight, provided perspective information for a rectangular window slanted in depth, while binocular information specified a planar surface in the fronto-parallel plane. We compared pointing towards the display edges with perceptual judgment of their positions in depth as the display orientation was varied under monocular and binocular view. On monocular trials, pointing and depth judgment were based on the perspective information and failed to respond accurately to changes in display orientation because pictorial information did not vary sufficiently to specify the small differences in orientation. For binocular trials, pointing was based on binocular information and precisely matched the changes in display orientation, whereas depth judgment fell short of such adjustment and was based upon both binocular and perspective-specified slant information. The finding that on binocular trials pointing was considerably less responsive to the illusion than perceptual judgment supports an account of two separate processing streams in the human visual system, a ventral pathway involved in object recognition and a dorsal pathway that produces visual information for the control of actions. Previously, similar differences between perception and action were attributed to an alternative explanation, namely that viewers selectively attend to different parts of a display in the two tasks. The finding that under monocular view participants responded to perspective information in both the action and the perception task rules out the attention-based argument.

16.
OBJECTIVES: Item response theory was used to test the scalability of the Behavioral Dyscontrol Scale (BDS). The BDS assesses the control of voluntary movement, working memory and self-monitoring. Construct validity of the BDS was examined with confirmatory factor analysis. METHODS: The BDS was administered to 693 consecutive, community-dwelling visitors of a psychogeriatric day unit (424 women and 269 men between the ages of 50 and 94). Unidimensionality of the BDS was determined using Mokken's scalogram analysis. The BDS total score was correlated with other measures of executive function (Expanded Mental Control Test, category fluency, and alternating graphical sequences) and with episodic memory tests of orientation and delayed picture recognition in order to test a model of distinct latent constructs of executive functioning and episodic memory. RESULTS: Loevinger's scalability coefficient H was 0.58 for the complete item set of the BDS. Subjects can be ordered on the latent dimension of executive ability. The first eight items of the BDS (deleting the insight rating) satisfy the assumption of non-intersecting item characteristic curves (double monotonicity) which means that they comprise a Guttman-ordered scale (H = 0.60). The BDS and three independent measures of executive control strongly correlated with a latent construct of executive functioning (convergent validity). However, discriminant relations with a nonexecutive construct (recognition memory and orientation) could not be demonstrated. CONCLUSIONS: The BDS satisfies criteria for scalability according to item response theory. Its construct validity as an executive-specific measure is as yet unclear.

17.
We hereby describe a simple and inexpensive approach to evaluate the position and locomotion of rodents in an arena. The system is based on webcam registering of animal behaviour with subsequent analysis on customized software. Based on black/white differentiation, it provides rapid evaluation of animal position over a period of time, and can be used in a myriad of behavioural tasks in which locomotion, velocity or place preference are variables of interest. A brief review of the results obtained so far with this system and a discussion of other possible applications in behavioural neuroscience are also included. Such a system can be easily implemented in most laboratories and can significantly reduce the time and costs involved in behavioural analysis, especially in developing countries.
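
As a rough illustration of what position tracking by black/white differentiation can look like, the sketch below thresholds a grayscale frame and takes the centroid of the dark pixels as the animal's position. It is not the customized software mentioned in the abstract. It assumes a dark animal on a light arena, frames given as nested lists of 0-255 intensities, and a hypothetical threshold of 128.

```python
import math


def locate_animal(frame, threshold=128):
    """Return the (row, col) centroid of pixels darker than threshold, or None if there are none."""
    row_sum = col_sum = count = 0
    for r, line in enumerate(frame):
        for c, value in enumerate(line):
            if value < threshold:  # dark pixel, assumed to belong to the animal
                row_sum += r
                col_sum += c
                count += 1
    if count == 0:
        return None
    return (row_sum / count, col_sum / count)


def path_length(positions):
    """Sum of straight-line distances (in pixels) between consecutive positions."""
    total = 0.0
    for (r0, c0), (r1, c1) in zip(positions, positions[1:]):
        total += math.hypot(r1 - r0, c1 - c0)
    return total


# Hypothetical 3x4 frames: 255 is light arena floor, 0 is the dark animal.
frame_a = [[255, 255, 255, 255],
           [255, 0, 0, 255],
           [255, 255, 255, 255]]
frame_b = [[255, 255, 255, 255],
           [255, 255, 255, 255],
           [255, 255, 0, 0]]
track = [locate_animal(frame_a), locate_animal(frame_b)]
print(track, path_length(track))
```

Dividing the path length by the elapsed time between frames would give a velocity estimate, and counting the frames in which the centroid falls inside a region of the arena would give a simple place-preference measure.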

18.
The function of the CA2 region of the hippocampus is poorly understood. Although the CA1 and CA3 regions have been extensively studied, for years the CA2 region has primarily been viewed as a linking area between the two. However, the CA2 region is known to have distinct neurochemical and structural features that are different from the other parts of the hippocampus and in recent years it has been suggested that the CA2 region may play a role in the formation and/or recall of olfactory‐based memories needed for normal social behavior. Although this hypothesis has been supported by hippocampal lesion studies that have included the CA2 region, no studies have attempted to specifically lesion the CA2 region of the hippocampus in mice to determine the effects on social recognition memory and olfaction. To fill this knowledge gap, we sought to perform excitotoxic N‐methyl‐D‐aspartate lesions of the CA2 region and determine the effects on social recognition memory. We predicted that lesions of the CA2 region would impair social recognition memory. We then went on to test olfaction in CA2‐lesioned mice, as social memory requires a functional olfactory system. Consistent with our prediction, we found that CA2‐lesioned animals had impaired social recognition. These findings are significant because they confirmed that the CA2 region of the hippocampus is a part of the neural circuitry that regulates social recognition memory, which may have implications for our understanding of the neural regulation of social behavior across species.

19.
EEG measurements indicate the presence of common-mode, coherent oscillations shared by multiple cortical areas. In previous studies the KIII model has been introduced, which interprets the experimental observations as nonlinear, spatially distributed dynamical oscillations of locally coupled neural populations. KIII can account for the fast and robust classification and pattern recognition in sensory cortices. In order to describe selection of action, planning, and spatial orientation functions, in this paper we expand KIII into the KIV model. KIV approximates the operation of the corticostriatal-hippocampal system. KIV consists of three KI, eight KII and three KIII components, including sensory and cortical systems, as well as the hippocampus, amygdala, and the septum. KIV implements various types of dynamic neural activities. The neural activity patterns determine the emergence of global spatial encoding to implement the orientation function of a simulated animal. Our results indicate the mechanisms that we believe support the generation of cognitive maps in the hippocampus based on the sensory input-based destabilization of cortical spatio-temporal patterns. In this paper, we describe the conceptual design of the KIV model. We outline the biological background and motivation of the basic principles that are applied to design the KIV computational model. We use the KIV model to explain how the hippocampal neural circuitry functions are constructed and controlled by the corticostriatal-hippocampal loops, supplemented with specific subcortical units. In the second part, we implement these principles using the example of the hippocampal formation as a KIII unit. We demonstrate the learning and navigation principles using the Evolving Multi-module Mobile Agent (EMMA) in a 2D software environment.

20.
In this paper, we present a novel approach for supervised codebook learning and optimization for bag-of-words models. This type of model is frequently used in visual recognition tasks like object class recognition or human action recognition. An entity is represented as a histogram of codewords, which are traditionally clustered with unsupervised methods like k-means or random forests and then classified in a supervised way. We propose a new supervised method for joint codebook creation and class learning, which learns the cluster centers of the codebook in a goal-directed way using the class labels of the training set. As a result, the codebook is highly correlated to the recognition problem, leading to a more discriminative codebook. We propose two different learning algorithms, one based on error backpropagation and the other based on cluster label reassignment. We apply the proposed method to human action recognition from video sequences and evaluate it on the KTH data set, reporting very promising results. The proposed technique allows us to improve the discriminative power of an unsupervised learned codebook or to keep the discriminative power while decreasing the size of the learned codebook, thus decreasing the computational complexity due to the nearest neighbor search.
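
The histogram-of-codewords representation mentioned in the abstract above can be sketched as follows. This shows only the generic bag-of-words encoding step (nearest-neighbor assignment to a fixed codebook), not the supervised codebook learning the paper proposes; the function names, the two-dimensional descriptors, and the two-entry codebook are hypothetical.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codebook center closest to descriptor (squared Euclidean distance)."""
    best_index, best_dist = 0, float("inf")
    for index, center in enumerate(codebook):
        dist = sum((d - c) ** 2 for d, c in zip(descriptor, center))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bow_histogram(descriptors, codebook):
    """Normalized histogram counting how often each codeword is the nearest neighbor."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [count / total for count in counts]


# Hypothetical codebook with two centers and four local descriptors from one entity.
codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [9.0, 9.0], [0.0, 1.0], [11.0, 10.0]]
histogram = bow_histogram(descriptors, codebook)
print(histogram)
```

The nearest-neighbor search costs one distance computation per codeword per descriptor, which is why a smaller codebook with the same discriminative power lowers the computational cost.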
