Similar Documents
Found 20 similar documents (search time: 46 ms)
1.
Nowadays, image recognition has become a highly active research topic in the cognitive computation community, due to its many potential applications. Generally, the image recognition task involves two subtasks: image representation and image classification. Most feature extraction approaches for image representation developed so far regard independent component analysis (ICA) as one of the essential means. However, ICA has been hampered by its extremely expensive computational cost in real-time implementation. To address this problem, a fast cognitive computational scheme for image recognition is presented in this paper, which combines ICA with the extreme learning machine (ELM) algorithm. It solves the image recognition problem at a much faster speed by using ELM not only in image classification but also in feature extraction for image representation. As an example, the proposed approach is applied to face image recognition and analyzed in detail. First, a common-feature hypothesis is introduced to extract common visual features from universal images with the traditional ICA model in the offline recognition process; ELM is then used to simulate ICA for facial feature extraction in the online recognition process. Finally, the independent feature representation of the face images extracted by ELM, rather than by ICA, is fed into the ELM classifier, which is composed of numerous single-hidden-layer feed-forward networks. Experimental results on the Yale face database and the MNIST digit database show that the proposed approach performs comparably to state-of-the-art techniques at a much faster speed.
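The ELM classifier described above trains a single-hidden-layer network by fixing random hidden weights and solving the output weights analytically. A minimal sketch of that idea, not the paper's implementation: the XOR toy data, hidden size, and tanh activation are assumptions for illustration.

```python
import numpy as np

def elm_train(X, T, n_hidden=20, seed=0):
    """Basic ELM: random input weights and biases, output weights
    solved in closed form by least squares (pseudo-inverse)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights
    b = rng.standard_normal(n_hidden)                # random hidden biases
    H = np.tanh(X @ W + b)                           # hidden-layer activations
    beta = np.linalg.pinv(H) @ T                     # analytic output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy example: learn XOR with one-hot targets (made-up data, not the paper's).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.eye(2)[[0, 1, 1, 0]]
W, b, beta = elm_train(X, T, n_hidden=20)
pred = elm_predict(X, W, b, beta).argmax(axis=1)
```

Because the output weights come from a single pseudo-inverse rather than iterative back-propagation, training is essentially one linear solve, which is the speed advantage the abstract refers to.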

2.
Pedestrian identification is an important topic in intelligent surveillance and public safety, where near-frontal face images of pedestrians can hardly be obtained due to high camera installation angles, long distances, and extreme lighting variations. This paper presents a new action-based pedestrian identification algorithm, which adopts hierarchical matching pursuit (HMP) for feature extraction and order-preserving sparse coding (OPSC) for classification. Two-layer HMP features are extracted from foreground frame image patches by sparse coding, max pooling, and normalization, preserving both local and global information. OPSC is used as the classifier to take full advantage of spatial structure information, which differs from the traditional temporal OPSC algorithm. Spatiotemporal order-preserving sparse-coding-based classification is also investigated. The effectiveness of the proposed method is verified on public datasets, and the experimental results show the superiority of our method.

3.
In the feature learning field, many methods are inspired by advances in neuroscience. Among them, neural networks and sparse coding have been broadly studied. Predictive sparse decomposition (PSD) is a practical variant of these two methods: it trains a neural network to estimate the sparse codes. After training, the neural network is fine-tuned to achieve higher performance on object recognition tasks. It is widely believed that introducing discriminative information makes features more useful for classification. Hence, in this work, we propose applying the task-driven dictionary learning framework to PSD and demonstrate that this new model can be optimized by the stochastic gradient descent (SGD) algorithm. Before our work, the semi-supervised auto-encoder framework had already been proposed to guide neural networks toward discriminative representations, but it does not improve the classification performance of the neural network. In the experiments, we compare the proposed method with the semi-supervised auto-encoder method, using the performance of PSD as the baseline for both. On the MNIST and USPS datasets, our method generates more discriminative and predictable sparse codes than the other methods, and the recognition accuracy of the neural network is improved.
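The core PSD idea (a fast feed-forward encoder trained to mimic slow sparse inference) can be illustrated schematically. This is a hedged sketch, not the paper's model: ISTA computes the l1-penalised codes, and a plain linear least-squares encoder stands in for PSD's trainable neural-network encoder; the dictionary and data are random placeholders.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_codes(X, D, lam=0.1, n_iter=100):
    """Sparse codes Z minimising 0.5*||X - Z D^T||_F^2 + lam*||Z||_1 via ISTA."""
    L = np.linalg.norm(D, 2) ** 2            # Lipschitz constant of the gradient
    Z = np.zeros((X.shape[0], D.shape[1]))
    for _ in range(n_iter):
        grad = (Z @ D.T - X) @ D
        Z = soft_threshold(Z - grad / L, lam / L)
    return Z

# Slow inference step: ISTA codes over a random dictionary (placeholder data).
rng = np.random.default_rng(0)
D = rng.standard_normal((8, 16))
D /= np.linalg.norm(D, axis=0)               # unit-norm dictionary atoms
X = rng.standard_normal((50, 8))
Z = ista_codes(X, D)

# PSD step: fit a fast encoder to predict the codes (linear stand-in here).
W, *_ = np.linalg.lstsq(X, Z, rcond=None)
Z_hat = X @ W
```

At test time only the cheap encoder pass `X @ W` is needed, which is why PSD avoids running iterative sparse inference per sample.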

4.
Extreme learning machine (ELM) has been extensively studied, due to its fast training and good generalization. Unfortunately, existing ELM-based feature representation methods are uncompetitive with state-of-the-art deep neural networks (DNNs) on complex visual recognition tasks. This weakness is mainly caused by two critical defects: (1) random feature mappings (RFM) drawn from an ad hoc probability distribution are unable to project various input data into discriminative feature spaces; (2) in ELM-based hierarchical architectures, features from the previous layer are scattered by the RFM in the current layer, which makes abstracting higher-level features ineffective. To address these issues, we take advantage of label information for optimizing the random mapping in the ELM, utilizing an efficient label alignment metric to learn a conditional random feature mapping (CRFM) in a supervised manner. Moreover, we propose a new CRFM-based single-layer ELM (CELM) and extend it to a supervised multi-layer learning architecture (ML-CELM). Extensive experiments on widely used datasets demonstrate that our approach is more effective than the original ELM-based and other existing DNN feature representation methods, with rapid training/testing speed. The proposed CELM and ML-CELM achieve discriminative and robust feature representations and show superiority in various simulations in terms of generalization and speed.

5.
Extreme learning machine (ELM) was proposed for training single-hidden-layer feed-forward networks (SLFNs) with fast learning speed and has been confirmed to be effective and efficient for pattern classification and regression in different fields. ELM originally focuses on supervised, semi-supervised, and unsupervised learning problems, but only within a single domain. To the best of our knowledge, ELM with cross-domain learning capability in subspace learning has not been well exploited. Inspired by cognitive-based extreme learning machine techniques (Cognit Comput 6:376–390; Cognit Comput 7:263–278), this paper proposes a unified subspace transfer framework called cross-domain extreme learning machine (CdELM), which aims at learning a common (shared) subspace across domains. The proposed CdELM has three merits: (1) a cross-domain subspace shared by source and target domains is achieved through domain adaptation; (2) ELM is well exploited in the cross-domain shared subspace learning framework, bringing a new perspective to ELM theory in heterogeneous data analysis; (3) the proposed method is a subspace learning framework and can be combined with different classifiers in the recognition phase, such as ELM, SVM, nearest neighbor, etc. Experiments on our electronic nose olfaction datasets demonstrate that the proposed CdELM method significantly outperforms the other compared methods.

6.
This paper proposes an efficient finger vein recognition system in which a variant of the original ensemble extreme learning machine (ELM), called feature component-based ELMs (FC-ELMs) and designed to exploit the characteristics of the features, is introduced to improve recognition accuracy and stability and to substantially reduce the number of hidden nodes. For feature extraction, an explicit guided filter is proposed to extract eight block-based directional features from the high-quality finger vein contours obtained from noisy, non-uniform, low-contrast finger vein images, without any segmentation process. The FC-ELMs consist of eight single ELMs, each trained on a block feature with a pre-defined direction to enhance robustness against variation of the finger vein images, and an output layer that combines the outputs of the eight ELMs. For structured training of the vein patterns, the FC-ELMs are designed to first learn small differences between patterns with the same angle and then aggregate the differences at the output layer. Each ELM can easily learn lower-complexity patterns with a smaller network, and matching accuracy is also improved because each ELM requires less complex decision boundaries. We also designed the ensemble FC-ELMs to give the matching system stability. For the dataset considered, the experimental results show that the proposed system generates clearer vein contours and achieves good matching performance, with an accuracy of 99.53% and a speed of 0.87 ms per image.

7.
According to research results reported over the past decades, it is well acknowledged that face recognition is not a trivial task. With the development of electronic devices, we are gradually revealing the secrets of object recognition in the primate visual cortex. It is therefore time to reconsider face recognition using biologically inspired features. In this paper, we represent face images using C1 units, which correspond to complex cells in the visual cortex, and pool over S1 units with a maximum operation to keep only the maximum response of each local area of S1 units. The new representation is termed C1Face. Because C1Face is naturally a third-order tensor (a three-dimensional array), we propose three-way discriminative locality alignment (TWDLA), an extension of discriminative locality alignment, a top-level discriminative manifold-learning-based subspace learning algorithm. TWDLA has the following advantages: (1) it takes third-order tensors as input directly, so the structure information is well preserved; (2) it models the local geometry over every modality of the input tensors, so the spatial relations of input tensors within a class are preserved; (3) it maximizes the margin between a tensor and tensors from other classes over each modality, so it performs well for recognition tasks; and (4) it has no undersampling problem. Extensive experiments on the YALE and FERET datasets show that (1) the proposed C1Face representation can represent face images better than raw pixels and (2) TWDLA duly preserves both the local geometry and the discriminative information over every modality for recognition.

8.
9.
Dimension reduction is a challenging task in data processing, especially for high-dimensional data. Non-negative matrix factorization (NMF), a classical dimension reduction method, contributes to parts-based representation owing to the non-negativity constraints in the NMF algorithm. In this paper, the NMF algorithm is introduced to extract local features for dimension reduction. To address the problem that NMF requires the decomposition rank to be specified manually, we propose a rank-adaptive NMF algorithm in which the affinity propagation (AP) clustering algorithm adaptively determines the decomposition rank. The rank-adaptive NMF algorithm is then used to extract features from the original images, and a low-dimensional representation is obtained by projecting the original images onto the feature space. Finally, we use the extreme learning machine (ELM) and k-nearest neighbor (KNN) classifiers to classify those low-dimensional feature representations. The experimental results demonstrate that the decomposition rank determined by the AP clustering algorithm reflects the characteristics of the original data, so a low-dimensional representation can be obtained without any prior knowledge. When combined with the ELM or KNN classifier and applied to handwritten character recognition, the proposed method not only reduces the dimension of the original images but also performs well in terms of classification accuracy and time consumption.
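For reference, the basic factorization V ≈ WH that the rank-adaptive method builds on can be sketched with the classical Lee–Seung multiplicative updates. This is a generic illustration, not the proposed algorithm: the AP-based rank selection is omitted and the rank is fixed by hand; the data matrix is a random placeholder.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Basic NMF via Lee–Seung multiplicative updates: V ≈ W @ H with W, H >= 0.
    The updates preserve non-negativity because they only multiply by
    non-negative ratios."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update coefficients
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update basis (local features)
    return W, H

V = np.random.default_rng(1).random((20, 30))  # placeholder non-negative data
W, H = nmf(V, rank=5)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

The rows of H (or columns of W) play the role of the local features the abstract describes, and projecting a sample onto them yields its low-dimensional representation.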

10.
Sparse representation has been widely studied as a parts-based data representation method and applied in many scientific and engineering fields, such as bioinformatics and medical imaging. It seeks to represent a data sample as a sparse linear combination of basic items in a dictionary. Gao et al. (2013) recently proposed Laplacian sparse coding, which regularizes the sparse codes with an affinity graph. However, due to noisy features and the nonlinear distribution of the data samples, an affinity graph constructed directly from the original feature space is not necessarily a reliable reflection of the intrinsic manifold of the data. To overcome this problem, we integrate feature selection and multiple kernel learning into sparse coding on the manifold. To this end, unified objectives are defined for feature selection, multiple kernel learning, sparse coding, and graph regularization. By optimizing the objective functions iteratively, we develop novel data representation algorithms with feature selection and multiple kernel learning, respectively. Experimental results on two challenging tasks, N-linked glycosylation prediction and mammogram retrieval, demonstrate that the proposed algorithms outperform traditional sparse coding methods.

11.
Our ability to discriminate and recognize thousands of faces despite their similarity as visual patterns relies on adaptive, norm-based, coding mechanisms that are continuously updated by experience. Reduced adaptive coding of face identity has been proposed as a neurocognitive endophenotype for autism, because it is found in autism and in relatives of individuals with autism. Autistic traits can also extend continuously into the general population, raising the possibility that reduced adaptive coding of face identity may be more generally associated with autistic traits. In the present study, we investigated whether adaptive coding of face identity decreases as autistic traits increase in an undergraduate population. Adaptive coding was measured using face identity aftereffects, and autistic traits were measured using the Autism-Spectrum Quotient (AQ) and its subscales. We also measured face and car recognition ability to determine whether autistic traits are selectively related to face recognition difficulties. We found that men who scored higher on levels of autistic traits related to social interaction had reduced adaptive coding of face identity. This result is consistent with the idea that atypical adaptive face-coding mechanisms are an endophenotype for autism. Autistic traits were also linked with face-selective recognition difficulties in men. However, there were some unexpected sex differences. In women, autistic traits were linked positively, rather than negatively, with adaptive coding of identity, and were unrelated to face-selective recognition difficulties. These sex differences indicate that autistic traits can have different neurocognitive correlates in men and women and raise the intriguing possibility that endophenotypes of autism can differ in males and females.

12.
For groupwise image registration, graph-theoretic methods have been adopted to discover the manifold of the images to be registered, so that accurate registration to a group center image can be achieved by aligning similar images linked by the shortest graph paths. However, the image similarity measures used to build a graph of images in the extant methods are essentially pairwise and are not effective at capturing the groupwise similarity among multiple images. To overcome this problem, we present a groupwise image similarity measure built on sparse coding that characterizes image similarity among all input images, and we build a directed graph (digraph) of images so that similar images are connected by the shortest paths of the digraph. Following the shortest paths determined by the digraph, images are registered to a group center image in an iterative manner, decomposing the large anatomical deformation field required to register an image to the group center image into a series of small deformations between similar images. During the iterative registration, the digraph of images evolves dynamically at each iteration to pursue an accurate estimation of the image manifold. Moreover, an adaptive dictionary strategy is adopted in the groupwise image similarity measure to ensure fast convergence of the iterative registration procedure. The proposed method has been validated on both simulated and real brain images, and experimental results demonstrate that it learns the manifold of input images more effectively and achieves higher registration accuracy than state-of-the-art groupwise image registration methods.

13.
14.
Recently, social networks and other forms of media communication have been attracting the interest of both the scientific and the business worlds, driving the development of opinion and sentiment analysis. Handling the huge amount of information on the Web is a crucial task and motivates the study and creation of efficient models able to tackle it. To this end, this work proposes an efficient approach to support emotion recognition and polarity detection in natural language text. In this paper, we show how the most recent advances in statistical learning theory (SLT) can support the development of an efficient extreme learning machine (ELM) and the assessment of the resulting model's performance when applied to big social data analysis. ELM, developed to overcome some issues of back-propagation networks, is a powerful learning tool. However, it must cope with a large number of available samples, and its generalization performance has to be carefully assessed. For this reason, we propose an ELM implementation that exploits Spark's distributed in-memory technology and show how to take advantage of SLT results in order to select ELM hyperparameters that provide the best generalization performance.

15.
CONFIGR (CONtour FIgure GRound) is a computational model based on principles of biological vision that completes sparse and noisy image figures. Within an integrated vision/recognition system, CONFIGR posits an initial recognition stage which identifies figure pixels from spatially local input information. The resulting, and typically incomplete, figure is fed back to the "early vision" stage for long-range completion via filling-in. The reconstructed image is then re-presented to the recognition system for global functions such as object recognition. In the CONFIGR algorithm, the smallest independent image unit is the visible pixel, whose size defines a computational spatial scale. Once the pixel size is fixed, the entire algorithm is fully determined, with no additional parameter choices. Multi-scale simulations illustrate the vision/recognition system. Open-source CONFIGR code is available online, but all examples can be derived analytically, and the design principles applied at each step are transparent. The model balances filling-in as figure against complementary filling-in as ground, which blocks spurious figure completions. Lobe computations occur on a subpixel spatial scale. Originally designed to fill-in missing contours in an incomplete image such as a dashed line, the same CONFIGR system connects and segments sparse dots, and unifies occluded objects from pieces locally identified as figure in the initial recognition stage. The model self-scales its completion distances, filling-in across gaps of any length, where unimpeded, while limiting connections among dense image-figure pixel groups that already have intrinsic form. Long-range image completion promises to play an important role in adaptive processors that reconstruct images from highly compressed video and still camera images.

16.
Predictive coding has been proposed as a model of the hierarchical perceptual inference process performed in the cortex. However, results demonstrating that predictive coding is capable of performing the complex inference required to recognise objects in natural images have not previously been presented. This article proposes a hierarchical neural network based on predictive coding for performing visual object recognition. This network is applied to the tasks of categorising hand-written digits, identifying faces, and locating cars in images of street scenes. It is shown that image recognition can be performed with tolerance to position, illumination, size, partial occlusion, and within-category variation. The current results, therefore, provide the first practical demonstration that predictive coding (at least the particular implementation of predictive coding used here; the PC/BC-DIM algorithm) is capable of performing accurate visual object recognition.
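The predictive-coding loop at the heart of such models, with prediction neurons updated iteratively against a divisively computed error signal, can be sketched roughly as follows. This is a schematic in the spirit of divisive input modulation, not the exact PC/BC-DIM equations from the article; the two stored patterns and all constants are made up for illustration.

```python
import numpy as np

def dim_infer(x, W, n_iter=30, eps=1e-6):
    """Schematic predictive-coding inference in the divisive-input-modulation
    style: error units e divide the input by the current top-down prediction,
    and prediction neurons y are rescaled until the input is explained."""
    y = np.zeros(W.shape[0])
    for _ in range(n_iter):
        e = x / (eps + W.T @ y)      # divisive prediction error
        y = (eps + y) * (W @ e)      # multiplicative update of predictions
    return y

# Two made-up stored patterns (rows of W, normalised to sum to 1).
W = np.array([[0.5, 0.5, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5]])
x = np.array([1.0, 1.0, 0.0, 0.0])  # present pattern 0
y = dim_infer(x, W)
```

At convergence the top-down reconstruction `W.T @ y` matches the input, and the prediction neuron for the stored pattern that explains the input dominates the response.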

17.
Since extreme learning machine (ELM) was proposed, hundreds of studies have been conducted on the subject, from theoretical research to practical applications. However, very few papers in the literature explain why, in ELM classification, the class with the highest output value is chosen as the predicted class for a given input. To give a clear insight into this question, this paper analyzes the rationality of ELM reasoning from the perspective of its inductive bias. The analysis shows that choosing the highest output in ELM is reasonable for both binary and multiclass classification problems. In addition, to deal with multiclass problems, ELM uses the well-known one-against-all strategy, in which unclassifiable regions may exist. This paper also gives a clear explanation, through both analysis and experiments, of how ELM resolves the unclassifiable regions.
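The unclassifiable-region issue can be shown with a toy example: under sign-based one-against-all decisions a point may be claimed by no class, while the highest-output rule analyzed here still yields a unique label. The scores below are invented for illustration.

```python
import numpy as np

# Invented one-against-all output scores for a 3-class problem, at a point
# falling in an "unclassifiable" region: no binary output is positive,
# so sign-based decisions assign the point to no class.
scores = np.array([-0.2, -0.7, -1.3])
claimed = int((scores > 0).sum())   # number of classes that claim the point
pred = int(scores.argmax())         # highest-output rule still gives one label
```

Taking the argmax rather than thresholding each output at zero is exactly how the highest-output rule dissolves the gaps between one-against-all decision boundaries.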

18.
A computational principle for hippocampal learning and neurogenesis
Becker S. Hippocampus, 2005, 15(6):722–738
In the three decades since Marr put forward his computational theory of hippocampal coding, many computational models have been built on the same key principles proposed by Marr: sparse representations, rapid Hebbian storage, associative recall, and consolidation. Most of these models have focused on either the CA3 or CA1 fields, using "off-the-shelf" learning algorithms such as competitive learning or Hebbian pattern association. Here, we propose a novel coding principle that is common to all hippocampal regions, and from this one principle we derive learning rules for each of the major pathways within the hippocampus. The learning rules turn out to have much in common with several models of CA3 and CA1 in the literature, and provide a unifying framework in which to view these models. Simulations of the complete circuit confirm that both recognition memory and recall are superior relative to a hippocampally lesioned model, consistent with human data. Further, we propose a functional role for neurogenesis in the dentate gyrus (DG), namely, to create distinct memory traces for highly similar items. Our simulation results support our prediction that memory capacity increases with the number of dentate granule cells, while neuronal turnover with a fixed dentate layer size improves recall by minimizing interference between highly similar items.

19.
Sparse-representation-based classification (SRC), which classifies data based on the sparse reconstruction error, has emerged as a new technique in pattern recognition. However, the computational cost of sparse coding is heavy in real applications. In this paper, various dimension reduction methods are studied in the context of SRC to improve classification accuracy as well as reduce computational cost. A feature extraction method, principal component analysis, and two feature selection methods, the Laplacian score and the Pearson correlation coefficient, are applied in the data preparation step to preserve the structure of the data in the lower-dimensional space. The classification performance of SRC with structure-preserving dimension reduction (SRC–SPDR) is compared to classical classifiers such as k-nearest neighbors and support vector machines. Experimental tests on the UCI and face datasets demonstrate that SRC–SPDR is effective at relatively low computational cost.
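The SRC decision rule itself — code the test sample over a dictionary whose columns are training samples, then pick the class whose atoms reconstruct it best — can be sketched as follows. A toy illustration with an invented dictionary and ISTA for the l1 code; it is not the paper's SRC–SPDR pipeline, which adds the dimension-reduction step.

```python
import numpy as np

def src_classify(y, D, labels, lam=0.01, n_iter=200):
    """SRC: sparse-code y over training dictionary D (columns = samples),
    then assign the class whose atoms give the smallest reconstruction error."""
    L = np.linalg.norm(D, 2) ** 2            # Lipschitz constant for ISTA
    z = np.zeros(D.shape[1])
    labels = np.asarray(labels)
    for _ in range(n_iter):                  # ISTA for the l1-penalised code
        z = z - (D.T @ (D @ z - y)) / L
        z = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    errs = {c: np.linalg.norm(y - D @ np.where(labels == c, z, 0.0))
            for c in np.unique(labels)}      # class-wise reconstruction error
    return min(errs, key=errs.get)

# Invented dictionary: class-0 atoms near e1, class-1 atoms near e2.
rng = np.random.default_rng(0)
D = np.array([[1.0, 0.0, 0.0],
              [0.9, 0.1, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.9, 0.1]]).T + 0.01 * rng.standard_normal((3, 4))
D /= np.linalg.norm(D, axis=0)               # unit-norm atoms
pred = src_classify(np.array([1.0, 0.05, 0.0]), D, [0, 0, 1, 1])
```

The per-sample ISTA loop is where SRC's heavy computational cost comes from, which is the motivation for performing it in a lower-dimensional, structure-preserving space.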

20.
An emerging paradigm analyses in what respects the properties of the nervous system reflect properties of natural scenes. It is hypothesized that neurons form sparse representations of natural stimuli: each neuron should respond strongly to some stimuli while being inactive upon presentation of most others. For a given network, sparse representations need the fewest spikes, so the nervous system can consume the least energy. To obtain optimally sparse responses, the receptive fields of simulated neurons are optimized. Algorithmically, this is identical to searching for basis functions that allow coding the stimuli with sparse coefficients, and to maximizing the log likelihood of a generative model with a prior matched to natural images. The resulting simulated neurons share most properties of simple cells found in primary visual cortex; forming optimally sparse representations is thus a very compact description of simple-cell properties. Many ways of defining sparse responses exist, and it is widely believed that the particular choice of the sparse prior of the generative model does not significantly influence the estimated basis functions. Here we examine this assumption more closely. We include the constraint of unit variance of neuronal activity, used in most studies, in the objective functions, and then analyze learning on a database of natural (cat-cam) visual stimuli. We show that the effective objective functions are largely dominated by the constraint and are therefore very similar. The resulting receptive fields show some similarities but also qualitative differences. Even for coefficient values at which the objective functions are dissimilar, the distributions of coefficients are similar and do not match the priors of the assumed generative model. In conclusion, the specific choice of the sparse prior is relevant, as is the choice of additional constraints such as normalization of variance.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号