Short utterance recognition using a network with minimum training
Authors: M. Daniel Tom and M. Fernando Tenorio
Affiliation: Purdue University, USA

Abstract:
A feedforward network is used to recognize short, digitized, isolated utterances. A high multispeaker recognition rate is achieved on a small vocabulary with a single training utterance. This approach makes use of the pattern recognition property of the network architecture to classify different temporal patterns in the multidimensional feature space. The network recognizes the utterances without the need for segmentation, phoneme identification, or time alignment. We train the network with four words spoken by a single speaker. The network is then able to recognize 20 tokens spoken by 5 other speakers. We repeat this training and testing procedure, using a different speaker's utterances for training each time. The overall accuracy is 97.5%. We compare this approach to the traditional dynamic programming (DP) approach, and find that DP with slope constraints of 0 and 1 achieves accuracies of 98.5% and 85%, respectively. Finally, we validate our statistics by training and testing the network on a four-word subset of the Texas Instruments (TI) isolated word database. The accuracy with this vocabulary exceeds 96%. Doubling the size of the training set raises the accuracy to 98%. Using a suitable threshold, we are able to raise the accuracy of one network from 87% to 98.5%. Thresholding applied to all networks would then raise the overall accuracy to well over 99%.

This technique is especially promising because of its low overhead and low computational requirements, which make it suitable for low-cost, portable command-recognition applications.
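For readers unfamiliar with the dynamic programming (DP) baseline mentioned in the abstract, the following is a minimal sketch of DP time alignment used as a template-matching classifier. It is an illustration only, not the authors' implementation: the function names `dtw_distance` and `classify`, the Euclidean frame distance, and the unconstrained step pattern are all assumptions, and the slope constraints compared in the abstract are not modeled here.

```python
import math


def dtw_distance(a, b):
    """Cumulative DP alignment cost between two sequences of feature frames.

    Assumption: each frame is a list of floats, and the local cost is the
    Euclidean distance. The step pattern is unconstrained (no slope limit).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i frames of a
    # with the first j frames of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # frame of a repeated against b
                cost[i][j - 1],      # frame of b repeated against a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def classify(utterance, templates):
    """Return the label of the template with the lowest alignment cost.

    `templates` is a hypothetical dict mapping a word label to one
    reference utterance (a single training example per word).
    """
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))
```

A usage example with toy one-dimensional frames: `classify([[0.0], [0.0], [1.0], [2.0]], {"up": [[0.0], [1.0], [2.0]], "down": [[2.0], [1.0], [0.0]]})` selects `"up"`, because the time-stretched input aligns to that template at zero cost.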

Keywords: Short utterance recognition; Minimum training; Multispeaker testing; Spatiotemporal patterns; Linear prediction coding; Generalization
This article is indexed in ScienceDirect and other databases.