Similar Articles
1.
When listening to a speaker, we need to adapt to her individual speaking characteristics, such as error proneness or accent. The present study investigated two aspects of adaptation to speaker identity while processing spoken sentences in multi-speaker situations: the effect of speaker sequence across sentences and the effect of learning speaker-specific error probability. Spoken sentences were presented, cued, and accompanied by one of three portraits labeled as the speakers' faces. In Block 1, speaker-specific probabilities of syntax errors were 10%, 50%, or 90%; in Block 2 they were uniformly 50%. In both blocks, speech errors elicited P600 effects in the scalp-recorded ERP. We found a speaker sequence effect only in Block 1: the P600 to target words was larger after speaker switches than after speaker repetitions, independent of sentence correctness. In Block 1, listeners also judged sentence correctness more accurately for speakers with lower error proportions. No speaker-specific differences in target-word P600 or accuracy were found in Block 2. When speakers differ in error proneness, listeners seem to flexibly adapt their speech processing for the upcoming sentence, reorienting attention and reallocating resources when the speaker is about to change, and proactively maintaining neural resources when the speaker remains the same.

2.
Background: Physician use of computerized speech recognition (SR) technology has risen in recent years due to its ease of use and efficiency at the point of care. However, error rates between 10 and 23% have been observed, raising concern about the number of errors entered into the permanent medical record, their impact on quality of care, and the medical liability that may arise. Our aim was to determine the incidence and types of SR errors introduced by this technology in the emergency department (ED). Setting: Level 1 emergency department with 42,000 visits/year in a tertiary academic teaching hospital. Methods: A random sample of 100 notes dictated by attending emergency physicians (EPs) using SR software was collected from the ED electronic health record between January and June 2012. Two board-certified EPs annotated the notes and conducted error analysis independently. An existing classification schema was adopted to classify errors into eight error types. Critical errors deemed to potentially impact patient care were identified. Results: There were 128 errors in total, or 1.3 errors per note, and 14.8% (n = 19) of errors were judged to be critical. 71% of notes contained errors, and 15% contained one or more critical errors. Annunciation errors were the most frequent at 53.9% (n = 69), followed by deletions at 18.0% (n = 23) and added words at 11.7% (n = 15). Nonsense errors, homonyms, and spelling errors were present in 10.9% (n = 14), 4.7% (n = 6), and 0.8% (n = 1) of notes, respectively. There were no suffix or dictionary errors. Inter-annotator agreement was 97.8%. Conclusions: This is the first study to classify speech recognition errors in dictated emergency department notes. Speech recognition errors occur commonly, with annunciation errors being the most frequent. Error rates were comparable to, if not lower than, those of previous studies. 15% of errors were deemed critical, potentially leading to miscommunication that could affect patient care.
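The arithmetic behind these results is simple to reproduce. Below is a minimal Python sketch, using invented annotations rather than the study's data, of how per-note error rates, the critical-error share, and the distribution over the eight error types could be tallied from independent reviewer annotations.

```python
from collections import Counter

# Hypothetical annotations, invented for illustration: for each note,
# a list of (error_type, is_critical) pairs produced by the reviewers.
annotated_notes = [
    [("annunciation", False), ("deletion", True)],
    [],
    [("added words", False)],
]

type_counts = Counter()
n_errors = n_critical = notes_with_errors = 0
for note in annotated_notes:
    if note:
        notes_with_errors += 1
    for error_type, is_critical in note:
        type_counts[error_type] += 1
        n_errors += 1
        n_critical += is_critical  # booleans sum as 0/1

print(f"errors per note: {n_errors / len(annotated_notes):.1f}")
print(f"critical errors: {n_critical / n_errors:.1%}")
print(f"notes with errors: {notes_with_errors / len(annotated_notes):.0%}")
print(type_counts.most_common())
```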

3.
The purpose of this study was to ascertain the error rates of a voice recognition (VR) dictation system. We compared our results with several other articles and discuss the pros and cons of using such a system. The study was performed at the Southern Health Department of Diagnostic Imaging, Melbourne, Victoria, using the GE RIS with the Powerscribe 3.5 VR system. Fifty random finalized reports from 19 radiologists obtained between June 2008 and November 2008 were scrutinized for errors in six categories, including wrong-word substitution, deletion, punctuation, other, and nonsense phrase. Reports were also divided into two categories: computed radiography (CR = plain film) and non-CR (ultrasound, computed tomography, magnetic resonance imaging, nuclear medicine, and angiographic examinations). Errors were divided into two categories: significant but not likely to alter patient management, and very significant, where the meaning of the report was affected and patient management thus potentially impacted (nonsense phrases). Three hundred seventy-nine finalized CR reports and 631 finalized non-CR reports were examined. Eleven percent of the reports in the CR group had errors; 2% of these reports contained nonsense phrases. Thirty-six percent of the reports in the non-CR group had errors, and of these, 5% contained nonsense phrases. A VR dictation system is a double-edged sword: while there are many benefits, there are also many pitfalls. We hope that raising awareness of the error rates will aid efforts to reduce them and to strike a balance between the quality and speed of generated reports.

4.
The aim of this study was to retrospectively analyze the influence of different acoustic and language models in order to determine the effects most important to the clinical performance of an Estonian-language, non-commercial, radiology-oriented automatic speech recognition (ASR) system. The ASR system was developed for the radiology domain in Estonian by utilizing open-source software components (the Kaldi toolkit and Thrax). It was trained with real radiology text reports and dictations collected during the development phases. The final version of the ASR system was tested by 11 radiologists who dictated 219 reports in total, in a spontaneous manner, in a real clinical environment. The audio files collected in the final phase were used to measure the performance of different versions of the ASR system retrospectively. ASR system versions were evaluated by word error rate (WER) for each speaker and modality, and by the WER difference between the first and last versions of the ASR system. The total average WER across all material improved from 18.4% for the first version (v1) to 5.8% for the last version (v8), corresponding to a relative improvement of 68.5%. WER improvement was strongly related to modality and radiologist. In summary, the performance of the final ASR system version was close to optimal, delivering similar results across all modalities and being independent of user, the complexity of the radiology reports, user experience, and speech characteristics.
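WER, the metric used here, is the word-level edit distance between the reference transcript and the ASR output, divided by the reference length. Below is a minimal Python sketch of that computation (not the authors' Kaldi-based evaluation pipeline), together with the relative-improvement arithmetic from the abstract.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("no acute intracranial abnormality", "no acute cranial abnormality"))  # 0.25

# Relative improvement between system versions, as reported in the abstract:
wer_v1, wer_v8 = 0.184, 0.058
print(f"relative improvement: {(wer_v1 - wer_v8) / wer_v1:.1%}")  # ~68.5%
```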

5.
Speech recognition systems have become increasingly popular as a means of producing radiology reports, for reasons of both efficiency and cost. However, the suboptimal recognition accuracy of these systems can affect the productivity of the radiologists creating the text reports. We analyzed a database of over two million de-identified radiology reports to identify the strongest determinants of word frequency. Our results showed that body site and imaging modality had an influence on the frequency of words and of three-word phrases similar to that of the identity of the speaker. These findings suggest that the accuracy of speech recognition systems could be significantly enhanced by further tailoring their language models to body site and imaging modality, which are readily available at the time of report creation.
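As a concrete illustration of the kind of analysis described, the sketch below tallies word and three-word-phrase frequencies conditioned on modality and body site; the record structure and field names are assumptions for illustration, not the authors' schema.

```python
from collections import Counter, defaultdict

def ngrams(tokens, n):
    """Yield consecutive n-word tuples from a token list."""
    return zip(*(tokens[i:] for i in range(n)))

# Hypothetical report records; field names and texts are invented.
reports = [
    {"modality": "CT", "body_site": "chest", "text": "no acute pulmonary embolism"},
    {"modality": "MR", "body_site": "brain", "text": "no acute intracranial abnormality"},
]

unigrams = defaultdict(Counter)  # (modality, body_site) -> word counts
trigrams = defaultdict(Counter)  # (modality, body_site) -> 3-word phrase counts
for r in reports:
    key = (r["modality"], r["body_site"])
    tokens = r["text"].lower().split()
    unigrams[key].update(tokens)
    trigrams[key].update(ngrams(tokens, 3))

print(unigrams[("CT", "chest")].most_common(3))
print(trigrams[("MR", "brain")].most_common(1))
```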

6.
Speech error data and empirical studies suggest that the scope of planning in speech production is larger for semantic than for phonological form representations. Previous results have demonstrated that some patients show dissociable impairments in the short-term memory (STM) retention of semantic and phonological codes. The effect of these STM deficits on speech production was investigated using a phrase production paradigm that manipulated the semantic relatedness of the words in the phrase. Subjects produced a conjoined noun phrase to describe two pictures (e.g., "ball and hat") or produced the same phrases in response to pairs of written words. In the picture naming condition, control subjects showed an interference effect for semantically related pictures relative to unrelated pictures. This interference effect was greatly exaggerated for two patients with semantic STM deficits, but not for a patient with a phonological STM deficit. For the written words, control subjects showed a small facilitatory effect on the onset of phrases containing semantically related words. One of the patients with a semantic STM deficit who was tested on picture naming was also tested on these materials and showed a small facilitatory effect within the range of controls. The findings support the contention that speech planning at the lexical-semantic level is carried out over a phrasal scope, and that the capacities that support semantic retention in list recall also support speech production planning.

7.
Objective: Patients communicate with healthcare providers via secure messaging in patient portals. As patient portal adoption increases, growing messaging volumes may overwhelm providers. Prior research has demonstrated promise in automating the classification of patient portal messages into communication types to support message triage or answering. This paper examines whether using semantic features and word context improves portal message classification. Materials and methods: Portal messages were classified into the following categories: informational, medical, social, and logistical. We constructed features from portal messages including bag of words, bag of phrases, graph representations, and word embeddings. We trained one-versus-all random forest and logistic regression classifiers, and a convolutional neural network (CNN) with a softmax output. We evaluated each classifier's performance using the Area Under the Curve (AUC). Results: Representing the messages using bag of words, the random forest detected informational, medical, social, and logistical communications in patient portal messages with AUCs of 0.803, 0.884, 0.828, and 0.928, respectively. Graph representations of messages outperformed simpler features, with AUCs of 0.837, 0.914, 0.846, and 0.884 for informational, medical, social, and logistical communication, respectively. Representing words with Word2Vec embeddings and mapping features using a CNN had the best performance, with AUCs of 0.908 for informational, 0.917 for medical, 0.935 for social, and 0.943 for logistical categories. Discussion and conclusion: Word2Vec and graph representations improved the accuracy of classifying portal messages compared to features that lack semantic information, such as bag of words and bag of phrases. Furthermore, using Word2Vec along with a CNN model, which provides a higher-order representation, improved the classification of portal messages.
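For readers unfamiliar with the one-versus-all setup, the sketch below trains a binary logistic regression per category on bag-of-words features and scores it with AUC, using scikit-learn and a few invented messages. It is a toy illustration of the evaluation scheme, not the authors' pipeline, and in practice the AUC would be computed on held-out data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Hypothetical training data; texts and labels are invented for illustration.
texts = [
    "when is my next appointment",
    "my incision is red and painful",
    "please thank dr smith for me",
    "which pharmacy did you send it to",
]
labels = ["logistical", "medical", "social", "logistical"]

vectorizer = CountVectorizer()  # bag-of-words features
X = vectorizer.fit_transform(texts)

# One-versus-all: one binary classifier per communication type.
for category in ["informational", "medical", "social", "logistical"]:
    y = [1 if lab == category else 0 for lab in labels]
    if len(set(y)) < 2:
        continue  # need both classes present to fit and score
    clf = LogisticRegression().fit(X, y)
    scores = clf.predict_proba(X)[:, 1]
    # Scored on training data here only for brevity; use a held-out set in practice.
    print(category, roc_auc_score(y, scores))
```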

8.
We report the case of a patient suffering from a severe neologistic jargon that spared number words. Neologisms resulted from pervasive phoneme substitutions, with frequent preservation of the overall syllabic structure (e.g., /revolver/ → /reveltil/). Word and nonword reading, as well as picture naming, were equally affected. No significant influence of frequency, imageability, or grammatical class was found. In striking contrast with this severe speech impairment, the patient made virtually no phonological errors when reading aloud Arabic or spelled-out numerals, but made frequent word selection errors (e.g., 250 read as "four hundred and sixty"). This observation indicates that during speech planning, different categories of words are processed by separable brain systems down to the level of phoneme selection, a more peripheral level than was previously assumed. Number words may be singled out during phonological processing either because they constitute a particular semantic category, because they benefit from special brain mechanisms devoted to the production of "automatic speech", or because they are the elementary building blocks of speech during the production of complex numerals.

9.
PURPOSE: Syndromic surveillance is aimed at the early detection of disease outbreaks. An important data source for syndromic surveillance is free-text chief complaints (CCs), which may be recorded in different languages. For automated syndromic surveillance, CCs must be classified into predefined syndromic categories to facilitate subsequent data aggregation and analysis. Despite the fact that syndromic surveillance is largely an international effort, existing CC classification systems do not provide adequate support for processing CCs recorded in non-English languages. This paper reports a multilingual CC classification effort, focusing on CCs recorded in Chinese. METHODS: We propose a novel Chinese CC classification system leveraging a Chinese-English translation module and an existing English CC classification approach. A set of 470 Chinese key phrases was extracted from about one million Chinese CC records using statistical methods. Based on the extracted key phrases, the system translates Chinese text into English and classifies the translated CCs into syndromic categories using an existing English CC classification system. RESULTS: Compared to alternative approaches using a bilingual dictionary and a general-purpose machine translation system, our approach performs significantly better in terms of positive predictive value (PPV, or precision), sensitivity (recall), specificity, and F measure (the harmonic mean of PPV and sensitivity), based on a computational experiment using real-world CC records. CONCLUSIONS: Our design provides satisfactory performance in classifying Chinese CCs into syndromic categories for public health surveillance. The overall design of our system also points to a potentially fruitful direction for multilingual CC systems that need to handle languages beyond English and Chinese.
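The core mechanism, replacing extracted Chinese key phrases with English equivalents and then classifying the translated text, can be sketched as follows. The phrase table and syndrome keyword lists here are invented for illustration and are far smaller than the study's 470-phrase set and its English classification system.

```python
# Hypothetical key-phrase table; entries invented for illustration.
zh_en_phrases = {
    "发烧": "fever",
    "咳嗽": "cough",
    "腹痛": "abdominal pain",
}

# Hypothetical keyword lists standing in for an English CC classifier.
syndrome_keywords = {
    "respiratory": {"cough", "shortness of breath"},
    "gastrointestinal": {"abdominal pain", "vomiting"},
    "constitutional": {"fever", "chills"},
}

def translate_cc(cc_zh):
    """Replace known Chinese key phrases with English equivalents (longest first)."""
    for zh in sorted(zh_en_phrases, key=len, reverse=True):
        cc_zh = cc_zh.replace(zh, " " + zh_en_phrases[zh] + " ")
    return " ".join(cc_zh.split())

def classify(cc_en):
    """Map a translated chief complaint to syndromic categories."""
    return [s for s, kws in syndrome_keywords.items()
            if any(kw in cc_en for kw in kws)]

print(classify(translate_cc("发烧咳嗽")))  # ['respiratory', 'constitutional']
```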

10.
BACKGROUND: Many types of medical errors occur in and outside of hospitals, some of which have very serious consequences and increase cost. Identifying errors is a critical step toward managing and preventing them. In this study, we assessed the explicit reporting of medical errors in the electronic record. METHOD: We used five search terms, "mistake," "error," "incorrect," "inadvertent," and "iatrogenic," to survey several sets of narrative reports, including discharge summaries, sign-out notes, and outpatient notes from 1991 to 2000. We manually reviewed all positive cases and confirmed them based on the physicians' own reporting. RESULT: We identified 222 explicitly reported medical errors. The positive predictive value varied with different keywords. In general, the positive predictive value for each keyword was low, ranging from 3.4 to 24.4%. Therapeutic-related errors were the most commonly reported, and these were mainly medication errors. CONCLUSION: Keyword searches combined with manual review identified some medical errors that were reported in medical records. The approach had low sensitivity and a moderate positive predictive value, which varied by search term. Physicians were most likely to record errors in the Hospital Course and History of Present Illness sections of discharge summaries. The reported errors in medical records covered a broad range and involved several types of care providers as well as non-healthcare professionals.
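A keyword survey like this one reduces to a regular-expression search plus a manual-review step. The sketch below shows the search and the per-keyword positive predictive value computation, with invented notes standing in for the chart-review annotations.

```python
import re

ERROR_KEYWORDS = ["mistake", "error", "incorrect", "inadvertent", "iatrogenic"]

# Hypothetical annotated notes: (text, is_true_error) pairs invented for
# illustration; the truth label comes from manual review.
notes = [
    ("medication error: wrong dose of heparin administered", True),
    ("standard error of the estimate was 0.2", False),
    ("no complications during the hospital course", False),
]

for kw in ERROR_KEYWORDS:
    pattern = re.compile(r"\b" + kw + r"\b", re.IGNORECASE)
    flagged = [(text, truth) for text, truth in notes if pattern.search(text)]
    if flagged:
        ppv = sum(truth for _, truth in flagged) / len(flagged)
        print(f"{kw}: {len(flagged)} flagged, PPV = {ppv:.0%}")
```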

11.
We tested the hypothesis that in spatial stimulus-response compatibility (SRC) tasks two different error types occur: a noise-induced "general error," independent of SRC and reaction time (RT), and a "position-driven error" in incompatible trials with short RTs, driven by the irrelevant stimulus position. A second issue was whether error detection differs for these two types of errors, which should be reflected in differences in the error negativity (Ne), since the Ne is seen as a neural correlate of error detection. To study these issues, we used a Simon task and a spatial Stroop task. In incompatible (vs. compatible) trials we found more errors and below-chance accuracy in fast responses. Neither the amplitude nor the latency of the Ne was significantly affected by the experimental factors. This pattern of behavioural results supports the above hypothesis of two error types in such tasks. The Ne results indicate that error detection is similar for both types of errors.

12.
The aims of this work were to measure the accuracy of one continuous speech recognition product, its dependence on the speaker's gender and status as a native or nonnative English speaker, and the product's potential for routine use in transcribing radiology reports. IBM MedSpeak/Radiology software, version 1.1, was evaluated by 6 speakers. Two were nonnative English speakers, and 3 were men. Each speaker dictated a set of 12 reports. The reports included neurologic and body imaging examinations performed with 6 different modalities. The dictated and original report texts were compared, and error rates for overall, significant, and subtle significant errors were computed. The dependence of error rates on modality, native English speaker status, and gender was evaluated by performing t tests. The overall error rate was 10.3 ± 3.3%. No difference in accuracy between men and women was found; however, significant differences were seen for overall and significant errors when comparing native and nonnative English speakers (P = .009 and P = .008, respectively). The speech recognition software is approximately 90% accurate, and while practical implementation issues (rather than accuracy) currently limit routine use of this product throughout a radiology practice, application in niche areas such as the emergency room is currently being pursued. This methodology provides a convenient way to compare the initial accuracy of different speech recognition products, and changes in accuracy over time, in a detailed and sensitive manner.
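The statistical comparison here is a standard two-sample t test. A minimal sketch with SciPy follows, using invented per-speaker error rates in place of the study's measurements.

```python
from scipy.stats import ttest_ind

# Hypothetical per-report overall error rates (%), invented for illustration;
# the study compared native vs. nonnative English speakers.
native_rates = [8.1, 9.4, 7.6, 10.2, 8.8, 9.0]
nonnative_rates = [13.5, 12.1, 14.8, 13.0, 12.6, 15.2]

t_stat, p_value = ttest_ind(native_rates, nonnative_rates)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```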

13.
We have recently provided evidence that an error-related negativity (ERN), an ERP component generated within medial-frontal cortex, is elicited by errors made during the performance of a continuous tracking task (O.E. Krigolson & C.B. Holroyd, 2006). In the present study we conducted two experiments to investigate the ability of the medial-frontal error system to evaluate predictive error information. In both experiments, participants used a joystick to perform a computer-based continuous tracking task in which some tracking errors were inevitable, and half of these errors were preceded by a predictive cue. The results of both experiments indicated that an ERN-like waveform was elicited by tracking errors. Furthermore, in both experiments the predicted error waveforms had an earlier peak latency than the unpredicted error waveforms. These results demonstrate that the medial-frontal error system can evaluate predictive error information.

14.
Radiology report errors occur for many reasons, including the use of pre-filled report templates, wrong-word substitution, nonsensical phrases, and missing words. Reports may also contain clinical errors that are not specific to speech recognition, including wrong laterality and gender-specific discrepancies. Our goal was to create a custom algorithm to detect potential gender and laterality mismatch errors and to notify the interpreting radiologists for rapid correction. A JavaScript algorithm was devised to flag gender and laterality mismatch errors by searching the text of the report for keywords and comparing them to parameters within the study's HL7 metadata (i.e., procedure type and patient sex). The error detection algorithm was applied retrospectively to 82,353 reports from the 4 months prior to its development and then prospectively to 309,304 reports over the 15 months after implementation. Flagged reports were reviewed individually by two radiologists to confirm a true gender or laterality error and to determine whether the errors were ultimately corrected. There was significant improvement in the rate of flagged reports (pre, 198/82,353 [0.24%]; post, 628/309,304 [0.20%]; P = 0.04) and of reports containing confirmed gender or laterality errors (pre, 116/82,353 [0.14%]; post, 285/309,304 [0.09%]; P < 0.0001) after implementing our error notification system. The proportion of flagged reports containing an error that were ultimately corrected improved dramatically after implementing the notification system (pre, 17/116 [15%]; post, 239/285 [84%]; P < 0.0001). We developed a successful automated tool for detecting and notifying radiologists of potential gender and laterality errors, allowing for rapid report correction and reducing the overall rate of report errors.
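The study implemented its checks in JavaScript against HL7 metadata; the sketch below reimplements the same idea in Python, with invented keyword lists, to show the shape of such a rule: flag a report when its text contains terms inconsistent with the recorded patient sex, or when it mentions only the side opposite to the one named in the procedure.

```python
import re

# Invented term lists for illustration: words that conflict with the
# patient sex recorded in the HL7 metadata.
CONFLICTING_TERMS = {
    "M": re.compile(r"\b(she|her|uterus|ovary|ovarian)\b", re.IGNORECASE),
    "F": re.compile(r"\b(he|his|prostate|testis|testicular)\b", re.IGNORECASE),
}

def find_mismatches(report_text, procedure, patient_sex):
    """Flag potential gender and laterality mismatches for radiologist review."""
    flags = []
    text = report_text.lower()
    # Gender check: the report mentions pronouns/anatomy of the opposite sex.
    pattern = CONFLICTING_TERMS.get(patient_sex)
    if pattern and pattern.search(report_text):
        flags.append("possible gender mismatch")
    # Laterality check: the procedure names one side but the report
    # mentions only the other (simple substring test; a production rule
    # would use word boundaries and negation handling).
    for side, other in (("left", "right"), ("right", "left")):
        if side in procedure.lower() and other in text and side not in text:
            flags.append("possible laterality mismatch")
    return flags

print(find_mismatches("The right kidney is unremarkable.", "US LEFT KIDNEY", "F"))
# -> ['possible laterality mismatch']
```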

15.
We investigated whether speakers can use an internal channel to monitor their speech for taboo utterances and prevent them from being spoken aloud. To this end, event-related potentials were measured while participants carried out the SLIP task, in which speech errors were elicited that could result in either taboo words (taboo-eliciting trials) or neutral words (neutral-eliciting trials). In taboo-eliciting trials, there was an augmented negative wave around 600 ms after the pronunciation cue, even though there were no overt errors. This component has previously been interpreted as reflecting conflict. These results indicate that taboo utterances can indeed be detected and corrected internally.

16.
Automatic speech recognition (ASR) can provide a rapid means of controlling electronic assistive technology. Off-the-shelf ASR systems function poorly for users with severe dysarthria because of the increased variability of their articulations. We have developed a limited-vocabulary, speaker-dependent speech recognition application with greater tolerance to variability of speech, coupled with a computerised training package that assists dysarthric speakers in improving the consistency of their vocalisations and provides more data for recogniser training. These applications, and their implementation as the interface for a speech-controlled environmental control system (ECS), are described. The results of field trials to evaluate the training program and the speech-controlled ECS are presented. The user-training phase increased the recognition rate from 88.5% to 95.4% (p < 0.001). Recognition rates in everyday usage in the home were good even for people with the most severe dysarthria (mean word recognition rate 86.9%). The speech-controlled ECS was less accurate than switch-scanning systems (mean task completion accuracy 78.6% versus 94.8%) but faster to use, even taking into account the need to repeat unsuccessful operations (mean task completion time 7.7 s versus 16.9 s, p < 0.001). It is concluded that a speech-controlled ECS is a viable alternative to switch-scanning systems for some people with severe dysarthria and would lead, in many cases, to more efficient control of the home.
