Similar articles
1.
For clinical assessment as well as student training, there is a need for information pertaining to the perceptual dimensions of dysphonic voice. To this end, 24 naive listeners judged the similarity of 10 female and 10 male vowel samples, selected from within a narrow range of fundamental frequencies. Most of the perceptual variance for both sets of voices was associated with "degree of abnormality" as reflected by perceptual ratings as well as combined acoustic measures, based upon filtered and unfiltered signals. A second perceptual dimension for female voices was associated with high frequency noise as reflected by two acoustic measures: breathiness index (BRI) and a high-frequency power ratio. A second perceptual dimension for male voices was associated with a breathy-overtight continuum as reflected by period deviation (PDdev) and perceptual ratings of breathiness. Results are discussed in terms of perceptual training and the clinical assessment of pathological voices.

2.
Listener experience and perception of voice quality   (cited 9 times: 0 self-citations, 9 by others)
Five speech-language clinicians and 5 naive listeners rated the similarity of pairs of normal and dysphonic voices. Multidimensional scaling was used to determine the voice characteristics that were perceptually important for each voice set and listener group. Solution spaces were compared to determine if clinical experience affects perceptual strategies. Naive and expert listeners attended to different aspects of voice quality when judging the similarity of voices, for both normal and pathological voices. All naive listeners used similar perceptual strategies; however, individual clinicians differed substantially in the parameters they considered important when judging similarity. These differences were large enough to suggest that care must be taken when using data averaged across clinicians, because averaging obscures important aspects of an individual's perceptual behavior.

3.
Listeners judged the dissimilarity of pairs of synthesized nasal voices that varied on 3 dimensions. Separate nonmetric multidimensional scaling (MDS) solutions were calculated for each listener and the group. Similar 3-dimensional solutions were derived for the group and each of the listeners, with the group MDS solution accounting for 83% of the total variance in listeners' judgments. Dimension 1 ("Nasality") accounted for 54% of the variance, Dimension 2 ("Loudness") for 18% of the variance, and Dimension 3 ("Pitch") for 11% of the variance. The 3 dimensions were significantly and positively correlated with objective measures of nasalization, intensity, and fundamental frequency. The results of this experiment are discussed in relation to other MDS studies of voice perception, along with methodological issues for future research.

4.
The present study was conducted to investigate voice quality in tracheoesophageal speech by means of perceptual evaluations and to develop a clinically useful subset of perceptual scales sufficient for these perceptual evaluations. The perceptual ratings were obtained from both naive and trained raters (speech-language pathologists [SLPs]) after listening to a read-aloud text. The perceptual evaluations were performed by means of 19 semantic bipolar 7-point scales for the naive raters and 20 semantic bipolar 7-point scales for the trained raters. The trained raters were also asked to judge the overall voice quality as good, reasonable, or poor. Both naive listeners and trained SLPs were able to perform reliable perceptual judgments. Naive raters judged the tracheoesophageal voice as more deviant than the trained raters did. Naive raters made judgments based on 2 underlying perceptual dimensions (voice quality and pitch), whereas the trained raters made judgments based on 4 underlying perceptual dimensions (voice quality, tonicity, pitch, and tempo). These perceptual dimensions were further subdivided into a subset of 4 perceptual scales for the naive raters and a subset of 8 perceptual scales for the trained raters. This appeared to provide a sufficient coverage of the underlying perceptual dimensions used by the listeners.

5.
The literature has noted that speakers often perceive their own speaking pitch levels differently than listeners perceive them. However, little information is available regarding the specific characteristics of such perceptual differences. Speaking pitch level self-perception was explored in a group of 11 young adult males who served both as talkers and listeners. As a talker, each subject judged his own speaking pitch level in the process of speaking (live judgments) and during taped replay (taped judgments). The subjects' self-rankings in these two tasks and the rank order of taped voices as judged by listeners were compared to fundamental frequency rankings for the voices. The results indicated that the subjects judged their own taped voices in the same way that the listeners judged them, and the judgments corresponded to fundamental frequency rankings. During the live judgments, the subjects avoided extreme self-rankings, preferring to rank themselves closer to an average pitch level. The findings may have clinical significance in the remediation of certain voice disorders.

6.
Zur auditiven Bewertung der Stimmqualität [On the auditory assessment of voice quality]   (cited 2 times: 0 self-citations, 2 by others)
Ptok M, Schwemmle C, Iven C, Jessen M, Nawka T. HNO, 2006, 54(10): 793-802
BACKGROUND: For routine clinical purposes, dysphonic voices are assessed perceptually using the GRBAS scale or analogues. For clinical application, the crucial question is the interrater reliability (IRR) of the auditory perceptual assessment of voice quality. Therefore, the IRR of the four-point RBH (roughness, breathiness, hoarseness vs overall grade) scale was studied. Other parameters, e.g. validity and intrarater reliability, were not considered. METHODS: A total of 78 patients read a standard text, "Der Nordwind und die Sonne". These samples were evaluated by 19 speech and voice therapy students according to the degree of roughness, breathiness and hoarseness. Data were subjected to reliability analysis. RESULTS: Our data indicate a high IRR with a Cronbach's alpha of 0.94. No single rating of the 19 raters could be omitted without decreasing the IRR. DISCUSSION: The data indicate that the perceptual assessment of hoarseness for running speech is highly reliable. The application of the RBH scale is suitable for clinical purposes. It should be considered as an outcome measure.
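The reliability statistic reported in this abstract, Cronbach's alpha, can be illustrated with a minimal Python sketch. It assumes the standard formula alpha = k/(k-1) × (1 − Σ rater variances / variance of summed ratings), with raters treated as the "items". The function name `cronbach_alpha` and the toy ratings are hypothetical illustrations and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a ratings table.

    ratings: list of rows, one row per speech sample,
             one column per rater (assumed layout).
    """
    k = len(ratings[0])  # number of raters
    # Sample variance of each rater's scores across all speech samples
    rater_vars = [variance([row[j] for row in ratings]) for j in range(k)]
    # Sample variance of the per-sample sum of all raters' scores
    total_var = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(rater_vars) / total_var)


# Hypothetical 4-point (0-3) ratings: 4 samples rated by 3 raters
toy_ratings = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 3, 2],
    [3, 2, 3],
]
alpha = cronbach_alpha(toy_ratings)
```

Raters who agree perfectly give an alpha of 1; disagreement between raters lowers the value.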

7.
Eadie TL, Doyle PC. The Laryngoscope, 2004, 114(4): 753-759
OBJECTIVES/HYPOTHESIS: The purposes of the study were to determine listeners' auditory-perceptual ratings of tracheoesophageal speakers, to determine quality of life in tracheoesophageal speakers, and to determine the potential relationship between listeners' ratings of speech and tracheoesophageal speakers' self-rated quality of life. STUDY DESIGN: Twenty-eight laryngectomized individuals who used tracheoesophageal speech as their primary mode of communication were studied. Fifteen naïve listeners provided auditory-perceptual ratings. METHODS: Twenty-eight tracheoesophageal speakers (22 men and 6 women) completed a general information form, in addition to the University of Michigan Head and Neck Quality of Life (HNQOL) instrument; speakers also provided connected speech samples of a standard passage. Fifteen naïve listeners evaluated the tracheoesophageal speech samples for overall speech severity, naturalness, acceptability, and pleasantness using direct magnitude estimation procedures. RESULTS: Listeners were able to discriminate among tracheoesophageal speech samples relative to the auditory-perceptual dimensions. Male tracheoesophageal speakers were judged as having significantly better, more acceptable, and more pleasant voices than women. Scores on the HNQOL instrument were determined to be higher among the group of tracheoesophageal speakers in the present study than those reported in previous studies. No significant differences were found between men and women for quality of life scores. Quality of life domains and auditory-perceptual judgments of tracheoesophageal speech were moderately correlated. CONCLUSION: Women who use tracheoesophageal speech may be differentially penalized for dimensions related to voice quality. Limitations in voice did not necessarily translate into worse overall quality of life, indicating that auditory-perceptual evaluation and quality of life questionnaires are evaluating different aspects of function after laryngectomy.

8.
This study aimed to evaluate the reliability and sensitivity to change of three commonly used acoustic parameters as measured by the Multi-Dimensional Voice Programme (MDVP): jitter, shimmer and noise-to-harmonic ratio. A total of 231 subjects' voices were recorded and analysed. The sample comprised 145 dysphonic patients who received intervention (surgery or voice therapy), 36 dysphonic patients who received no intervention, and 50 non-dysphonic (normal) subjects. All voices were recorded and analysed on two occasions (before and after treatment, or test-retest assessment) using a standard procedure. These data were analysed using standard psychometric procedures for assessing reliability and responsiveness. The acoustic analysis measures demonstrated poor to moderate reliability and effect size with respect to their sensitivity to change. Computer-based acoustic analysis systems should be used with caution as an isolated measure of voice outcome in any clinical trial of interventions aimed at improving voice quality.

9.
Non-organic dysphonia. II. Phonetograms for normal and pathological voices   (cited 1 time: 0 self-citations, 1 by others)
The clinical usefulness of the phonetogram, i.e. a graph showing the sound pressure level (SPL) of softest and loudest possible phonation over the entire fundamental frequency range of a voice, was investigated. Phonetograms of 29 female non-organic dysphonic patients, 17 healthy female subjects, 18 non-organic dysphonic male patients and 12 healthy male subjects were compared. The female patients showed significantly lower SPL values for loudest phonation when compared with healthy female subjects, while no significant difference was seen in the male subjects in this regard. With respect to the SPL values for softest phonation, on the other hand, the male dysphonic patients showed significantly higher SPL values than the healthy male subjects, whereas no significant difference was seen in the female subjects. Spectrum analysis showed that the patients had a more dominating fundamental in loud phonation than did the healthy voices.

10.
Spasmodic dysphonia voices form, in the same way as substitution voices, a particular category of dysphonia that seems not suited for a standardized basic multidimensional assessment protocol, like the one proposed by the European Laryngological Society. Thirty-three exhaustive analyses were performed on voices of 19 patients diagnosed with adductor spasmodic dysphonia (SD), before and after treatment with botulinum toxin. The speech material consisted of 40 short sentences phonetically selected for constant voicing. Seven perceptual parameters (traditional and dedicated) were blindly rated by a panel of experienced clinicians. Nine acoustic measures (mainly based on voicing evidence and periodicity) were obtained with a special analysis program suited for strongly irregular signals and validated with synthesized deviant voices. Patients also filled in a VHI questionnaire. Significant improvement is shown by all three approaches. The traditional GRB perceptual parameters appear to be adequate for these patients. The special acoustic analysis program is likewise successful in objectifying the improved regularity of vocal fold vibration: the basic jitter remains the most valuable parameter, when reliably quantified. The VHI is well suited for the voice-related quality of life. Nevertheless, when considering pre-therapy and post-therapy changes, the current study illustrates a complete lack of correlation between the perceptual, acoustic, and self-assessment dimensions. Assessment of SD voices needs to be tridimensional.

11.
The vowel [a:] in a test word, judged normal or dysphonic, was examined with the Self-Organizing Map, the artificial neural network algorithm of Kohonen. The algorithm produces two-dimensional representations (maps) of speech. Input to the acoustic maps consisted of 15-component spectral vectors calculated at 9.83-msec intervals from short-time power spectra. The male and female maps were first calculated from the speech of healthy subjects and then the [a:] samples (15 successive spectral vectors) were examined on the maps. The dysphonic voices deviated from the norm both in the composition of the short-time power spectra (characterized by the dislocation of the trajectory pattern on the map) and in the stability of the spectrum during the performance (characterized by the pattern of the trajectory on the map). Rough voices were distinguished from breathy ones by their patterns on the map. With the limited speech material, an index for the degree of pathology could not be determined. A self-organized acoustic map provides an on-line visual representation of voice and speech in an easily understandable form. The method is thus suitable not only for diagnostic but also for educational and therapeutic purposes.

12.
Judgments of consonant similarity were obtained from subjects who had normal hearing, high-frequency sensorineural hearing loss, or relatively flat sensorineural hearing loss. The individual differences model through program INDSCAL was used to derive a set of perceptual features empirically from the similarity judgments, and to group the subjects on the basis of strength of feature usage. The analysis revealed that sonorance was the dominant dimension in the similarity judgments of the subjects with high-frequency hearing losses, while sibilance tended to dominate the judgments of the subjects with flat audiometric configurations. The normal-hearing subjects tended to weight these two dimensions approximately equally. These differences in similarity judgments were observed based upon audiometric configuration, despite the fact that the two hearing-impaired groups were not unique in word-recognition ability.

13.
The vocal quality of a patient is modeled by means of a Dysphonia Severity Index (DSI), which is designed to establish an objective and quantitative correlate of the perceived vocal quality. The DSI is based on the weighted combination of the following selected set of voice measurements: highest frequency (F0-High in Hz), lowest intensity (I-Low in dB), maximum phonation time (MPT in s), and jitter (%). The DSI is derived from a multivariate analysis of 387 subjects with the goal of describing, purely based on objective measures, the perceived voice quality. It is constructed as DSI = 0.13 × MPT + 0.0053 × F0-High - 0.26 × I-Low - 1.18 × Jitter (%) + 12.4. The DSI for perceptually normal voices equals +5 and for severely dysphonic voices -5. The more negative the patient's index, the worse his or her vocal quality. As such, the DSI is especially useful to evaluate therapeutic evolution of dysphonic patients. Additionally, there is a high correlation between the DSI and the Voice Handicap Index score.
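The DSI formula in this abstract translates directly into code. The following is a minimal Python sketch using the coefficients exactly as stated above. The function name `dysphonia_severity_index`, the parameter names, and the input values are hypothetical and chosen only to illustrate the arithmetic.

```python
def dysphonia_severity_index(mpt_s, f0_high_hz, i_low_db, jitter_pct):
    """Weighted combination of the four voice measurements named in the abstract.

    mpt_s:       maximum phonation time in seconds
    f0_high_hz:  highest fundamental frequency in Hz
    i_low_db:    lowest intensity in dB
    jitter_pct:  jitter in percent
    """
    return (0.13 * mpt_s
            + 0.0053 * f0_high_hz
            - 0.26 * i_low_db
            - 1.18 * jitter_pct
            + 12.4)


# Hypothetical measurements, not taken from the study
example_dsi = dysphonia_severity_index(
    mpt_s=20.0, f0_high_hz=800.0, i_low_db=50.0, jitter_pct=0.5
)
# 2.6 + 4.24 - 13.0 - 0.59 + 12.4 = 5.65
```

With these made-up inputs the index comes out near the +5 that the abstract associates with perceptually normal voices. A shorter phonation time, a lower top pitch, a louder minimum intensity or more jitter each push the value down.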

14.
Glottographic signals may be superior to acoustic signals for tracking glottal source perturbations, since supraglottal vocal tract effects on glottographic signals are relatively minimal compared with the acoustic signal as measured beyond the lips. This study compared the ability of differing signals to differentiate among normal voices and abnormal voices that were due to two categories of biomechanical disease. Acoustic, electroglottographic, and photoglottographic signals recorded during vowel phonation sustained by 26 normal subjects and 65 patients were measured for perturbations of frequency and amplitude. One-way analysis of variance (ANOVA) revealed that amplitude perturbation measures from photoglottographic signals significantly differentiated neuromuscular from mass lesion sources of dysphonia. Acoustic and electroglottographic signal perturbations differentiated between normal and abnormal voices but did not distinguish between the dysphonic characteristics of neuromuscular disorders and those of mass lesions of the vocal folds.

15.
Vocal pathology is very common among teaching professionals. We selected 140 teachers, 70 with normal voices and 70 with dysphonia. All of them underwent a complete voice examination: aerodynamic tests, pitch and range of the voice, perceptual evaluation of the voice with the GRBAS scale, and videolaryngostroboscopy for diagnosis. The dysphonic teachers showed poor pneumophonic coordination and greater or inadequate use of the respiratory and/or laryngeal musculature. The dysphonic teachers had more pathological voices on the GRBAS scale than the normal teachers.

16.
PURPOSE: To investigate training-related changes in acoustic-phonetic representation of consonants produced by a text-to-speech (TTS) computer speech synthesizer. METHOD: Forty-eight adult listeners were trained to better recognize words produced by a TTS system. Nine additional untrained participants served as controls. Before and after training, participants were tested on consonant recognition and made pairwise judgments of consonant dissimilarity for subsequent multidimensional scaling (MDS) analysis. RESULTS: Word recognition training significantly improved performance on consonant identification, although listeners never received specific training on phoneme recognition. Data from 31 participants showing clear evidence of learning (improvement ≥ 10 percentage points) were further investigated using MDS and analysis of confusion matrices. Results show that training altered listeners' treatment of particular acoustic cues, resulting in both increased within-class similarity and between-class distinctiveness. Some changes were consistent with current models of perceptual learning, but others were not. CONCLUSION: Training caused listeners to interpret the acoustic properties of synthetic speech more like those of natural speech, in a manner consistent with a flexible-feature model of perceptual learning. Further research is necessary to refine these conclusions and to investigate their applicability to other training-related changes in intelligibility (e.g., associated with learning to better understand dysarthric speech or foreign accents).

17.
This research note describes the design and testing of a device for unobtrusive, long-term ambulatory monitoring of voice use, named the Portable Vocal Accumulator (PVA). The PVA contains a digital signal processor for analyzing input from a neck-placed miniature accelerometer. During its development, accelerometer recordings were obtained from 99 participants with normal or dysphonic voices. The recordings were used to (a) test the specifications and capabilities of the PVA for monitoring normal and dysphonic voices and (b) explore potentially useful displays for the large quantity of data generated by long-term monitoring. The current prototype PVA is pocket-sized (12 × 8.5 × 2 cm), lightweight (200 g), and capable of sampling 11 hr of voice-use data, including estimates of fundamental frequency, sound pressure level, and phonation duration.

18.
INTRODUCTION: A multidimensional protocol has been established by the ELS in order to reach better agreement and standardisation for functional assessment of pathologic voices. In order to evaluate the validity, practicability and applicability of this protocol, the experiences of 6 European voice centres have been analysed in a retrospective study. MATERIAL AND METHODS: The ELS protocol comprises 5 dimensions: perceptual voice evaluation, videostroboscopy, acoustics, aerodynamics and subjective rating by the patient. Results obtained in 94 patients with benign voice disorders were evaluated retrospectively in a multicenter study. RESULTS: According to our results, the validity, practicability and applicability of the ELS protocol were largely satisfactory. This was true for all "common" voice disorders, but not for extreme voice alterations (e.g. spasmodic dysphonia, aphonia, substitution voices). The 5 dimensions proved not to be redundant and were able to selectively differentiate pre- to post-treatment changes among various etiologies of voice disorders, various types of treatment and genders.

19.
People asked to tell apart visually the two individuals of a monozygotic twin (MT) pair mostly have difficulty. Does this problem also exist when listening to twin voices? Twenty female and 10 male MT voices were randomly assembled with one "strange" voice to get voice trios. The listeners (10 female students in Speech and Language Pathology) were asked to label the twins (voices 1-2, 1-3 or 2-3) in two conditions: two standard sentences read aloud and a 2.5-second midsection of a sustained /a/. The proportion of correctly labelled twins was 82% (sentences) and 63% (sustained /a/) for female voices, and 74% (sentences) and 52% (sustained /a/) for male voices, all being significantly greater than chance (33%). The acoustic analysis revealed a high intra-twin correlation for the speaking fundamental frequency (SFF) of the sentences and the fundamental frequency (F0) of the sustained /a/. So the voice pitch could have been a useful characteristic in the perceptual identification of the twins. We conclude that there is a greater perceptual resemblance between the voices of identical twins than between voices without genetic relationship. The identification, however, is not perfect. The voice pitch possibly contributes to the correct twin identifications.
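The 33% chance level quoted in this abstract follows from the task design: a trio of voices contains exactly three possible pairs, only one of which is the twin pair. A minimal Python sketch of that reasoning is given below. The variable names are hypothetical, and the observed proportions are simply the four percentages reported above.

```python
from itertools import combinations

# The three voices in one trio, labelled as in the abstract
voices = (1, 2, 3)

# Every pair a listener could nominate as "the twins"
candidate_pairs = list(combinations(voices, 2))

# A listener guessing at random picks the right pair once in three
chance_level = 1 / len(candidate_pairs)

# Proportions of correct labelling reported in the abstract
observed = {
    ("female", "sentences"): 0.82,
    ("female", "sustained /a/"): 0.63,
    ("male", "sentences"): 0.74,
    ("male", "sustained /a/"): 0.52,
}
above_chance = {key: p > chance_level for key, p in observed.items()}
```

This only reproduces the chance baseline. It does not reproduce the significance test, which would need the number of trials and is not given in the abstract.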

20.
Twenty speech-language pathologists judged the adequacy of oral diadochokinetic performances by ten normal young adult speakers, ten normal geriatric speakers, and four dysarthric speakers (foils) for the purpose of investigating age-related changes in speech. Listeners rated each speaker according to 11 perceptual dimensions. Significant differences in ratings were found among the three subject groups for 10 of the 11 perceptual dimensions. The performances of elderly normal adult speakers were rated farther from the "normal" endpoint of a seven-point continuum than those of the young normal adults. The listeners also reported lesser degrees of confidence in their ratings of the geriatric speakers in comparison with both young adult and dysarthric groups. Perceptual characteristics associated with oral diadochokinetic performance appear to be altered with advanced age. Further analysis of clinicians' judgments suggests support for Ryan and Burk's (1974) proposal that the speech of aged adults may fall at the "mild end of a dysarthric continuum." Results emphasized the need for development of clinical standards of speech normality for the geriatric population.
