Similar literature
1.
Many studies have investigated the psychometric properties of the Braden scale, a scale that predicts the risk for pressure ulcers. These studies focus mainly on validity rather than reliability. A literature review revealed that numerous statistical approaches and coefficients have been used to estimate the degree of interrater reliability (Pearson's product-moment correlation, Cohen's kappa, overall percentage of agreement, intraclass correlation). These coefficients were calculated for the individual items and the overall Braden score and were used inconsistently. The advantages and limitations of each coefficient are discussed, and it is concluded that most of them are inappropriate measures. Consequently, the degree of interrater reliability of the Braden scale can be estimated only to a limited extent. It is shown that the intraclass correlation coefficient is an appropriate statistical approach for calculating the interrater reliability of the Braden scale. It is recommended to present intraclass correlation coefficients in combination with the overall percentage of agreement.
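Two of the coefficients named in this abstract, the overall percentage of agreement and Cohen's kappa, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the ratings are hypothetical and are not taken from any of the reviewed studies.

```python
from collections import Counter


def percent_agreement(r1, r2):
    """Share of subjects on which both raters chose the same category."""
    if len(r1) != len(r2) or not r1:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)


def cohen_kappa(r1, r2):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(r1)
    p_o = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Agreement expected by chance from the two raters' marginal frequencies.
    p_e = sum(c1[k] * c2[k] for k in c1) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical item scores (1-4) from two raters; not data from any study.
rater_a = [1, 2, 2, 3, 4, 4, 3, 2, 1, 4]
rater_b = [1, 2, 3, 3, 4, 3, 3, 2, 2, 4]

print(percent_agreement(rater_a, rater_b))
print(cohen_kappa(rater_a, rater_b))
```

Kappa comes out lower than the raw percentage of agreement because it subtracts the agreement expected by chance, which is one reason the two figures are usually reported side by side.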

2.

Background

Adequate risk assessment is essential in pressure ulcer prevention. Assessment scales were designed to support practitioners in identifying persons at pressure ulcer risk. The Braden scale is one of the most extensively studied risk assessment instruments, although the majority of studies focused on validity rather than reliability.

Objectives

The first aim was to measure the interrater reliability of the Braden scale and its individual items. The second aim was to study different statistical approaches regarding interrater reliability estimation.

Design and methods

An interrater reliability study was conducted in two German nursing homes. Residents (n = 152) from 8 units were assessed twice. The raters were trained nurses with a work experience ranging from 0.5 to 30 years. Data were analysed using an overall percentage of agreement, weighted and unweighted kappa and the intraclass correlation coefficient.

Results

Differences between nurses rating the overall Braden score ranged from 0 up to 9 points. Interrater reliability expressed by the intraclass correlation coefficient ranged from 0.73 (95% CI 0.26-0.91) to 0.95 (95% CI 0.87-0.98). Calculated intraclass correlation coefficients for individual items ranged from 0.06 (95% CI −0.31 to 0.48) to 0.97 (95% CI 0.93-0.99) with the lowest values being measured for the items “sensory perception” and “nutrition”. There was no association between work experience and the level of interrater reliability. With two exceptions, simple kappa-values were always lower than weighted kappa-values and intraclass correlation coefficients.

Conclusions

Although the calculated interrater reliability coefficients for the total Braden score were high in some cases, several clinically relevant differences occurred between the nurses. Because interrater reliability was very low for the items “sensory perception” and “nutrition”, it is doubtful whether their assessment contributes to any valid results. Weighted kappa and intraclass correlation coefficients are the most appropriate interrater reliability estimates.
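The gap between simple and weighted kappa reported in this abstract can be illustrated with a small sketch. It assumes ordinal categories and the common disagreement-weight form (|i - j| / (k - 1)) raised to the power 1 (linear) or 2 (quadratic); the function name and the ratings are hypothetical, and the abstract does not state which weighting scheme the authors used.

```python
def weighted_kappa(r1, r2, categories, weights="quadratic"):
    """Kappa as 1 - sum(w * observed) / sum(w * expected), with disagreement weights w."""
    if len(r1) != len(r2) or not r1:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    index = {c: i for i, c in enumerate(categories)}
    n = len(r1)

    # Observed joint proportions for each pair of categories.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(r1, r2):
        observed[index[a]][index[b]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        if weights == "unweighted":
            return float(i != j)  # every disagreement counts fully
        power = {"linear": 1, "quadratic": 2}[weights]
        return (abs(i - j) / (k - 1)) ** power

    num = sum(weight(i, j) * observed[i][j] for i in range(k) for j in range(k))
    den = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if den == 0:
        raise ValueError("kappa is undefined when no disagreement is expected")
    return 1 - num / den


# Hypothetical item scores (1-4) from two nurses; not data from the study.
nurse_a = [1, 2, 2, 3, 4, 4, 3, 2, 1, 4]
nurse_b = [1, 2, 3, 3, 4, 3, 3, 2, 2, 4]

for scheme in ("unweighted", "linear", "quadratic"):
    print(scheme, weighted_kappa(nurse_a, nurse_b, [1, 2, 3, 4], scheme))
```

In this made-up example every disagreement is only one category apart, so the weighted variants penalise it less and come out higher than simple kappa, which is the same direction of difference the abstract describes.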

3.
Haas U  Mayer H  Evers GC 《Pflege》2002,15(4):191-197
This study was conducted to examine the inter-rater reliability of the "Functional Independence Measure" (FIM). The FIM is an assessment to determine the functional independence of patients with disabilities. It consists of eighteen items to assess activities of daily living. The degree of independence is measured on a seven-point ordinal scale. The inter-rater reliability was examined using a convenience sample of 128 assessments with the FIM. Fifteen nurses assessed thirty patients with brain injuries in a centre for rehabilitation. The design was correlational. The degree of agreement between the assessments was calculated using Cohen's Kappa coefficients. The Kappa coefficients of the assessments were between kappa = 0.56 and kappa = 0.78; the median of the Kappa coefficients was kappa = 0.65. This indicates a moderate to high inter-rater reliability of the FIM when used by nurses for the assessment of patients with head injuries.

4.

Background

The application of standardized pressure ulcer risk assessment scales is recommended in clinical practice.

Objectives

The aims of this study were to compare the interrater reliabilities of the Braden and Waterlow scores and subjective pressure ulcer risk assessment and to determine the construct validity of these three assessment approaches.

Design

Observational.

Settings

Two intensive care units of a large University Hospital in Germany.

Participants

21 and 24 patients were assessed by 53 nurses. Patients’ mean age was 69.7 (SD 8.3) and 67.2 (SD 11.3).

Methods

Two interrater reliability studies were conducted. Samples of patients were assessed independently by a sample of three nurses. A 10-cm visual analogue scale was applied to measure subjective pressure ulcer risk rating. Intraclass correlation coefficients (ICC) and standard errors of measurement (SEM) were used to determine interrater reliability and agreement of the item and sum scores. Pearson product moment correlation coefficients (r) were used to indicate the degree and direction of the relationships between the measures.

Results

The interrater reliability for the subjective pressure ulcer risk assessment was ICC(1,1) = 0.51 (95% CI 0.26-0.74) and 0.71 (95% CI 0.53-0.85). Interrater reliability of Braden scale sum scores was ICC(1,1) = 0.72 (95% CI 0.52-0.87) and 0.84 (95% CI 0.72-0.92) and for Waterlow scale sum scores ICC(1,1) = 0.36 (95% CI 0.09-0.63) and 0.51 (95% CI 0.27-0.72). The absolute degree of correlation between the measures ranged from 0.51 to 0.77.

Conclusions

Interrater reliability coefficients indicate a high degree of measurement error inherent in the scores. Compared to subjective risk assessment and the Waterlow scale scores the Braden scale performed best. However, measurement error is too high to draw valid inferences for individuals. Less than 26-59% of variances in scores of one scale were determined by scores of another scale indicating that all three instruments only partly measured the same construct. The use of the Braden-, Waterlow- and Visual Analogue scales for measuring pressure ulcer risk of intensive care unit patients is not recommended.
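The ICC(1,1) and SEM figures in this abstract come from a one-way random-effects analysis of variance. A minimal sketch follows, assuming the usual formula ICC(1,1) = (MSB - MSW) / (MSB + (k - 1) * MSW) and taking the SEM as the square root of the within-subject mean square. The abstract does not state how the SEM was computed, so that choice, the function name and the scores are all assumptions.

```python
from math import sqrt


def icc_1_1(scores):
    """One-way random-effects ICC(1,1) and SEM.

    scores: one row per subject, each row holding k ratings from k raters.
    Returns (icc, sem), with sem = sqrt(within-subject mean square).
    """
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need >= 2 subjects, each with the same k >= 2 ratings")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]

    # Between-subjects and within-subjects mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, subject_means) for x in row
    ) / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_between - ms_within) / denominator, sqrt(ms_within)


# Hypothetical sum scores for 5 patients, each rated by 3 nurses.
ratings = [
    [12, 13, 12],
    [18, 17, 19],
    [9, 10, 9],
    [15, 15, 16],
    [21, 20, 22],
]
icc, sem = icc_1_1(ratings)
print(icc, sem)
```

The SEM is expressed in scale points, which is why it speaks to the conclusion about individual patients: a high ICC can coexist with an SEM large enough to move a single patient across a risk cut-off.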

5.
6.
BACKGROUND: The reliability and validity of pressure ulcer diagnosis and grading are major methodological issues in studies and reports on pressure ulcer frequency. OBJECTIVES: The aim of the study was to estimate the reliability and validity of pressure ulcer diagnosis and grading within the interdisciplinary pressure ulcer project of the University Clinics of Essen, Germany. DESIGN: Fifty images of wounds from the foot/heel region and 50 images of wounds from the buttock/hip region were classified using a 4-grade scale. A gold standard was established by consensus of two senior physicians. SETTINGS: The images were assessed on a PC, independently by each rater. PARTICIPANTS: Five nursing experts and two physicians participated. METHODS: Mean simple Kappa and per cent agreement were calculated to assess reliability and validity. RESULTS: Mean simple Kappa values showed a moderate interrater agreement for grading and a fair interrater agreement for diagnosis. The percentage of agreement was highest for pressure ulcer diagnosis in the buttock/hip region with 90.5% and lowest for pressure ulcer grading in the buttock/hip region with 63.5%. No differences could be found between nurses and physicians. CONCLUSIONS: The differentiation between pressure ulcers and other skin lesions is rather difficult. It is important to assign the lower grade when the available information does not definitely support the higher grade. The level of agreement found was intermediate in the range of published results. A substantial level of agreement should be obtainable through further standardisation and training. Future studies should control for dependency in the assessment situation and dispense with the category "uncertain".

7.
Interrater and test-retest reliability of two pediatric balance tests
The purpose of this study was to examine the interrater and test-retest reliability of a one-leg balance test and a tiltboard balance test. Twenty-four normally developing children aged 4 through 9 years participated in the study. Time and quality of balance on one leg and degrees of tilt on a tiltboard prior to postural adjustment were measured. Both tests were completed with eyes open and with eyes closed. Interrater reliability was examined using two raters. Test-retest reliability, with a one-week interval between test and retest, was examined for a subgroup consisting of 12 children. Spearman rank-order correlation coefficients were used as indexes of both interrater and test-retest reliability for time and degrees of tilt. To supplement the correlation coefficients, the magnitudes of difference between raters' scores and between test and retest scores were calculated. Spearman coefficients were moderate to high for one-leg balance when scores for both feet were combined for both eyes-open and eyes-closed conditions. The magnitude of difference between scores was low, indicating good agreement between raters and across time. Interrater and test-retest reliabilities of quality of one-leg standing balance were examined by calculating percentages of agreement and Cohen's Kappa statistics. Results of these analyses revealed the need for further study. The Spearman coefficients for the interrater tiltboard test were high; however, the test-retest coefficients were low. The magnitudes of difference between scores were small for the two raters, but large for test and retest. These results are important to consider when using these tests for initial evaluation or for monitoring patient progress.

8.
Reliability of the dynamic gait index in people with vestibular disorders
OBJECTIVE: To examine the interrater reliability of the Dynamic Gait Index (DGI) when used with patients with vestibular disorders and with previously published instructions. DESIGN: Correlational study. SETTING: Outpatient physical therapy clinic. PARTICIPANTS: Subjects included 30 patients (age range, 27-88y) with vestibular disorders, who were referred for vestibular rehabilitation. INTERVENTIONS: Subjects' performance on the DGI was concurrently rated by 2 physical therapists experienced in vestibular rehabilitation to determine interrater reliability. MAIN OUTCOME MEASURES: Percentage agreement, kappa statistics, and the ratio of subject variability to total variability were calculated for individual DGI items. Kappa statistics for individual items were averaged to yield a composite kappa score of the DGI. Total DGI scores were evaluated for interrater reliability by using the Spearman rank-order correlation coefficient. RESULTS: Interrater reliability of individual DGI items varied from poor to excellent based on kappa values (kappa range, .35-1.00). Composite kappa values showed good overall interrater reliability (kappa=.64) of total DGI scores. The Spearman rho demonstrated excellent correlation (r=.95) between total DGI scores given concurrently by the 2 raters. CONCLUSION: DGI total scores, administered by using the published instructions, showed moderate interrater reliability with subjects with vestibular disorders. The DGI should be used with caution in this population at this time, because of the lack of strong reliability.

9.

Purpose:

The GEM scale is an objective assessment tool, specifically developed for older adults, to evaluate walking safety using standardized tasks. The purpose of this study was to estimate the interrater and test–retest agreement of the GEM scale.

Method:

Participants (n = 41; ≥ 65 years) were recruited from geriatric units and assessed independently and simultaneously by three raters on two occasions using the GEM scale. Kappa coefficients and percentage agreement were calculated for each item of the scale.

Results:

A majority of walking items (n = 22) showed fair to substantial interrater agreement (κ ≥ 0.25) and substantial to almost perfect test–retest agreement (κ ≥ 0.60). Mean percentage agreement was high for both interrater and test–retest agreement (79% ± 15% and 83% ± 16% respectively). Moreover, detailed analyses demonstrated that the relatively low agreement of some items resulted from changes in the performance of some participants and the low variability of scores. Although some walking items showed less agreement, the final decision regarding the participants’ ability to walk safely resulted in moderate to substantial interrater and test–retest agreement.

Conclusion:

The GEM scale is a new assessment tool that can now be used with estimated interrater and test–retest properties to allow therapists to objectively evaluate walking safety among the elderly.

10.
RATIONALE AND AIMS: 'OTseeker' is an online database of randomized controlled trials (RCTs) and systematic reviews relevant to occupational therapy. RCTs are critically appraised and rated for quality using the 'PEDro' scale. We aimed to investigate the inter-rater reliability of the PEDro scale before and after revising rating guidelines. METHODS: In study 1, five raters scored 100 RCTs using the original PEDro scale guidelines. In study 2, two raters scored 40 different RCTs using revised guidelines. All RCTs were randomly selected from the OTseeker database. Reliability was calculated using Kappa and intraclass correlation coefficients [ICC (model 2,1)]. RESULTS: Inter-rater reliability was 'good to excellent' in the first study (Kappas ≥ 0.53; ICCs ≥ 0.71). After revising the rating guidelines, the reliability levels were equivalent to or higher than those previously obtained (Kappas ≥ 0.53; ICCs ≥ 0.89), except for the item, 'groups similar at baseline', which still had moderate reliability (Kappa = 0.53). In study 2, two PEDro scale items, which had their definitions revised, 'less than 15% dropout' and 'point measures and variability', showed higher reliability. In both studies, the PEDro items with the lowest reliability were 'groups similar at baseline' (Kappas = 0.53), 'less than 15% dropout' (Kappas

11.
The objective of this study is to evaluate the reliability and construct validity of an obstacle course assessment of wheelchair user performance (OCAWUP). Seventeen experienced wheelchair users using three different propulsion methods (two hands, one hand and one foot or motorized wheelchair) were assessed twice on the 10 obstacles of the OCAWUP. To evaluate reliability, time (in seconds) and degree of ease (DE) in overcoming obstacles (four-level scale) were assessed by three occupational therapists. Construct validity was assessed by verifying whether the OCAWUP's global score of ease (GSE) varied with wheelchair propulsion methods and the functional independence measure (FIM). Intraclass correlation coefficients calculated for reliability of time and GSE were up to 0.74 for test-retest reliability and up to 0.97 for interrater reliability. Cohen's kappa coefficients calculated for DE reliabilities varied from 0.09 to 1.0 with degrees of association up to 65%. A significant difference (P

12.
Aims: Among various risk assessment scales for the development of pressure ulcers in long-term care residents that have been published in the last three decades, the Braden scale is among the most tested and applied tools. The sum score of the scale implies that all items are equally important. The aim of this study is to show whether specific items are of greater significance than others and therefore have a higher clinical relevance. Design: Data analysis of six pressure ulcer prevalence studies (2004–2009). Methods: A total of 17 666 residents (response rate 79.6%) in 234 long-term care facilities participated in 6 annual point prevalence studies that were conducted from 2004 to 2009 throughout Germany. To classify the sample with pressure ulcers as the dependent variable and the Braden items as predictor variables, the Chi-square Automatic Interaction Detector (CHAID) for modelling classification trees was used. Results: Pressure ulcer prevalence was 5.4% including pressure ulcer grade 1 and 3.4% for pressure ulcer grades 2–4. CHAID analysis for the classification tree provided the item ‘friction and shear’ as the most important predictor for pressure ulcer prevalence. On the second level, the strongest predictors were ‘nutrition’ and ‘activity’ and on the third level they were ‘moisture’ and ‘mobility’. Residents with problems regarding ‘friction and shear’ and poor nutritional status present with a pressure ulcer prevalence of 18.0% (14.8%), which is 3–4 times higher than average. Conclusion: CHAID analyses have shown that not all items of the Braden scale are equally important. For residents in long-term care facilities in Germany, the existence of ‘friction and shear’ as a potential and especially as a manifest problem has had the strongest association with pressure ulcer prevalence.

13.

Background

The Waterlow scale is one of the pressure ulcer risk assessment scales which are frequently criticised for their low reliability. It is widely used in the United Kingdom, Europe and all over the world.

Objectives

The study objectives were to systematically review and evaluate inter- and intrarater reliability and/or agreement of the whole Waterlow scale and its single items. The overall aim was to find out if the Waterlow scale is applicable to daily clinical practice.

Design

Systematic review.

Data sources

MEDLINE (1985-June 2008), EMBASE (1985-June 2008), CINAHL (1985-June 2008) and World Wide Web.

Review methods

Selections of relevant studies, data extractions, recalculations of reliability and agreement coefficients, and study quality assessments were independently conducted by two researchers. Designs, methods and results of relevant studies were systematically described, compared and interpreted.

Results

Eight research reports were identified containing the results of nine inter- and intrarater reliability and agreement studies. Only three studies were considered as high quality studies. The Waterlow scale in clinical practice was examined in four studies. Interrater agreement for the total score varied between 0% and 57%. Taking into account any differences of up to two points the total score agreement increased to up to 86%. Median ranges of differences among raters scoring single items were high for ‘poor nutrition’, ‘skin type’, and ‘mobility’. Recalculated intrarater reliability for one researcher was ICC(2, 1) = 0.97 (95% C.I. 0.94-0.98).

Conclusions

Empirical evidence is rare regarding reliability and agreement among nurses when using the Waterlow scale in clinical practice. Interrater agreement for the total score is comparable to other pressure ulcer risk assessment scales. The interrater reliability has never been examined. Therefore, evaluation of reliability and agreement and evaluation of the applicability of the Waterlow scale to clinical practice are limited. It is very likely that the items ‘poor nutrition’, ‘mobility’, and ‘skin type’ are the most difficult items to rate.

14.
Boes C 《Pflege》2000,13(6):397-402
Various risk assessment tools have been developed, mainly in the USA and Great Britain, for more accurate and objective pressure sore risk assessment. The Braden Scale for Predicting Pressure Sore Risk is one such example. By means of a literature analysis of German and English texts referring to the Braden Scale, the scientific quality criteria of reliability and validity are traced and the consequences for application of the scale in Germany are demonstrated. Analysis of 4 reliability studies shows an exclusive focus on interrater reliability. Further, even though the 19 validity studies were conducted in many different settings, their examination is limited to the criteria sensitivity and specificity (accuracy). The range of sensitivity and specificity level is 35-100%. The recommended cut-off points range from 10 to 19 points. The studies prove to be not comparable with each other. Furthermore, distortions that affect the accuracy of the scale can be found in these studies. The results of the analysis presented here show insufficient proof of reliability and validity in the American studies. In Germany, the Braden scale has not yet been tested under scientific criteria. Such testing is needed before using the scale in different German settings. During the course of such testing, the construction and study procedures of the American studies can be used as a basis, as can the problems identified in the analysis presented here.

15.

Background

The reproducibility of the Canadian Triage & Acuity Scale (CTAS), designed and introduced in the late 1990s in all Canadian emergency departments (EDs), has been studied mostly using measures of interrater agreement. However, each of these studies shares a common limitation: the nurses had received fresh CTAS training, which is likely to have led to an overestimation of the reproducibility of CTAS.

Objectives

This study aims to assess the interrater reliability of the CTAS in current clinical practice, that is, as used by experienced ED nurses without recent certification or recertification.

Methods

A prospective sample of 100 patients arriving by ambulance was identified and yielded a set of 100 written scenarios. Five experienced ED nurses reviewed and blindly assigned a CTAS score to each scenario. The agreement among nurses was measured using the Kappa statistic calculated with quadratic weights. Kappa values were generated for each pair of nurses and a global Kappa coefficient was calculated to measure overall agreement.

Results

Overall interrater agreement was moderate, with a global Kappa of 0.44 (95% confidence interval 0.40–0.48). However, pairwise, Kappa values were heterogeneous (0.30 to 0.61, p = 0.0013).

Conclusions

The moderate interrater agreement observed in this study is disappointingly low and suggests that CTAS reliability may be lower than expected, and this warrants further research. Intra-observer reliability of CTAS should be ascertained more extensively among experienced nurses, and a future evaluation should involve several institutions.

16.
BACKGROUND AND PURPOSE: Functional mobility in people with advanced Parkinson disease, some of whom have a variable response to drug treatment, is often difficult to evaluate. The objectives of this study were to investigate the interrater reliability of measurements obtained with a scale designed to measure mobility and to determine the impact of self-rated dyskinesias and fluctuations on the measure. SUBJECTS: Twenty-nine people with Parkinson disease and with disability and considerable disease duration (mean=11.7 years, SD=4.9, range=6-22) took part in the study. METHODS: The subjects' performance on a 10-item scale was videotaped. The videotapes were then scored by 2 independent raters, and the scores were used to determine interrater reliability. The stability of 6 repeated measurements was examined in the home situation, taking into account self-rated fluctuations of motor performance. RESULTS: Weighted Kappa values of agreement (.86-.98) confirmed the reliability between testers. Measurement during the "on" phase (when medication was working optimally) and the "off" phase (when the action of medication was strongly decreased or absent) led to different measurements. Measuring frequently within "on" and "off" phases gave relatively stable measurements for total function, bed transfers, and gait akinesia, the latter during the "off" phase only (intraclass correlation coefficients [ICCs]=.70-.93). However, more modest repeatability applied to transfers from a chair (ICC=.65-.67). CONCLUSION AND DISCUSSION: To ensure valid results in future effect studies, clinical differentiation between "on" and "off" phase measurements is proposed on the basis of patients' own perception of their medication status.

17.
BACKGROUND AND PURPOSE: Assessment of the quality of randomized controlled trials (RCTs) is common practice in systematic reviews. However, the reliability of data obtained with most quality assessment scales has not been established. This report describes 2 studies designed to investigate the reliability of data obtained with the Physiotherapy Evidence Database (PEDro) scale developed to rate the quality of RCTs evaluating physical therapist interventions. METHOD: In the first study, 11 raters independently rated 25 RCTs randomly selected from the PEDro database. In the second study, 2 raters rated 120 RCTs randomly selected from the PEDro database, and disagreements were resolved by a third rater; this generated a set of individual rater and consensus ratings. The process was repeated by independent raters to create a second set of individual and consensus ratings. Reliability of ratings of PEDro scale items was calculated using multirater kappas, and reliability of the total (summed) score was calculated using intraclass correlation coefficients (ICC [1,1]). RESULTS: The kappa value for each of the 11 items ranged from .36 to .80 for individual assessors and from .50 to .79 for consensus ratings generated by groups of 2 or 3 raters. The ICC for the total score was .56 (95% confidence interval=.47-.65) for ratings by individuals, and the ICC for consensus ratings was .68 (95% confidence interval=.57-.76). DISCUSSION AND CONCLUSION: The reliability of ratings of PEDro scale items varied from "fair" to "substantial," and the reliability of the total PEDro score was "fair" to "good."

18.
The primary purposes of this article are to review the literature on seating assessment and to describe the development of a clinical evaluation scale, the Seated Postural Control Measure (SPCM), for use with children requiring adaptive seating systems. The SPCM is an observational scale of 22 seated postural alignment items and 12 functional movement items, each scored on a four-point, criterion-referenced scale. A secondary purpose of this article is to report the reliability of the seven-point Level of Sitting Scale (LSS). Interrater and test-retest reliability of the SPCM items and the one-item LSS were evaluated on a sample of 40 children with developmental disabilities who sat with and without their seating systems. Kappa values of .75 or higher were considered excellent, .40 to .74 as fair to good, and less than .40 as poor. The interrater reliability tests for the two seated conditions and the two test sessions conducted 3 weeks apart yielded overall item Kappa coefficient means of .45 for the alignment section and .85 for the function section. Test-retest results for the SPCM items were less satisfactory, with item Kappa coefficient means for the two seating conditions and raters of .35 and .29 for alignment and function, respectively. Reliability results did not appear to be consistently better among seating conditions, raters, or test sessions. Kappa coefficients for the LSS were fair to good for both interrater and test-retest reliability. Plans for future development of the SPCM and LSS are discussed.

19.
The object of this study was to assess interobserver reliability in 23 tests concerning physical examination of the shoulder girdle. A physical therapist and a physical therapist/manual therapist independently performed a physical examination of the shoulder girdle in 91 patients with shoulder complaints of varying severity and duration. The observers assessed 23 items in total: active and passive abductions, passive external rotation, hand in neck (HIN) test, hand in back (HIB) test, impingement test according to Neer, springing test of the first rib and joint play test of the acromioclavicular joint. The interobserver reliability was evaluated by means of Cohen's Kappa, the weighted Kappa and the intraclass correlation (ICC). Criteria for acceptable reliability were: Kappa value ≥ 0.60, ICC ≥ 0.75 or an absolute agreement ≥ 80%. The results showed that Kappa values varied from 0.09 (springing test first rib, stiffness) to 0.66 (springing test first rib, pain), weighted Kappa varied from 0.35 (pain during HIB) to 0.73 (range of motion HIB) and ICC varied from 0.54 (abduction passive starting point painful arc) to 0.96 (active and passive ranges of motion in abduction). In total 11 (48%) items fulfilled the criteria of acceptable reliability. In conclusion, there appears to be a great deal of variation in the reliability of the tests used in the physical examination of the shoulder girdle. Over 50% of the tests did not meet the statistical criteria for acceptable reliability.

20.
The purposes of this study were 1) to describe a clinical scale of rigidity and testing procedure for use in patients with Parkinson's disease and 2) to examine the scale's interrater reliability. Twenty subjects (3 women, 17 men; mean age = 64 years, s = 16.3) participated in the study. Criteria for participation were 1) diagnosis of Parkinson's disease, 2) physician-documented rigidity, 3) ability to follow one-step verbal directions, and 4) ability to attain at least 75% of the standard passive-range-of-motion measurements of the elbow, forearm, and wrist of the tested upper extremity. Each of two raters used a standardized set of instructions and test procedures. The degree of rigidity was assessed using a four-point scale ranging from 0 (absent) to 3 (severe). The observed agreement between raters was 16 out of 20 trials. A Cohen's weighted Kappa was used to analyze the data (Kw = .636, p = .20). Factors were identified that may have contributed to the discrepancy between agreement and the agreement beyond chance.
