Similar Articles
20 similar articles found.
1.
This booklet aims to provide relevant background information and guidelines for medical school teachers in clinical departments charged with assessing the clinical competence of undergraduate students. It starts by emphasizing the difference between clinical competence and clinical performance. An approach to defining what should be assessed is outlined. The technical considerations of validity, reliability and practicability are discussed with reference to the ward- or practice-based setting and to the examination setting. The various methods available to assess aspects of competence are described and their strengths and weaknesses reviewed. The paper concludes with a discussion of the important issues of scoring and standard setting. The conclusion is reached that the quality of many current assessments could be improved. To do so will require a multi-format approach using both the practice and examination settings. Some of the traditional methods will have to be abandoned or modified and new methods introduced.

3.
This paper describes a situation where an alteration in the final-year assessment scheme led to changes in student learning activities which were the exact opposite of those intended. Students were seen to be spending a disproportionate amount of time studying the theoretical components of the course relative to the practical and clinical aspects. The paramount importance of the assessments and examinations in influencing student learning behaviour led the departments concerned to develop a new clinical examination which more clearly reflected the objectives of the course.
A questionnaire survey was undertaken to determine how the different sections of the final assessment affected the students' approach to studying. The questionnaire was administered to graduates during their intern year for the 3 years following the introduction of the new clinical examination. Results were also obtained for the year preceding the change. The survey showed that the students developed a high regard for the new examination and its validity as a test of clinical competence. The students found that an increase in ward-based learning activities was essential for success in the final examinations. The new clinical examination has thus influenced students' learning and successfully restored the balance of their learning activities between the clinical and theoretical components of the course.

4.
The assessment of clinical competence has traditionally been carried out through standard evaluations such as multiple-choice question and bedside oral examinations. The attributes which constitute clinical competence are multidimensional, and we have modified the objective structured clinical examination (OSCE) to measure these various competencies. We have evaluated the validity and reliability of the OSCE in a paediatric clinical clerkship. We divided the examination into the four components of competence (clinical skills, problem-solving, knowledge, and patient management) and evaluated the performance of 77 fourth-year medical students. The skill and content domains of the OSCE were carefully defined, agreed upon, sampled and reproduced. This qualitative evaluation of the examination was both adequate and appropriate. We achieved both acceptable interstation and intertask reliability. When correlated with concurrent methods of evaluation, we found the OSCE to be an accurate measure of paediatric knowledge and patient management skills. The OSCE did not correlate, however, with traditional measures of clinical skills including history-taking and physical examination. Our OSCE, as outlined, offers an objective means of identifying weaknesses and strengths in specific areas of clinical competence and is therefore an important addition to the traditional tools of evaluation.

6.
OBJECTIVES: (i) To design a new, quick and efficient method of assessing specific cognitive aspects of trainee clinical communication skills, to be known as the Objective Structured Video Exam (OSVE) (Study 1); (ii) to prepare a scoring scheme for markers (Study 2); and (iii) to determine reliability and evidence for validity of the OSVE (Study 3). METHODS: Study 1 describes how the exam was designed. The OSVE assesses the student's recognition and understanding of the consequences of various communication skills. In addition, the assessment taps the number of alternative skills that the student believes will be of assistance in improving the patient-doctor interaction. Study 2 outlines the scoring system that is based on a range of 50 marks. Study 3 reports inter-rater consistency and presents evidence to support the validity of the new assessment by associating the marks from 607 1st year undergraduate medical students with their performance ratings in a communication skills OSCE. SETTING: Medical school, The University of Liverpool. RESULTS: Preparation of a scoring scheme for the OSVE produced consistent marking. The reliability of the marking scheme was high (ICC=0.94). Evidence for the construct validity of the OSVE was found when a moderate predicted relationship of the OSVE to interviewing behaviour in the communication skills OSCE was shown (r=0.17, P < 0.001). CONCLUSION: A new video-based written examination (the OSVE) that is efficient and quick to administer was shown to be reliable and to demonstrate some evidence for validity.
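
The marking consistency above is summarised by an intraclass correlation. As a minimal illustration of how such a coefficient is computed from a students-by-markers score matrix, the sketch below implements the Shrout-Fleiss ICC(2,1) (two-way random effects, absolute agreement, single rater); the data are hypothetical, and the paper does not state which ICC variant was used.

```python
import numpy as np

def icc_2_1(scores):
    """Shrout-Fleiss ICC(2,1): two-way random effects, absolute agreement,
    single rater. `scores` is an (n subjects x k raters) array."""
    n, k = scores.shape
    grand = scores.mean()
    ms_rows = k * np.sum((scores.mean(axis=1) - grand) ** 2) / (n - 1)
    ms_cols = n * np.sum((scores.mean(axis=0) - grand) ** 2) / (k - 1)
    ss_total = np.sum((scores - grand) ** 2)
    ms_error = (ss_total - ms_rows * (n - 1)
                - ms_cols * (k - 1)) / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n)

# Hypothetical OSVE scripts: 6 students, 3 markers, 50-mark scale
ratings = np.array([[42, 44, 43],
                    [30, 28, 31],
                    [37, 38, 36],
                    [25, 27, 26],
                    [46, 45, 47],
                    [33, 32, 34]], dtype=float)
print(f"ICC(2,1) = {icc_2_1(ratings):.2f}")
```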

7.
The objective structured clinical examination (OSCE) is increasingly being used as a method of clinical assessment, yet its measurement characteristics have not been well documented. Evidence is accumulating that many OSCEs may be too short to achieve reliable results. This paper reports detailed psychometric analyses of OSCEs which were administered as part of a well-established final-year examination. Generalizability theory guided investigation of test reliability. At the present test length the OSCE components showed low reliabilities relative to written components. Satisfactory reliabilities could potentially be achieved if test length was increased to approximately 6 hours, a time which would create significant logistic problems for most medical schools. Several strategies for dealing with this practical problem have been explored. Firstly, it was shown that more careful selection of stations based on their psychometric characteristics can significantly improve reliability. Secondly, where rater availability is a limiting factor to increasing test length, more can be gained by using one rater per station and having more stations than using two raters per station. Finally, OSCE scores can, with advantage, be combined with other test scores which are obtained by using less resource-intensive methods. By adopting such strategies, a reliable assessment of clinical competence could be obtained in about 4 hours of testing time which was equally divided between an OSCE constructed of practical and clinical stations and a written test.
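
The trade-off between testing time and reliability reported above can be approximated with the classical Spearman-Brown prophecy formula, the single-facet special case of the generalizability analysis the authors actually used; the starting values below are hypothetical.

```python
def spearman_brown(r, k):
    """Predicted reliability when a test is lengthened by a factor k."""
    return k * r / (1 + (k - 1) * r)

def length_factor(r, target):
    """Lengthening factor needed to raise reliability r to a target value."""
    return target * (1 - r) / (r * (1 - target))

# Hypothetical: a 2-hour OSCE with reliability 0.55, target 0.80
r, target, hours = 0.55, 0.80, 2.0
k = length_factor(r, target)
print(f"lengthen {k:.1f}x -> about {hours * k:.1f} hours of testing")
print(f"check: predicted reliability = {spearman_brown(r, k):.2f}")
```

With these hypothetical starting values the projection lands near 6.5 hours, in the region of the 6-hour figure reported above.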

8.
BACKGROUND: Clinical examinations increasingly consist of composite tests to assess all aspects of the curriculum recommended by the General Medical Council. SETTING: A final undergraduate medical school examination for 214 students. AIM: To estimate the overall reliability of a composite examination, the correlations between the tests, and the effect of differences in test length, number of items and weighting of the results on the reliability. METHOD: The examination consisted of four written and two clinical tests: multiple-choice questions (MCQ) test, extended matching questions (EMQ), short-answer questions (SAQ), essays, an objective structured clinical examination (OSCE) and history-taking long cases. Multivariate generalizability theory was used to estimate the composite reliability of the examination and the effects of item weighting and test length. RESULTS: The composite reliability of the examination was 0.77, if all tests contributed equally. Correlations between examination components varied, suggesting that different theoretically interpretable parameters of competence were being tested. Weighting tests according to items per test or total test time gave improved reliabilities of 0.93 and 0.81, respectively. Double weighting of the clinical component marginally affected the reliability (0.76). CONCLUSION: This composite final examination achieved an overall reliability sufficient for high-stakes decisions on student clinical competence. However, examination structure must be carefully planned and results combined with caution. Weighting according to number of items or test length significantly affected reliability. The components testing different aspects of knowledge and clinical skills must be carefully balanced to ensure both content validity and parity between items and test length.
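
A rough classical approximation of the weighting effect described above (the study itself used multivariate generalizability theory) treats composite reliability as one minus the ratio of weighted error variance to composite variance, assuming errors are uncorrelated across components; every number below is hypothetical.

```python
import numpy as np

def composite_reliability(weights, sds, reliabilities, corr):
    """Classical reliability of a weighted composite, assuming errors are
    uncorrelated across components (a simplification; the study itself
    used multivariate generalizability theory)."""
    w, s, rel = (np.asarray(x, dtype=float)
                 for x in (weights, sds, reliabilities))
    cov = corr * np.outer(s, s)                   # observed covariance matrix
    error_var = np.sum(w ** 2 * s ** 2 * (1 - rel))
    return 1 - error_var / (w @ cov @ w)

# Hypothetical components: MCQ, EMQ, SAQ, essays, OSCE, long cases
rel = [0.90, 0.85, 0.75, 0.65, 0.70, 0.55]       # component reliabilities
sds = [1.0] * 6                                  # component score SDs
corr = np.full((6, 6), 0.45)                     # inter-test correlations
np.fill_diagonal(corr, 1.0)

equal = [1 / 6] * 6
by_items = [0.30, 0.25, 0.15, 0.05, 0.15, 0.10]  # weight by item count
print(f"equal weights: {composite_reliability(equal, sds, rel, corr):.2f}")
print(f"item weights:  {composite_reliability(by_items, sds, rel, corr):.2f}")
```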

10.
The predictive validity of 'traditional' tools utilized in the selection of medical students was evaluated in a 'non-traditional' selection paradigm, where a wide range of previous academic ability was represented. The validity of using prior academic grades and examination scores to predict success in clinical performance was examined in a medical school which de-emphasizes these indicators and emphasizes personal characteristics, assessed via interview ratings, in student selection. Grades and examination scores were found to have no relation to clinical ratings, which carry an added interpersonal and community emphasis, during the fourth to sixth years of medical school. A positive trend was found for interview ratings with clinical performance, but the skewed nature of interview scores was seen as limiting investigation of this variable. The meaning of these results vis-à-vis the continued use of academic- and examination-related selection criteria was discussed.

11.
Patient management problems (PMP) are being used in medical examinations with increasing frequency despite evidence which throws doubt on their validity as measures of clinical competence. This study investigated the construct validity of a PMP constructed in both written and interview formats. Each test was administered to groups of students at different levels of seniority and to two groups of doctors: interns and post-interns. The pattern of scores for the different groups was not that expected of a valid test of competence. The most competent groups (the post-interns) generally scored less well on the calculated indices than the senior students and interns. These findings were similar for both formats of the test, so cueing was not thought to be the major factor. It appears that the scoring system is at fault.
A comparison of performance on the written and interview (uncued) formats showed that many more options were chosen by all groups tested on the written PMP.
It was concluded that written PMPs cannot yet be regarded as a valid simulation of clinical performance. Although content validity is high, this does not appear to be so for construct or concurrent validity.
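
The construct-validity logic above is a known-groups comparison: scores should rise monotonically with clinical seniority. A hedged sketch of that check follows, with hypothetical scores mirroring the reported pattern (post-interns scoring below senior students and interns).

```python
import numpy as np
from scipy import stats

# Hypothetical PMP scores for groups of increasing clinical seniority,
# mirroring the reported pattern (post-interns below students/interns)
groups = {
    "4th-year students":   [52, 48, 55, 50, 47],
    "final-year students": [58, 61, 57, 60, 59],
    "interns":             [62, 60, 64, 61, 63],
    "post-interns":        [54, 51, 56, 53, 52],
}

means = [np.mean(v) for v in groups.values()]
rho, p = stats.spearmanr(range(len(groups)), means)
for name, m in zip(groups, means):
    print(f"{name}: mean score = {m:.1f}")
print(f"seniority vs score: Spearman rho = {rho:.2f} (P = {p:.2f})")
```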

12.
CONTEXT: The College of Medicine and Medical Sciences at the Arabian Gulf University, Bahrain, replaced the traditional long case/short case clinical examination on the final MD examination with a direct observation clinical encounter examination (DOCEE). Each student encountered four real patients. Two pairs of examiners from different disciplines observed the students taking history and conducting physical examinations and jointly assessed their clinical competence. OBJECTIVES: To determine the reliability and validity of the DOCEE by investigating whether examiners agree when scoring, ranking and classifying students; to determine the number of cases and examiners necessary to produce a reliable examination, and to establish whether the examination has content and concurrent validity. SUBJECTS: Fifty-six final year medical students and 22 examiners (in pairs) participated in the DOCEE in 2001. METHODS: Generalisability theory, intraclass correlation, Pearson correlation and kappa were used to study reliability and agreement between the examiners. Case content and Pearson correlation between DOCEE and other examination components were used to study validity. RESULTS: Cronbach's alpha for DOCEE was 0.85. The intraclass and Pearson correlation of scores given by specialists and non-specialists ranged from 0.82 to 0.93. Kappa scores ranged from 0.56 to 1.00. The overall intraclass correlation of students' scores was 0.86. The generalisability coefficient with four cases and two raters was 0.84. Decision studies showed that increasing the cases from one to four improved reliability to above 0.8. However, increasing the number of raters had little impact on reliability. The use of a pre-examination blueprint for selecting the cases improved the content validity. The disattenuated Pearson correlations between DOCEE and other performance measures as a measure of concurrent validity ranged from 0.67 to 0.79. CONCLUSIONS: The DOCEE was shown to have good reliability and interrater agreement between two independent specialist and non-specialist examiners on the scoring, ranking and pass/fail classification of student performance. It has adequate content and concurrent validity and provides unique information about students' clinical competence.
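
The decision study mentioned above projects the generalisability coefficient for alternative numbers of cases and raters from estimated variance components. The sketch below shows that projection for a person x case x rater design; the variance components are hypothetical, as the abstract does not report them.

```python
def g_coefficient(var_p, var_pc, var_pr, var_res, n_cases, n_raters):
    """Generalisability coefficient for a person x case x rater design,
    projected to n_cases cases and n_raters raters (a decision study)."""
    error = (var_pc / n_cases + var_pr / n_raters
             + var_res / (n_cases * n_raters))
    return var_p / (var_p + error)

# Hypothetical variance components: person, person x case, person x rater,
# and residual (the abstract does not report the real ones)
vp, vpc, vpr, vres = 1.0, 0.6, 0.02, 0.3
for n_cases in (1, 2, 4):
    for n_raters in (1, 2):
        g = g_coefficient(vp, vpc, vpr, vres, n_cases, n_raters)
        print(f"cases={n_cases} raters={n_raters}: G = {g:.2f}")
```

With these made-up components the projection reproduces the qualitative finding: going from one to four cases moves G from about 0.5 to above 0.8, while a second rater adds little.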

13.
OBJECTIVE: To describe the weaknesses of the current psychometric approach to assessment as a scientific model. DISCUSSION: The current psychometric model has played a major role in improving the quality of assessment of medical competence. It is becoming increasingly difficult, however, to apply this model to modern assessment methods. The central assumption in the current model is that medical competence can be subdivided into separate, measurable, stable and generic traits. This assumption has several far-reaching implications. Perhaps the most important is that it requires a numerical and reductionist approach, and that aspects such as fairness, defensibility and credibility are by necessity mainly translated into reliability and construct validity. These approaches are more and more difficult to align with modern assessment approaches such as mini-CEX, 360-degree feedback and portfolios. This paper describes some of the weaknesses of the psychometric model and aims to open a discussion on a conceptually different statistical approach to quality of assessment. FUTURE DIRECTIONS: We hope that the discussion opened by this paper will lead to the development of a conceptually different statistical approach to quality of assessment. A probabilistic or Bayesian approach would be worth exploring.
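
As one concrete, deliberately simple illustration of the probabilistic direction the authors propose (this sketch is not taken from the paper): a beta-binomial model treats a student's station outcomes as evidence about an underlying competence rate and reports the posterior probability that it exceeds a cut-off.

```python
from scipy import stats

# Hypothetical: prior Beta(2, 2) over a student's underlying station
# success rate; the student then passes 14 of 18 observed stations
posterior = stats.beta(2 + 14, 2 + 4)

cut = 0.6  # hypothetical defensible standard for the underlying rate
print(f"posterior mean competence: {posterior.mean():.2f}")
print(f"P(competence > {cut}): {posterior.sf(cut):.2f}")
```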

14.
CONTEXT: Contemporary studies have shown that traditional medical school admissions interviews have strong face validity but provide evidence for only low reliability and validity. As a result, they do not provide a standardised, defensible and fair process for all applicants. METHODS: In 2006, applicants to the University of Calgary Medical School were interviewed using the multiple mini-interview (MMI). This interview process consisted of nine 8-minute stations where applicants were presented with scenarios they were then asked to discuss. This was followed by a single 8-minute station that allowed the applicant to discuss why he or she should be admitted to our medical school. Sociodemographic and station assessment data provided for each applicant were analysed to determine whether the MMI was a valid and reliable assessment of the non-cognitive attributes, distinguished between the non-cognitive attributes, and discriminated between those accepted and those placed on the waitlist. We also assessed whether applicant sociodemographic characteristics were associated with acceptance or waitlist status. RESULTS: Cronbach's alpha for each station ranged from 0.97 to 0.98. Low correlations between stations and the factor analysis suggest each station assessed different attributes. There were significant differences in scores between those accepted and those on the waitlist. Sociodemographic differences were not associated with acceptance or waitlist status. DISCUSSION: The MMI is able to assess different non-cognitive attributes and our study provides additional evidence for its reliability and validity. The MMI offers a fairer and more defensible assessment of applicants to medical school than the traditional interview.
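
Cronbach's alpha, reported per station above, can be computed from an examinees-by-items score matrix as follows; the station data, and the assumption that each station yields several rating-scale items, are hypothetical.

```python
import numpy as np

def cronbach_alpha(scores):
    """`scores`: (n examinees x k items) array of item-level scores."""
    n, k = scores.shape
    item_vars = scores.var(axis=0, ddof=1).sum()  # sum of item variances
    total_var = scores.sum(axis=1).var(ddof=1)    # variance of total scores
    return k / (k - 1) * (1 - item_vars / total_var)

# Hypothetical: 5 applicants scored on 4 rating items at one MMI station
station = np.array([[7, 8, 7, 8],
                    [4, 3, 4, 4],
                    [6, 6, 5, 6],
                    [9, 9, 8, 9],
                    [5, 4, 5, 5]], dtype=float)
print(f"alpha = {cronbach_alpha(station):.2f}")
```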

15.
A questionnaire survey was conducted on the nature of the oral examinations in different disciplines in the medical schools in Sri Lanka. A total of 352 students from Peradeniya and Jaffna medical faculties and pre-registration house officers, including Colombo faculty graduates of the two teaching hospitals, responded to the questionnaire. The results of the survey, which included twelve disciplines, reveal that the duration of the oral encounter ranged from 10 to 20 minutes. The number of questions asked ranged from five to nine. Detailed analysis of the intellectual level of the questions showed that more than 63% of the questioning was at simple recall level and none at the level of problem-solving. These results show that the oral examination, in addition to its inherent weaknesses of low reliability and objectivity, also lacks validity in terms of content sampling. Its predictive validity of professional competence, which requires problem-solving skills, is questionable. Content analysis of the items also revealed that all the abilities tested in the orals could best be tested in a pen-and-paper examination or a structured practical or clinical examination.

16.
Newble D. Medical Education 2004;38(2):199-203.
The traditional clinical examination has been shown to have serious limitations in terms of its validity and reliability. The OSCE provides some answers to these limitations and has become very popular. Many variants on the original OSCE format now exist and much research has been done on various aspects of their use. Issues to be addressed relate to organizational matters and to the quality of the assessment. This paper focuses particularly on the latter with respect to ways of ensuring content validity and achieving acceptable levels of reliability. A particular concern has been the demonstrable need for long examinations if high levels of reliability are to be achieved. Strategies for reducing the practical difficulties this raises are discussed. Standard setting methods for use with OSCEs are described.
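
One widely used family of standard-setting methods for OSCEs is the (modified) Angoff procedure, in which judges estimate the score a borderline candidate would obtain at each station and the averaged estimates become the pass mark. The abstract describes standard setting only in general terms, so the sketch below, with hypothetical judgements, is just one such method.

```python
import numpy as np

# Hypothetical modified-Angoff judgements: 4 judges estimate the % score a
# borderline candidate would obtain on each of 5 OSCE stations
judgements = np.array([[55, 60, 50, 65, 58],
                       [60, 62, 55, 60, 55],
                       [50, 58, 52, 62, 60],
                       [58, 61, 50, 64, 57]], dtype=float)

station_cuts = judgements.mean(axis=0)  # per-station pass marks
print("station cut scores:", np.round(station_cuts, 1))
print(f"examination cut score: {station_cuts.mean():.1f}%")
```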

17.
Performance tests are logistically complex and time-consuming. To reach adequate reliability, long tests are imperative. They are also very difficult to adapt to the individual learning paths of students, an adaptation that problem-based learning requires. This study investigates a written alternative to performance-based tests. A Knowledge Test of Skills (KTS) was developed and administered to 380 subjects of various educational levels, including both first-year students and recently graduated doctors. Comparing KTS scores with scores on performance tests demonstrated strong convergent validity. The KTS failed discriminant validity when compared with a general medical knowledge test. The identification of sub-tests discriminating between behavioural and cognitive aspects was also unsuccessful, owing to the interdependence of the constructs measured. The KTS was able to demonstrate differences in ability level and showed subtle changes in response patterns over items, indicating construct validity. It was concluded that the KTS is a valid instrument for predicting performance scores and could very well be applied as supplementary information to performance testing. Its relative ease of construction and efficiency make the KTS a suitable substitute instrument for research purposes. The study also showed that at higher ability levels the concepts which were meant to be measured were highly related, lending support to the general-factor theory of competence. However, it appeared that this general factor was originally non-existent in first-year students and that these competencies integrate as the educational process develops.
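
The convergent and discriminant comparisons above are usually made on correlations corrected for the unreliability of both measures. Below is a minimal sketch of that correction, with hypothetical values chosen so that the corrected discriminant correlation approaches the convergent one, as it did for the KTS.

```python
from math import sqrt

def disattenuate(r_xy, rel_x, rel_y):
    """Correct an observed correlation for unreliability of both measures."""
    return r_xy / sqrt(rel_x * rel_y)

# Hypothetical observed correlations and reliabilities
r_convergent = 0.55    # KTS vs performance-based test
r_discriminant = 0.60  # KTS vs general medical knowledge test
rel_kts, rel_perf, rel_know = 0.85, 0.60, 0.90

print(f"convergent (corrected):   "
      f"{disattenuate(r_convergent, rel_kts, rel_perf):.2f}")
print(f"discriminant (corrected): "
      f"{disattenuate(r_discriminant, rel_kts, rel_know):.2f}")
```

Discriminant validity fails when, as here, the corrected discriminant correlation is close to the convergent one.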

18.
CONTEXT: Writing is an important skill for practitioners and students, yet it is rarely taught in a formal capacity at medical school. At the University of Adelaide many students are from non-English-speaking backgrounds and have varying proficiencies in English. We wished to devise a method and instrument which could identify students who may benefit from formative feedback and tuition in writing. OBJECTIVES AND METHOD: Students' written account of a short clinical interview with a standardized patient was assessed using a new instrument (the Written Language Rating Scale) designed especially for this study. The assessment of writing was made by one rater with qualifications in teaching English as a second language. SUBJECTS: 127 second-year medical students enrolled at the University of Adelaide, Australia. INSTRUMENTS AND RESULTS: The scale appeared to have good internal consistency and face and construct validity, and test security was not an issue. However, it had questionable concurrent validity with a standardized language test, although this may be partly due to the time that had elapsed between administrations of the two tests. CONCLUSIONS: This study was useful in providing a means to objectively rate students' written English language skills and to target students in need of formative feedback and tuition. However, further research is necessary for both evaluation of medical writing and interventions for its improvement.

19.
OBJECTIVE: To design a clinical examination of high content validity suitable for use as a formative assessment tool with pre-registration house officers (PRHOs) towards the end of their first house officer post. DESIGN: A multicentre collaboration between four UK medical schools who offer undergraduate curricula which are problem-based, systems-based, patient-orientated, student-centred, jargon-laden and utterly staff-bewildering. MAIN OUTCOME: An objective structured clinical examination (OSCE) which is suitable for use with graduates of UK medical schools. It assesses the knowledge, skills and attitudes essential for future careers in a hierarchical system where protecting the senior staff from all forms of irritation is paramount. RESULTS: PRHOs who excel in this examination get better references. CONCLUSION: The OSCE format can be used to provide 'real-life' scenarios appropriate to the season.

20.
The inconsistency of marking in clinical examinations is a well-documented problem. This project identified some of the factors responsible for this inconsistency.
A standardized rating situation was devised. Five students were videotaped as they performed part of a physical examination on simulated patients. Eighteen experienced medical and surgical examiners rated their performances using an objective checklist-type rating form. No differences were evident between physicians and surgeons. The group of examiners was divided into three subgroups: one receiving no training, one limited training and one more extensive training. Examiners re-rated the same students 2 months after the first rating.
Inter-rater reliability was satisfactory for the first ratings and training produced no significant improvement. A substantial improvement was achieved by identifying the most inconsistent raters and removing them from the analysis. Training was shown to be unnecessary for consistent examiners and ineffective for examiners who were less consistent. On the basis of these results, only consistent examiners were selected to take part in the interactive component of the objective structured final year examinations. The ratings in these examinations achieved high levels of inter-rater reliability.
It was concluded that the combination of an objective checklist rating form, a controlled test situation and the selection of inherently consistent examiners could solve the problem of inconsistent marking in clinical examinations.
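
One simple way to operationalise the selection of inherently consistent examiners described above (the paper's exact criterion is not given, so this is a hedged sketch) is to correlate each examiner's two rating occasions and retain only those above a threshold.

```python
import numpy as np

def consistent_raters(first, second, threshold=0.8):
    """`first`, `second`: (n_raters x n_students) checklist scores from the
    two rating occasions. Returns indices of raters whose test-retest
    correlation meets the threshold."""
    return [i for i, (a, b) in enumerate(zip(first, second))
            if np.corrcoef(a, b)[0, 1] >= threshold]

# Hypothetical: 4 examiners rating the same 5 videotaped students twice
first = np.array([[8, 5, 7, 4, 9],
                  [7, 5, 6, 4, 8],
                  [9, 4, 8, 3, 9],
                  [6, 7, 5, 8, 4]], dtype=float)  # last rater is erratic
second = np.array([[8, 4, 7, 5, 9],
                   [7, 6, 6, 4, 9],
                   [9, 5, 8, 4, 9],
                   [8, 4, 7, 5, 6]], dtype=float)
print("retain examiners:", consistent_raters(first, second))
```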

