Similar articles (10 found)
1.
Objective  The purpose of this study was to explore the value of the standard error of measurement (SEM) in making decisions about students with examination scores at or below the pass/fail borderline in a new undergraduate medical course with an integrated assessment programme.
Methods  An analysis of de-identified, pooled data for borderline candidates was conducted to determine the SEM for each examination and the progress of candidates according to four score bands: pass score ± 1 SEM, 1−2 SEM below the pass score, 2−3 SEM below the pass score, and > 3 SEM below the pass score. The impact of poor performance in individual subject areas was also measured.
Results  Data for 1571 candidates were included in the analysis, identifying 132 students with borderline or lower scores, 45% of whom were > 1 SEM below the pass score. By the third cohort, the banding of students according to the SEM became highly predictive of candidate progress, whether through immediate remediation and re-sit examination, repetition of the year, or withdrawal from the course.
Conclusions  The SEM is a useful tool for making confident and defensible decisions about how to manage candidates with examination scores at or below the borderline mark, as long as attention is paid to established examination design principles. The improved defensibility can be used to support a patient-safety focused decision tree or similar decision support model.
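The banding described in this abstract can be illustrated with a minimal Python sketch. It assumes the classical formula SEM = SD × sqrt(1 − reliability), which the abstract does not state explicitly. The function names, the pass score of 50, the SD of 8 and the reliability of 0.84 are hypothetical values chosen for illustration and are not taken from the study.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical SEM: SD * sqrt(1 - reliability). Assumed formula, not quoted from the abstract."""
    return sd * math.sqrt(1.0 - reliability)


def sem_band(score: float, pass_score: float, sem: float) -> str:
    """Place a score into one of the four bands named in the abstract.

    Scores more than 1 SEM above the pass score are labelled "clear pass",
    a label added here for completeness (hypothetical).
    """
    deficit = (pass_score - score) / sem  # how many SEMs below the pass score
    if deficit < -1:
        return "clear pass"
    if deficit <= 1:
        return "pass score +/- 1 SEM"
    if deficit <= 2:
        return "1-2 SEM below"
    if deficit <= 3:
        return "2-3 SEM below"
    return "> 3 SEM below"


# Hypothetical examination statistics, for illustration only.
EXAMPLE_SEM = standard_error_of_measurement(sd=8.0, reliability=0.84)

for example_score in (60, 48, 45, 42, 39):
    print(example_score, sem_band(example_score, pass_score=50, sem=EXAMPLE_SEM))
```

How the boundaries are treated (for example, whether a score exactly 1 SEM below the pass score counts as borderline) is a design choice made here; the abstract does not say how ties were handled.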

2.
Context  We wished to determine which factors are important in ensuring interviewers are able to make reliable and valid decisions about the non-cognitive characteristics of candidates when selecting candidates for entry into a graduate-entry medical programme using the multiple mini-interview (MMI).
Methods  Data came from a high-stakes admissions procedure. Content validity was assured by using a framework based on international criteria for sampling the behaviours expected of entry-level students. A variance components analysis was used to estimate the reliability and sources of measurement error. Further modelling was used to estimate the optimal configurations for future MMI iterations.
Results  This study refers to 485 candidates, 155 interviewers and 21 questions taken from a pre-prepared bank. For a single MMI question and 1 assessor, 22% of the variance between scores reflected candidate-to-candidate variation. The reliability for an 8-question MMI was 0.7; to achieve 0.8 would require 14 questions. Typical inter-question correlations ranged from 0.08 to 0.38. A disattenuated correlation with the Graduate Australian Medical School Admissions Test (GAMSAT) subsection 'Reasoning in Humanities and Social Sciences' was 0.26.
Conclusions  The MMI is a moderately reliable method of assessment. The largest source of error relates to aspects of interviewer subjectivity, suggesting interviewer training would be beneficial. Candidate performance on 1 question does not correlate strongly with performance on another question, demonstrating the importance of context specificity. The MMI needs to be sufficiently long for precise comparison for ranking purposes. We supported the validity of the MMI by showing a small positive correlation with GAMSAT section scores.
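The reliability figures reported in this abstract can be roughly checked with the Spearman-Brown prophecy formula. This is a simplified stand-in for the variance components modelling the authors describe, not their method. It assumes the 22% candidate-to-candidate variance can be treated as the reliability of a single question with one assessor. The function names are hypothetical.

```python
def spearman_brown(r_single: float, n_questions: float) -> float:
    """Projected reliability of a test made of n_questions parallel questions."""
    return n_questions * r_single / (1.0 + (n_questions - 1.0) * r_single)


def questions_for_target(r_single: float, target: float) -> float:
    """Number of parallel questions needed to reach a target reliability."""
    return target * (1.0 - r_single) / (r_single * (1.0 - target))


# 0.22 is the single-question figure from the abstract, used here as an
# assumed single-question reliability.
R_SINGLE = 0.22

print(round(spearman_brown(R_SINGLE, 8), 2))          # reliability of an 8-question MMI
print(round(questions_for_target(R_SINGLE, 0.8), 1))  # questions needed for 0.8
```

Under these assumptions the 8-question projection comes out at about 0.69 and the length needed for 0.8 at about 14.2 questions, in line with the reported 0.7 and 14.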

3.
Ben-David, Klass, Boulet, Champlain, King, Pohl & Gary. Medical Education, 1999, 33(6): 439-446
OBJECTIVES: The purpose of the study was to explore foreign medical graduates' (FMGs) performance on a clinical skills (SPX) examination. The National Board of Medical Examiners (NBME) is in the process of developing an SPX for potential use in the United States Medical Licensing Examination (USMLE). The Educational Commission for Foreign Medical Graduates (ECFMG) is developing the Clinical Skills Assessment (CSA) as an additional requirement for FMGs who wish to be certified by ECFMG.
DESIGN: Thirty-three FMGs and 151 United States medical students (USMSs) took the SPX during the winter of 1996 as part of the ongoing pilot studies conducted by the NBME. Four clinical skill areas were assessed: history-taking, physical examination, communication and interpersonal skills. The examination used in this research consisted of 12 cases. The examination utilizes standardized patients (SPs) who are trained to document examinee behaviours and evaluate the communication component of the test. The SPs were also trained to evaluate the English proficiency of the candidates. Candidates were also administered the Test of Spoken English developed by the Educational Testing Services (ETS).
SETTING: The examination was conducted in one medical school which served as an SPX centre for NBME pilot studies.
SUBJECTS: Thirty-three foreign medical students and 151 US medical students.
RESULTS: The indications were that the majority of candidates in both groups felt the examination was moderately fair but 78% of FMGs felt moderately pressed for time, vs. 80% of the USMSs who did not feel pressed for time. Reliabilities obtained for the various SPX components were somewhat higher for the FMGs reflecting the heterogeneity of this group.
CONCLUSIONS: The NBME-ECFMG collaborative study yielded important information regarding the NBME SPX prototype as a performance measure for FMGs.

4.
Data, although limited, question the validity of the formula applied for admission to the medical school in which high school grades are the only preselection variable applied. Comparison between two groups of students from two different high school systems in Kuwait was carried out to determine if the admission criteria used currently for entry to the medical school are equally valid for both groups. The results are based on the students' performances in the first three-semester programme of medical sciences. Subjects covered were anatomy, physiology, biochemistry and behavioural sciences. The group derived from the High School National Diploma performed significantly better with a percentage pass rate of 82% while of those who were derived from the Course Credit System only 61% passed the final examination. In addition, only one of the latter group attained total marks of more than 80% compared to 12 students from the National Diploma group. The percentage failure according to subjects was consistently higher among the Course Credit graduates in all the subjects. All differences between the two groups are statistically significant (P < 0.001).

5.
Performance tests are logistically complex and time consuming. To reach adequate reliability long tests are imperative. Additionally, they are very difficult to adapt to the individual learning paths of students, which is necessary in problem-based learning. This study investigates a written alternative to performance-based tests. A Knowledge Test of Skills (KTS) was developed and administered to 380 subjects of various educational levels, including both first-year students and recently graduated doctors. By comparing KTS scores with scores on performance tests strong convergent validity was demonstrated. The KTS failed discriminant validity when compared with a general medical knowledge test. Also the identification of sub-tests discriminating between behavioural and cognitive aspects was not successful. This was due to the interdependence of the constructs measured. The KTS was able to demonstrate differences in ability level and showed subtle changes in response patterns over items, indicating construct validity. It was concluded that the KTS is a valid instrument for predicting performance scores and could very well be applied as supplementary information to performance testing. The relative ease of construction and efficiency makes the KTS a suitable substitute instrument for research purposes. The study also showed that in higher ability levels the concepts which were meant to be measured were highly related, giving evidence to the general factor theory of competence. However, it appeared that this general factor was originally non-existent in first-year students and that these competencies integrate as the educational process develops.

6.
Student-written items were compared with teacher-written items on an objective examination given to first year medical students. While student scores were higher on the student-written items than on teacher-written items, there was a positive correlation between the scores. Student items did not differ from teacher items in the course, according to student ratings of emphasis. As a by-product of this study, correlations were found which suggest that student scores on an item often reflect the degree of teaching emphasis given the content area of the item rather than the inherent difficulty of the content. It is suggested that further research is needed to determine whether students learn through the process of writing examination items. Therefore, if the process proves to be educational, this study indicates that it will be feasible to incorporate the student-constructed items in examinations.

7.
OBJECTIVES: This study was designed to describe the variation in marking tendencies among different examiners in an oral examination.
DESIGN: Marks awarded in a family practice board examination between 1984 and 1996 were analysed, relating to 5328 examination sessions graded by 94 examiners. Examiners were ranked by the rates at which they awarded 'fail', 'pass' or 'distinction' grades. The effects of examiners' gender, experience, academic rank, regional affiliation and country of qualification on examiner behaviour were studied.
SETTING: National Family Medicine Examination Board, Scientific Council, Israel Medical Association.
SUBJECTS: Oral examiners.
RESULTS: Eighteen per cent of examiners were classified as 'tough', being in the lowest tertile for 'distinction' rates and the highest tertile for 'failure' rates; 19% were classified as 'mild'; 52% were 'regular', falling in the middle tertile for both distinction and failure rates. Four per cent of examiners were in the top tertile for both distinctions and failures, labelled 'extremists', and 6% were in the bottom tertile for both, and were labelled 'noncommittal'. Higher failure rates were associated with examiners' academic rank, experience and graduation from an English-speaking medical school.
CONCLUSIONS: Examiners differ significantly in their degree of severity. Those who demonstrate clearly deviant patterns of grading should be withdrawn. Candidates should be presented with a balanced panel of examiners, and a degree of standardization of content should be introduced into oral examinations.

8.
A questionnaire survey was conducted on the nature of the oral examinations in different disciplines in the medical schools in Sri Lanka. A total of 352 students from Peradeniya and Jaffna medical faculties and pre-registration house officers, including Colombo faculty graduates of the two teaching hospitals, responded to the questionnaire. The results of the survey, which included twelve disciplines, reveal that the time duration of the oral encounter ranged from 10 to 20 minutes. The number of questions asked ranged from five to nine. Detailed analysis of the intellectual level of the questions showed that more than 63% of the questioning was at simple recall level and none at the level of problem-solving. These results show that the oral examination in addition to its inherent weakness of low reliability and objectivity also lacks validity in terms of content sampling. Its predictive validity of professional competence, which requires problem-solving skills, is questionable. Content analysis of the items also revealed that all the abilities tested in the orals could best be tested in a pen-and-paper examination or a structured practical or clinical examination.

9.
OBJECTIVE: To evaluate the use of a modified version of the Leicester Assessment Package (LAP) in the formative assessment of the consultation performance of medical students with particular reference to validity, inter-assessor reliability, acceptability, feasibility and educational impact.
DESIGN: 180 third and fourth year Leicester medical students were directly observed consulting with six general practice patients and independently assessed by a pair of assessors. A total of 70 practice and 16 departmental assessors took part. Performance scores were subjected to generalizability analysis and students' views of the assessment were gathered by questionnaire.
RESULTS: Four of the five categories of consultation performance (Interviewing and history taking, Patient management, Problem solving and Behaviour and relationship with patients) were assessed in over 99% of consultations and Physical examination was assessed in 94%. Seventy-six percent of assessors reported that the case mix was 'satisfactory' and 20% that it was 'borderline'; 85% of students believed it to have been satisfactory. Generalizability analysis indicates that two independent assessors assessing the performance of students across six consultations would achieve a reliability of 0.94 in making pass or fail decisions. Ninety-eight percent of students perceived that their particular strengths and weaknesses were correctly identified, 99% that they were given specific advice on how to improve their performance and 98% believed that the feedback they had received would have long-term benefit.
CONCLUSIONS: The modified version of the LAP is valid, reliable and feasible in formative assessment of the consultation performance of medical students. Furthermore, almost all students found the process fair and believed it was likely to lead to improvements in their consultation performance. This approach may also be applicable to regulatory assessment as it accurately identifies students at the pass/fail margin.

10.
Scores on nineteen pre-admission and post-admission performance variables for four classes of medical students were analysed using canonical correlation and regression methods. Detailed interviews with 138 clinical preceptors were included among the criterion variables. National Board Part I scores could be predicted readily from conventional data such as Medical College Admission Test scores and grade point average. However, these same predictors generally correlated negatively with measures of clinical performance. Evidence supports pre-admission interviews and careful analysis of letters of recommendation as useful predictors of the clinical performance variables.
