1.

Developing tailored instruments: item banking and computerized adaptive assessment

Bjorner Jakob Bue Chang Chih-Hung Thissen David Reeve Bryce B. 《Quality of life research》2007,16(1):95-108

Item banks and Computerized Adaptive Testing (CAT) have the potential to greatly improve the assessment of health outcomes. This review describes the unique features of item banks and CAT and discusses how to develop item banks. In CAT, a computer selects the items from an item bank that are most relevant for and informative about the particular respondent, thus optimizing test relevance and precision. Item response theory (IRT) provides the foundation for selecting the items that are most informative for the particular respondent and for scoring responses on a common metric. The development of an item bank is a multi-stage process that requires a clear definition of the construct to be measured, good items, a careful psychometric analysis of the items, and a clear specification of the final CAT. The psychometric analysis needs to evaluate the assumptions of the IRT model such as unidimensionality and local independence; that the items function the same way in different subgroups of the population; and that there is an adequate fit between the data and the chosen item response models. Also, interpretation guidelines need to be established to help the clinical application of the assessment. Although medical research can draw upon expertise from educational testing in the development of item banks and CAT, the medical field also encounters unique opportunities and challenges.
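
The item-selection step described in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a two-parameter logistic (2PL) model for dichotomous items and a maximum-information selection rule, and the function names and the three-item bank are hypothetical, not taken from the review.

```python
import math

def p_2pl(theta, a, b):
    """Probability of endorsing a dichotomous item under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def select_next_item(theta, bank, administered):
    """Return the id of the not-yet-administered item that is most
    informative at the current provisional score theta."""
    candidates = [item_id for item_id in bank if item_id not in administered]
    return max(candidates, key=lambda item_id: item_information(theta, *bank[item_id]))

# Hypothetical bank: item id -> (discrimination a, location b)
bank = {"easy": (1.2, -1.5), "medium": (1.5, 0.0), "hard": (1.0, 1.8)}
```

With a provisional score of 0.0 and nothing administered yet, `select_next_item(0.0, bank, set())` picks the item located at 0.0, because an item is most informative near its own location. A real CAT would re-estimate the score after every response and repeat the selection until a stopping rule is met.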

2.

Assessment of health-related quality of life in arthritis: conceptualization and development of five item banks using item response theory

Jacek A Kopec Eric C Sayre Aileen M Davis Elizabeth M Badley Michal Abrahamowicz Lesley Sherlock J Ivan Williams Aslam H Anis John M Esdaile 《Health and quality of life outcomes》2006,4(1):1-17

Background

Modern psychometric methods based on item response theory (IRT) can be used to develop adaptive measures of health-related quality of life (HRQL). Adaptive assessment requires an item bank for each domain of HRQL. The purpose of this study was to develop item banks for five domains of HRQL relevant to arthritis.

Methods

About 1,400 items were drawn from published questionnaires or developed from focus groups and individual interviews and classified into 19 domains of HRQL. We selected the following 5 domains relevant to arthritis and related conditions: Daily Activities, Walking, Handling Objects, Pain or Discomfort, and Feelings. Based on conceptual criteria and pilot testing, 219 items were selected for further testing. A questionnaire was mailed to patients from two hospital-based clinics and a stratified random community sample. Dimensionality of the domains was assessed through factor analysis. Items were analyzed with the Generalized Partial Credit Model as implemented in Parscale. We used graphical methods and a chi-square test to assess item fit. Differential item functioning was investigated using logistic regression.

Results

Data were obtained from 888 individuals with arthritis. The five domains were sufficiently unidimensional for an IRT-based analysis. Thirty-one items were deleted due to lack of fit or differential item functioning. Daily Activities had the narrowest range for the item location parameter (-2.24 to 0.55) and Handling Objects had the widest range (-1.70 to 2.27). The mean (median) slope parameter for the items ranged from 1.15 (1.07) in Feelings to 1.73 (1.75) in Walking. The final item banks are comprised of 31–45 items each.

Conclusion

We have developed IRT-based item banks to measure HRQL in 5 domains relevant to arthritis. The items in the final item banks provide adequate psychometric information for a wide range of functional levels in each domain.
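
For readers unfamiliar with the Generalized Partial Credit Model named in the Methods above, the following is a minimal sketch of how it turns a slope and a set of step parameters into category probabilities. The step-parameter form used here is one common way of writing the model and is an assumption; software such as Parscale may use a different but related parameterization, and the function name is hypothetical.

```python
import math

def gpcm_probs(theta, a, steps):
    """Category probabilities for one GPCM item.

    theta: respondent's trait level
    a:     slope (discrimination) parameter
    steps: step parameters b_1..b_m; response categories are 0..m
    """
    # Cumulative sums of a * (theta - b_v); the sum for category 0 is 0.
    cum = [0.0]
    for b in steps:
        cum.append(cum[-1] + a * (theta - b))
    # Subtract the maximum before exponentiating for numerical stability.
    mx = max(cum)
    expo = [math.exp(c - mx) for c in cum]
    total = sum(expo)
    return [e / total for e in expo]
```

For a four-category item with steps at -1, 0 and 1, a respondent at `theta = 0` gets a symmetric distribution over the categories, with the two middle categories the most likely.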

3.

Computer adaptive testing improved accuracy and precision of scores over random item selection in a physical functioning item bank

Haley SM Ni P Hambleton RK Slavin MD Jette AM 《Journal of clinical epidemiology》2006,59(11):1174-1182

BACKGROUND AND OBJECTIVE: Measuring physical functioning (PF) within and across postacute settings is critical for monitoring outcomes of rehabilitation; however, most current instruments lack sufficient breadth and feasibility for widespread use. Computer adaptive testing (CAT), in which item selection is tailored to the individual patient, holds promise for reducing response burden, yet maintaining measurement precision. We calibrated a PF item bank via item response theory (IRT), administered items with a post hoc CAT design, and determined whether CAT would improve accuracy and precision of score estimates over random item selection. METHODS: 1,041 adults were interviewed during postacute care rehabilitation episodes in either hospital or community settings. Responses for 124 PF items were calibrated using IRT methods to create a PF item bank. We examined the accuracy and precision of CAT-based scores compared to a random selection of items. RESULTS: CAT-based scores had higher correlations with the IRT-criterion scores, especially with short tests, and resulted in narrower confidence intervals than scores based on a random selection of items; gains, as expected, were especially large for low and high performing adults. CONCLUSION: The CAT design may have important precision and efficiency advantages for point-of-care functional assessment in rehabilitation practice settings.

4.

Simulated computerized adaptive test for patients with lumbar spine impairments was efficient and produced valid measures of function

Hart DL Mioduski JE Werneke MW Stratford PW 《Journal of clinical epidemiology》2006,59(9):947-956

OBJECTIVE: To equate physical functioning (PF) items with Back Pain Functional Scale (BPFS) items, develop a computerized adaptive test (CAT) designed to assess lumbar spine functional status (LFS) in people with lumbar spine impairments, and compare discriminant validity of LFS measures (theta(IRT)) generated using all items analyzed with a rating scale Item Response Theory model (RSM) and measures generated using the simulated CAT (theta(CAT)). METHODS: We performed a secondary analysis of retrospective intake rehabilitation data. RESULTS: Unidimensionality and local independence of 25 BPFS and PF items were supported. Differential item functioning was negligible for levels of symptom acuity, gender, age, and surgical history. The RSM fit the data well. A lumbar spine specific CAT was developed that was 72% more efficient than using all 25 items to estimate LFS measures. theta(IRT) and theta(CAT) measures did not discriminate patients by symptom acuity, age, or gender, but discriminated patients by surgical history in similar clinically logical ways. theta(CAT) measures were as precise as theta(IRT) measures. CONCLUSION: A body part specific simulated CAT developed from an LFS item bank was efficient and produced precise measures of LFS without eroding discriminant validity.

5.

Development and psychometric evaluation of the PROMIS Pediatric Life Satisfaction item banks, child-report, and parent-proxy editions

Forrest Christopher B. Devine Janine Bevans Katherine B. Becker Brandon D. Carle Adam C. Teneralli Rachel E. Moon JeanHee Tucker Carole A. Ravens-Sieberer Ulrike 《Quality of life research》2018,27(1):217-234

6.

Item banking to improve, shorten and computerize self-reported fatigue: An illustration of steps to create a core item bank from the FACIT-Fatigue Scale

Jin-shei Lai David Cella Chih-Hung Chang Rita K. Bode Allen W. Heinemann 《Quality of life research》2003,12(5):485-501

Fatigue is a common symptom among cancer patients and the general population. Due to its subjective nature, fatigue has been difficult to effectively and efficiently assess. Modern computerized adaptive testing (CAT) can enable precise assessment of fatigue using a small number of items from a fatigue item bank. CAT enables brief assessment by selecting questions from an item bank that provide the maximum amount of information given a person's previous responses. This article illustrates steps to prepare such an item bank, using 13 items from the Functional Assessment of Chronic Illness Therapy Fatigue Subscale (FACIT-F) as the basis. Samples included 1022 cancer patients and 1010 people from the general population. An Item Response Theory (IRT)-based rating scale model, a polytomous extension of the Rasch dichotomous model, was utilized. Nine items demonstrating acceptable psychometric properties were selected and positioned on the fatigue continuum. The fatigue levels measured by these nine items along with their response categories covered 66.8% of the general population and 82.6% of the cancer patients. Although the operational CAT algorithms to handle polytomously scored items are still in progress, we illustrated how CAT may work by using nine core items to measure level of fatigue. Using this illustration, a fatigue measure comparable to its full-length 13-item scale administration was obtained using four items. The resulting item bank can serve as a core to which further items can be added to build a psychometrically sound and operational item bank covering the entire fatigue continuum.
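
To make "maximum amount of information" concrete for a polytomous Rasch-family item, the sketch below computes category probabilities under a rating scale model and the item information at a given fatigue level. It assumes the usual formulation in which every item shares one set of thresholds and differs only in location, and it uses the property that information for such an item equals the variance of the item score given the trait level. Function names and parameter values are hypothetical and are not taken from the article.

```python
import math

def rsm_probs(theta, location, thresholds):
    """Rating scale model category probabilities (categories 0..m).

    location:   item location on the trait continuum
    thresholds: tau_1..tau_m, shared by all items on the scale
    """
    cum = [0.0]
    for tau in thresholds:
        cum.append(cum[-1] + (theta - location - tau))
    mx = max(cum)  # subtract the maximum for numerical stability
    expo = [math.exp(c - mx) for c in cum]
    total = sum(expo)
    return [e / total for e in expo]

def rsm_information(theta, location, thresholds):
    """Item information at theta, computed as Var(X | theta)."""
    probs = rsm_probs(theta, location, thresholds)
    mean = sum(k * p for k, p in enumerate(probs))
    return sum((k - mean) ** 2 * p for k, p in enumerate(probs))
```

An item located near the respondent's current estimate yields far more information than one located several units away, which is why an adaptive test can reach a usable score with only a few well-chosen items.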

7.

Development and testing of item response theory-based item banks and short forms for eye, skin and lung problems in sarcoidosis

David E. Victorson Seung Choi Marc A. Judson David Cella 《Quality of life research》2014,23(4):1301-1313

Purpose

Sarcoidosis is a multisystem disease that can negatively impact health-related quality of life (HRQL) across generic (e.g., physical, social and emotional wellbeing) and disease-specific (e.g., pulmonary, ocular, dermatologic) domains. Measurement of HRQL in sarcoidosis has largely relied on generic patient-reported outcome tools, with few disease-specific measures available. The purpose of this paper is to present the development and testing of disease-specific item banks and short forms of lung, skin and eye problems, which are a part of a new patient-reported outcome (PRO) instrument called the sarcoidosis assessment tool.

Methods

After prioritizing and selecting the most important disease-specific domains, we wrote new items to reflect disease-specific problems by drawing from patient focus group and clinician expert survey data that were used to create our conceptual model of HRQL in sarcoidosis. Item pools underwent cognitive interviews by sarcoidosis patients (n = 13), and minor modifications were made. These items were administered in a multi-site study (n = 300) to obtain item calibrations and create calibrated short forms using item response theory (IRT) approaches.

Results

From the available item pools, we created four new item banks and short forms: (1) skin problems, (2) skin stigma, (3) lung problems, and (4) eye problems. We also created and tested supplemental forms of the most common constitutional symptoms and negative effects of corticosteroids.

Conclusions

Several new sarcoidosis-specific PROs were developed and tested using IRT approaches. These new measures can advance more precise and targeted HRQL assessment in sarcoidosis clinical trials and clinical practice.

8.

Scoring based on item response theory did not alter the measurement ability of EORTC QLQ-C30 scales

Petersen MA Groenvold M Aaronson N Brenne E Fayers P Nielsen JD Sprangers M Bjorner JB; European Organisation for Research Treatment of Cancer Quality of Life Group 《Journal of clinical epidemiology》2005,58(9):902-908

BACKGROUND AND OBJECTIVES: Most health-related quality-of-life questionnaires include multi-item scales. Scale scores are usually estimated as simple sums of the item scores. However, scoring procedures utilizing more information from the items might improve measurement abilities, and thereby reduce the needed sample sizes. We investigated whether item response theory (IRT)-based scoring improved the measurement abilities of the EORTC QLQ-C30 physical functioning, emotional functioning, and fatigue scales. METHODS: Using a database of 13,010 subjects we estimated the relative validities of IRT scoring compared to sum scoring of the scales. RESULTS: The mean relative validities were 1.04 (physical), 1.03 (emotional), and 0.97 (fatigue). None of these were significantly larger than 1. Thus, no gain in measurement abilities using IRT scoring was found for these scales. Possible explanations include that the items in the scales are not constructed for IRT scoring and that the scales are relatively short. CONCLUSION: IRT scoring of the three longest EORTC QLQ-C30 scales did not improve measurement abilities compared to the traditional sum scoring of the scales.
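
The abstract does not say how relative validity was computed. One common definition, assumed here, is the ratio of the test statistics that two scoring methods produce for the same known-groups comparison, with the reference method in the denominator. The sketch below implements that ratio with a one-way ANOVA F statistic; the function names and the toy data are hypothetical.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of score lists (one per group)."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def relative_validity(groups_new, groups_reference):
    """Ratio of F statistics: values above 1 favour the new scoring method."""
    return anova_f(groups_new) / anova_f(groups_reference)
```

Under this definition a value of 1.04 would mean that IRT scores separated the comparison groups only marginally better than sum scores, which matches the abstract's conclusion that no meaningful gain was found.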

9.

Patient-reported outcomes measurement information system (PROMIS) domain names and definitions revisions: further evaluation of content validity in IRT-derived item banks

William T. Riley Nan Rothrock Bonnie Bruce Christopher Christodolou Karon Cook Elizabeth A. Hahn David Cella 《Quality of life research》2010,19(9):1311-1321

Purpose

Content validity of patient-reported outcomes (PROs) is evaluated primarily during item development, but subsequent psychometric analyses, particularly for item response theory (IRT)-derived scales, often result in considerable item pruning and potential loss of content. After selecting items for the PROMIS banks based on psychometric and content considerations, we invited external content expert reviews of the degree to which the initial domain names and definitions represented the calibrated item bank content.

10.

PROMIS® Parent Proxy Report Scales for children ages 5–7 years: an item response theory analysis of differential item functioning across age groups

James W. Varni David Thissen Brian D. Stucky Yang Liu Brooke Magnus Hally Quinn Debra E. Irwin Esi Morgan DeWitt Jin-Shei Lai Dagmar Amtmann Heather E. Gross Darren A. DeWalt 《Quality of life research》2014,23(1):349-361

Objective

The objective of the present study is to describe the extension of the National Institutes of Health Patient-Reported Outcomes Measurement Information System (PROMIS®) pediatric parent proxy-report item banks for parents of children ages 5–7 years, and to investigate differential item functioning (DIF) between the data obtained from parents of 5–7-year-old children and the data obtained from parents of 8–17-year-old children in the original construction of the scales.

Methods

Item response theory (IRT) analyses of DIF were conducted comparing data from the 5–7 age group with data from the established scales for ages 8–17 across 5 generic health domains (physical functioning, pain, fatigue, emotional health, and social health) and asthma.

Results

IRT DIF analyses revealed that the majority of the items functioned similarly with responses from parents of younger and older children. A small number of items were removed from the item bank for younger children, and a few items that exhibited statistical DIF were retained in the pools with the caveat that they should not be used in studies that involve comparisons of younger children with older children.

Conclusions

The study confirms that most of the items in the PROMIS parent proxy-report item banks can be used with parents of children ages 5–7. It is anticipated that these new scales will have application for younger pediatric populations when pediatric self-report is not feasible.

11.

Simulated computerized adaptive tests for measuring functional status were efficient with good discriminant validity in patients with hip, knee, or foot/ankle impairments

Hart DL Mioduski JE Stratford PW 《Journal of clinical epidemiology》2005,58(6):629-638

BACKGROUND AND OBJECTIVE: To develop computerized adaptive tests (CATs) designed to assess lower extremity functional status (FS) in people with lower extremity impairments using items from the Lower Extremity Functional Scale and compare discriminant validity of FS measures generated using all items analyzed with a rating scale Item Response Theory model (theta(IRT)) and measures generated using the simulated CATs (theta(CAT)). METHODS: Secondary analysis of retrospective intake rehabilitation data. RESULTS: Unidimensionality of items was strong, and local independence of items was adequate. Differential item functioning (DIF) affected item calibration related to body part, that is, hip, knee, or foot/ankle, but DIF did not affect item calibration for symptom acuity, gender, age, or surgical history. Therefore, patients were separated into three body part specific groups. The rating scale model fit all three data sets well. Three body part specific CATs were developed: each was 70% more efficient than using all LEFS items to estimate FS measures. theta(IRT) and theta(CAT) measures discriminated patients by symptom acuity, age, and surgical history in similar ways. theta(CAT) measures were as precise as theta(IRT) measures. CONCLUSION: Body part-specific simulated CATs were efficient and produced precise measures of FS with good discriminant validity.

12.

Measuring social health in the patient-reported outcomes measurement information system (PROMIS): item bank development and testing

Elizabeth A. Hahn Robert F. DeVellis Rita K. Bode Sofia F. Garcia Liana D. Castel Susan V. Eisen Hayden B. Bosworth Allen W. Heinemann Nan Rothrock David Cella 《Quality of life research》2010,19(7):1035-1044

Purpose

To develop a social health measurement framework, to test items in diverse populations and to develop item response theory (IRT) item banks.

Methods

A literature review guided framework development of Social Function and Social Relationships sub-domains. Items were revised based on patient feedback, and Social Function items were field-tested. Analyses included exploratory factor analysis (EFA), confirmatory factor analysis (CFA), two-parameter IRT modeling and evaluation of differential item functioning (DIF).

Results

The analytic sample included 956 general population respondents who answered 56 Ability to Participate and 56 Satisfaction with Participation items. EFA and CFA identified three Ability to Participate sub-domains. However, because of positive and negative wording, and content redundancy, many items did not fit the IRT model, so item banks do not yet exist. EFA, CFA and IRT identified two preliminary Satisfaction item banks. One item exhibited trivial age DIF.

Conclusion

After extensive item preparation and review, EFA-, CFA- and IRT-guided item banks help provide increased measurement precision and flexibility. Two Satisfaction short forms are available for use in research and clinical practice. This initial validation study resulted in revised item pools that are currently undergoing testing in new clinical samples and populations.

13.

Psychometric evaluation of an item bank for computerized adaptive testing of the EORTC QLQ-C30 cognitive functioning dimension in cancer patients

Linda Dirven Mogens Groenvold Martin J. B. Taphoorn Thierry Conroy Krzysztof A. Tomaszewski Teresa Young Morten Aa. Petersen 《Quality of life research》2017,26(11):2919-2929

Background

The European Organisation of Research and Treatment of Cancer (EORTC) Quality of Life Group is developing computerized adaptive testing (CAT) versions of all EORTC Quality of Life Questionnaire (QLQ-C30) scales with the aim to enhance measurement precision. Here we present the results on the field-testing and psychometric evaluation of the item bank for cognitive functioning (CF).

Methods

In previous phases (I–III), 44 candidate items were developed measuring CF in cancer patients. In phase IV, these items were psychometrically evaluated in a large sample of international cancer patients. This evaluation included an assessment of dimensionality, fit to the item response theory (IRT) model, differential item functioning (DIF), and measurement properties.

Results

A total of 1030 cancer patients completed the 44 candidate items on CF. Of these, 34 items could be included in a unidimensional IRT model, showing an acceptable fit. Although several items showed DIF, these had a negligible impact on CF estimation. Measurement precision of the item bank was much higher than that of the two original QLQ-C30 CF items alone, across the whole continuum. Moreover, CAT measurement may on average reduce study sample sizes by about 35–40% compared to the original QLQ-C30 CF scale, without loss of power.

Conclusion

A CF item bank for CAT measurement consisting of 34 items was established, applicable to various cancer patients across countries. This CAT measurement system will facilitate precise and efficient assessment of HRQOL of cancer patients, without loss of comparability of results.

14.

Development of a vision-targeted health-related quality of life item measure

Sylvia H. Paz Jerry Slotkin Roberta McKean-Cowdin Paul Lee Cynthia Owsley Susan Vitale Rohit Varma Richard Gershon Ron D. Hays 《Quality of life research》2013,22(9):2477-2487

Purpose

To develop a vision-targeted health-related quality of life (HRQOL) measure for the NIH Toolbox for the Assessment of Neurological and Behavioral Function.

Methods

We conducted a review of existing vision-targeted HRQOL surveys and identified color vision, low luminance vision, distance vision, general vision, near vision, ocular symptoms, psychosocial well-being, and role performance domains. Items in existing survey instruments were sorted into these domains. We selected non-redundant items and revised them to improve clarity and to limit the number of different response options. We conducted 10 cognitive interviews to evaluate the items. Finally, we revised the items and administered them to 819 individuals to calibrate the items and estimate the measure’s reliability and validity.

Results

The field test provided support for the 53-item vision-targeted HRQOL measure encompassing 6 domains: color vision, distance vision, near vision, ocular symptoms, psychosocial well-being, and role performance. The domain scores had high levels of reliability (coefficient alphas ranged from 0.848 to 0.940). Validity was supported by high correlations between National Eye Institute Visual Function Questionnaire scales and the new-vision-targeted scales (highest values were 0.771 between psychosocial well-being and mental health, and 0.729 between role performance and role difficulties), and by lower mean scores in those groups self-reporting eye disease (F statistic with p < 0.01 for all comparisons except cataract with ocular symptoms, psychosocial well-being, and role performance scales).

Conclusions

This vision-targeted HRQOL measure provides a basis for comprehensive assessment of the impact of eye diseases and treatments on daily functioning and well-being in adults.
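
The reliability figures reported in the Results above are coefficient alphas. As a reminder of what that statistic measures, here is a minimal sketch of the standard Cronbach's alpha formula, assuming complete data with one row of item scores per respondent. The function name and the toy data are hypothetical.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for complete item-response data.

    rows: one list of item scores per respondent, all of equal length.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Two items that rank respondents identically give an alpha of 1.0, while items that disagree pull the value down; values in the 0.85 to 0.94 range, as reported here, indicate that the items within each domain covary strongly.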

15.

Practical and philosophical issues surrounding a national item bank: if we build it will they come?

Revicki Dennis A. Sloan Jeff 《Quality of life research》2007,16(1):167-174

Item response theory (IRT), item banking and computer adaptive testing (CAT) methods have the potential to provide novel platforms for the collection, analysis and dissemination of patient data on health status and well-being. There are considerable challenges associated with building and maintaining a national item bank and it is uncertain whether there is sufficient interest among key stakeholders for IRT-based and CAT measures. The most convincing activity is demonstrating that the approach is feasible, psychometrically sound and useful in different specific applications. Demonstrated success opens up the possibility of more widespread acceptability and application. As part of the development effort, there need to be continued meetings and discussion with psychometricians, instrument developers, clinical researchers, the FDA, pharmaceutical industry researchers and managed care organizations about the advantages and disadvantages of a national item bank.

16.

Development of computerized adaptive testing (CAT) for the EORTC QLQ-C30 physical functioning dimension

Morten Aa. Petersen Mogens Groenvold Neil K. Aaronson Wei-Chu Chie Thierry Conroy Anna Costantini Peter Fayers Jorunn Helbostad Bernhard Holzner Stein Kaasa Susanne Singer Galina Velikova Teresa Young 《Quality of life research》2011,20(4):479-490

Purpose

Computerized adaptive test (CAT) methods, based on item response theory (IRT), enable a patient-reported outcome instrument to be adapted to the individual patient while maintaining direct comparability of scores. The EORTC Quality of Life Group is developing a CAT version of the widely used EORTC QLQ-C30. We present the development and psychometric validation of the item pool for the first of the scales, physical functioning (PF).

Methods

Initial developments (including literature search and patient and expert evaluations) resulted in 56 candidate items. Responses to these items were collected from 1,176 patients with cancer from Denmark, France, Germany, Italy, Taiwan, and the United Kingdom. The items were evaluated with regard to psychometric properties.

Results

Evaluations showed that 31 of the items could be included in a unidimensional IRT model with acceptable fit and good content coverage, although the pool may lack items at the upper extreme (good PF). There were several findings of significant differential item functioning (DIF). However, the DIF findings appeared to have little impact on the PF estimation.

Conclusions

We have established an item pool for CAT measurement of PF and believe that this CAT instrument will clearly improve the EORTC measurement of PF.
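
The claim that a CAT keeps scores directly comparable rests on IRT scoring: whichever items a patient answers, the score is an estimate of the same latent trait. The sketch below shows one way such a score can be computed, using expected a posteriori (EAP) estimation over a fixed grid. The 2PL model for dichotomous items, the standard normal prior, the grid and all names are assumptions made for illustration; the abstract does not state which scoring method or model the EORTC CAT uses.

```python
import math

def eap_estimate(responses, items, n_points=81):
    """EAP trait estimate and posterior SD for dichotomous 2PL items.

    responses: list of 0/1 answers
    items:     list of (a, b) parameters for the items that were answered
    """
    grid = [-4.0 + 8.0 * i / (n_points - 1) for i in range(n_points)]
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # standard normal prior, unnormalised
        for x, (a, b) in zip(responses, items):
            p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
            w *= p if x == 1 else 1.0 - p   # likelihood of the observed answer
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)
```

Two patients who answer different subsets of the pool still receive estimates on the same scale, and the posterior SD returned alongside each estimate indicates how precise that particular score is.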

17.

Cross-cultural evaluation of health status using item response theory: FACT-B comparisons between Austrian and U.S. patients with breast cancer

Hahn EA Holzner B Kemmler G Sperner-Unterweger B Hudgens SA Cella D 《Evaluation & the health professions》2005,28(2):233-259

To make meaningful cross-cultural comparisons of health-related quality of life (HRQOL) or to pool international research data, it is essential to create culturally unbiased measures that detect clinically important differences between patients. We evaluated the measurement properties of the Functional Assessment of Cancer Therapy-Breast (FACT-B) in 111 Austrian and 144 U.S. patients with breast cancer using item response theory (IRT) methods. A small number of items were identified as displaying statistically significant differential item functioning (DIF), suggesting possible measurement bias. The majority of the items functioned similarly between the two cultural groups. U.S. patients reported lower (worse) physical function and well-being compared with Austrian patients, higher (better) social/family well-being and similar emotional well-being, before and after adjustment for DIF. IRT and related measurement models provide useful methods for assessing cross-cultural equivalence and determining which items can be pooled across languages before analyzing HRQOL data. Determination of clinically significant cross-cultural differences will require additional investigation.

18.

Evaluation of a preliminary physical function item bank supported the expected advantages of the Patient-Reported Outcomes Measurement Information System (PROMIS)

Rose M Bjorner JB Becker J Fries JF Ware JE 《Journal of clinical epidemiology》2008,61(1):17-33

OBJECTIVE: The Patient-Reported Outcomes Measurement Information System (PROMIS) was initiated to improve precision, reduce respondent burden, and enhance the comparability of health outcomes measures. We used item response theory (IRT) to construct and evaluate a preliminary item bank for physical function assuming four subdomains. STUDY DESIGN AND SETTING: Data from seven samples (N=17,726) using 136 items from nine questionnaires were evaluated. A generalized partial credit model was used to estimate item parameters, which were normed to a mean of 50 (SD=10) in the US population. Item bank properties were evaluated through Computerized Adaptive Test (CAT) simulations. RESULTS: IRT requirements were fulfilled by 70 items covering activities of daily living, lower extremity, and central body functions. The original item context partly affected parameter stability. Items on upper body function, and need for aid or devices did not fit the IRT model. In simulations, a 10-item CAT eliminated floor and decreased ceiling effects, achieving a small standard error (< 2.2) across scores from 20 to 50 (reliability >0.95 for a representative US sample). This precision was not achieved over a similar range by any comparable fixed length item sets. CONCLUSION: The methods of the PROMIS project are likely to substantially improve measures of physical function and to increase the efficiency of their administration using CAT.
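
The two precision figures in the Results above, a standard error below 2.2 and a reliability above 0.95, are linked by simple arithmetic once scores are on a metric with SD 10. The sketch below assumes the usual approximation that reliability equals one minus the ratio of error variance to population score variance, and a linear rescaling from a standardized trait value to the mean-50, SD-10 metric. Both function names are hypothetical.

```python
def t_score(theta, mean=50.0, sd=10.0):
    """Rescale a standardized trait estimate to a mean-50, SD-10 metric."""
    return mean + sd * theta

def reliability_from_se(se, sd=10.0):
    """Approximate reliability implied by a standard error on a metric
    whose population standard deviation is sd: 1 - se^2 / sd^2."""
    return 1.0 - (se / sd) ** 2
```

With `se = 2.2` and `sd = 10`, the implied reliability is 1 - 4.84/100 = 0.9516, which is where the "reliability >0.95" figure in the abstract comes from under this approximation.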

19.

Validation of a computer-adaptive test to evaluate generic health-related quality of life

Pablo Rebollo Ignacio Castejón Jesús Cuervo Guillermo Villa Eduardo García-Cueto Helena Díaz-Cuervo Pilar C Zardaín José Muñiz Jordi Alonso the Spanish CAT-Health Research Group 《Health and quality of life outcomes》2010,8(1):147

Background

Health Related Quality of Life (HRQoL) is a relevant variable in the evaluation of health outcomes. Questionnaires based on Classical Test Theory typically require a large number of items to evaluate HRQoL. Computer Adaptive Testing (CAT) can be used to reduce test length while maintaining and, in some cases, improving accuracy. This study aimed at validating a CAT based on Item Response Theory (IRT) for evaluation of generic HRQoL: the CAT-Health instrument.

20.

Methodological issues for building item banks and computerized adaptive scales

Thissen David Reeve Bryce B. Bjorner Jakob Bue Chang Chih-Hung 《Quality of life research》2007,16(1):109-119

This paper reviews important methodological considerations for developing item banks and computerized adaptive scales (commonly called computerized adaptive tests in the educational measurement literature, yielding the acronym CAT), including issues of the reference population, dimensionality, dichotomous versus polytomous response scales, differential item functioning (DIF) and conditional scoring, mode effects, the impact of local dependence, and innovative approaches to assessment using CATs in health outcomes research.