Similar literature
18 similar articles found.
1.
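Objective: To explore the feasibility of computerized adaptive testing (CAT) for the MMPI Paranoia (Pa) subscale and to examine CAT results in the general population and in patients. Methods: First, the unidimensionality of the Pa subscale was tested within an item response theory (IRT) framework; second, items were selected on the basis of IRT to form Pa-D, a shortened version for use in CAT; finally, the feasibility of CAT was verified in patients and in the general population. Results: (1) Both Pa and Pa-D satisfied the unidimensionality assumption, so their items could be analyzed with IRT. (2) The item parameters of Pa-D fell within a reasonable and acceptable range. Conclusion: Both Pa and Pa-D are unidimensional, and deleting items did not noticeably affect test information. Post-hoc CAT simulation can save half of the items.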
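To make the idea of a post-hoc CAT simulation concrete, the following is a minimal sketch, not the authors' procedure. It assumes a two-parameter logistic (2PL) model, maximum-information item selection, EAP scoring with a standard normal prior, and a standard-error stopping rule; none of these choices is stated in the abstract, and all function names and the threshold `se_stop=0.4` are hypothetical.

```python
import math

def p_2pl(theta, a, b):
    """Probability of a keyed response under the 2PL model (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Quadrature grid for theta from -4 to 4 in steps of 0.1.
GRID = [g / 10.0 for g in range(-40, 41)]

def eap_estimate(administered, items):
    """EAP estimate of theta and its posterior SD (standard normal prior)."""
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalized N(0, 1) prior density
        for idx, x in administered:
            a, b = items[idx]
            p = p_2pl(t, a, b)
            w *= p if x == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)

def simulate_cat(full_responses, items, se_stop=0.4, max_items=None):
    """Replay one respondent's recorded answers as if given adaptively.

    full_responses: 0/1 answers to every item in the full scale.
    items: list of (a, b) item parameters, same order as the answers.
    Returns (theta estimate, standard error, number of items used).
    """
    max_items = max_items or len(items)
    remaining = list(range(len(items)))
    administered = []
    theta, se = 0.0, float("inf")
    while remaining and len(administered) < max_items and se > se_stop:
        # Pick the unused item that is most informative at the current theta.
        nxt = max(remaining, key=lambda i: item_information(theta, *items[i]))
        remaining.remove(nxt)
        administered.append((nxt, full_responses[nxt]))
        theta, se = eap_estimate(administered, items)
    return theta, se, len(administered)
```

Running `simulate_cat` over every respondent in an existing data set and averaging the third return value gives the kind of "items saved" figure that post-hoc simulations report; the saving depends heavily on the chosen stopping threshold.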

2.
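Objective: To examine the measurement precision, validity, and testing efficiency of the computerized adaptive version of the Eysenck Personality Questionnaire for adults (EPQ-A-CAT) in practical use. Methods: A total of 204 university students completed the EPQ-A-CAT; 120 of them took both the CAT and the paper-and-pencil versions, and 31 were retested with the CAT version. Results: The mean standard error of estimation of the CAT version was below the minimum error preset by the stopping rule on all four subscales; the CAT test-retest correlations were 0.79 (E), 0.85 (N), 0.67 (P), and 0.71 (L); overall, the CAT version saved more than half of the items, with savings of 30% to 70% across subscales; the product-moment correlations between the two versions were 0.85, 0.81, 0.75, and 0.74 for the E, N, P, and L subscales, respectively. Conclusion: The results indicate that the EPQ-A-CAT is an efficient, reliable, and valid measurement tool.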

3.
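Design of a paper-and-pencil test of movement stability and examination of its reliability and validity   (total citations: 1; self-citations: 0; citations by others: 1)
Objective: To design a paper-and-pencil test of hand movement stability and to examine its reliability and validity. Methods: The test was designed on the measurement principle of the nine-hole steadiness tester. It consists mainly of a dedicated test sheet and a pen. The sheet is printed with bar-shaped frames of nine different widths, each width corresponding to the diameter of one hole of the nine-hole tester; there are three frames of each width, 27 in total. The pen is a soft-tip signing pen. The test was administered to 125 participants: 66 took part in the test-retest reliability check, and 59 in the check of agreement with nine-hole tester results. Results: The test-retest reliability of the paper-and-pencil test was 0.70 (P < 0.001); the correlation between paper-and-pencil scores and nine-hole tester scores was 0.65 (P < 0.001). Conclusion: The new paper-and-pencil test of movement stability agrees fairly well with the nine-hole tester and can be used to measure hand movement stability.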

4.
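Objective: To develop an occupational burnout test. Methods: The occupational burnout test, the Eysenck Personality Questionnaire short scale (Chinese version), the revised short form of the Minnesota Satisfaction Questionnaire, and the General Self-Efficacy Scale were administered to 162 sales staff. Results: The occupational burnout test comprises two factors, exhaustion and anxiety, which show complex relationships with personality, job satisfaction, self-efficacy, and sales performance ranking. Conclusion: The occupational burnout test has good reliability and validity and is suitable for practical measurement.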

5.
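Objective: To analyze Eysenck Personality Questionnaire (EPQ) results and the drawing characteristics of the tree-drawing projective personality test among medical staff, in order to inform psychological interventions for this group. Methods: The EPQ and the tree-drawing projective personality test were administered to 302 medical staff from tertiary and secondary hospitals in Beijing; results were compared with norms and analyzed for correlations. Results: Compared with the North China norm, medical staff showed low naivety (men: t = 4.12, P < 0.01; women: t = 9.36, P < 0.01) and high psychoticism (women: t = 2.18, P < 0.05). On 9 categories of tree-drawing indicators, including crown, lushness, and branches, medical staff showed aggressive characteristics, which correlated significantly and positively with P scale scores (r = 0.40, P < 0.01); aggression differed significantly between high and low scorers on the P scale (t = 23.62, P < 0.01). Conclusion: The tree-drawing personality test has considerable value in the personality assessment of medical staff, who show high psychoticism and, in the tree-drawing projective test, characteristics such as high aggression.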

6.
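Objective: To introduce the Five-Factor Borderline Inventory (short form) developed by Deshong et al. and to test the reliability and validity of its Chinese version in a sample of Chinese university students. Methods: Using cluster sampling, 942 university students were tested in groups; the Personality Diagnostic Questionnaire, the Symptom Checklist, the Eysenck Personality Questionnaire, and the Barratt Impulsiveness Scale served as criterion measures. Results: Confirmatory factor analysis showed good fit: χ²/df = 2.89, RMSEA = 0.06, SRMR = 0.05, CFI = 0.96, RFI = 0.94, IFI = 0.96. The correlations of the inventory with the borderline personality disorder subscale of the Personality Diagnostic Questionnaire, the Symptom Checklist, the neuroticism subscale of the Eysenck Personality Questionnaire, and the Barratt Impulsiveness Scale were 0.73, 0.72, 0.70, and 0.55, respectively (P < 0.01). The internal consistency coefficient of the total score was 0.96, split-half reliability was 0.93, and test-retest reliability after 4 weeks was 0.77 (n = 74); the internal consistency coefficients of the subscales ranged from 0.52 to 0.84. Conclusion: The Chinese version of the Five-Factor Borderline Inventory (short form) has good reliability and validity and can serve as a valid tool for measuring borderline personality traits.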

7.
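Development of an IRT-based mathematics achievement test for third graders   (total citations: 1; self-citations: 0; citations by others: 1)
Objective: To develop a third-grade mathematics achievement test under the guidance of item response theory, as an auxiliary tool for evaluating subject learning. Methods: Bejar's method was used to test the unidimensionality of the pilot data, the ANOTE software was used to estimate item parameters, and an anchor-test design was used for parameter equating. Results: All four test forms met the unidimensionality requirement and broadly fitted the three-parameter logistic model, with item fit of about 90% in each case; 52 items were finally selected to form the formal test. Conclusion: The information of the formal test reached 31.1548 and the standard error of estimation was 0.1267, which meets the requirements of item response theory.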
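For reference, the three-parameter logistic model named in this abstract is conventionally written as below. The symbols are the usual textbook notation rather than anything given in the abstract: $\theta$ is the examinee's ability, $a_i$, $b_i$, and $c_i$ are the discrimination, difficulty, and pseudo-guessing parameters of item $i$, and $D$ is a scaling constant (commonly 1.702 or 1, depending on the convention; which one the study used is not stated).

```latex
P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + \exp\!\left[-D\,a_i\,(\theta - b_i)\right]}
```

Setting $c_i = 0$ recovers the two-parameter model, and additionally fixing $a_i$ to a common value gives the one-parameter model.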

8.
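Relationship between university students' scores on the lie scales of personality questionnaires and response time   (total citations: 1; self-citations: 0; citations by others: 1)
Objective: To explore the relationship between university students' scores on the lie scales of personality questionnaires and their response times, and to verify the relationship between lie scale scores and other personality dimensions. Methods: The lie scales of the Eysenck Personality Questionnaire and the Minnesota Multiphasic Personality Inventory were administered to 132 university students. Results: Lie scale scores were positively correlated with response time (P < 0.05), and participants with high lie scale scores took significantly longer to respond than those with low scores (P < 0.01). In addition, lie scale scores were negatively correlated with neuroticism (P < 0.01) and with psychoticism (P < 0.01). Conclusion: This study identified a relationship between response time and lie scale scores, which lays some groundwork for introducing direct measurement techniques into traditional scale-based testing.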

9.
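Objective: To analyze the time effect of positively and reverse-worded items in a personality test, to explore how item wording affects test validity and quality, and to provide a reference for test development and research. Methods: A total of 1,230 soldiers completed a computer-administered Eysenck Personality Questionnaire; the time each participant needed for each item was recorded, and the mean times for positively and reverse-worded items were compared. Results: Overall, the mean time for reverse-worded items was significantly higher than for positively worded items (P < 0.01); among scored items, the mean time for reverse-worded items was significantly lower than for positively worded items (P < 0.01); among unscored items, the mean time for reverse-worded items was significantly higher than for positively worded items (P < 0.01). Conclusion: Introducing reverse-worded items can reduce response bias in personality tests and improve validity, but it also affects respondents' mental processing to some degree. Test developers should therefore balance positively and reverse-worded items while keeping the wording of reverse items as concise and clear as possible to limit its effect on responding.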

10.
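Objective: To develop the Secondary School Mental Health Education Satisfaction Evaluation questionnaire (SMHESES) and test its reliability and validity. Methods: Based on the single-factor satisfaction model of Lee et al., a 10-item questionnaire was formed through interviews, expert discussion, and item analysis. Combining online and traditional paper-and-pencil administration, a convenience sample of 907 students from 19 secondary schools in Hubei and Fujian provinces was recruited (471 of them surveyed online) and randomly split into two parts: one (n = 453) for exploratory factor analysis and the other (n = 454) for confirmatory factor analysis. Multi-group structural equation modeling was used to examine the convergent and cross-sample validity of the questionnaire, and its association with the Chinese Middle School Student Mental Health Scale (MMHI-60) was tested. Results: Exploratory factor analysis extracted a single factor with a cumulative contribution rate of 57.51%; item loadings ranged from 0.62 to 0.83. Confirmatory factor analysis supported the structure (χ² = 124.06, CFI = 0.96, NFI = 0.94, RFI = 0.92, IFI = 0.96, TLI = 0.94, RMSEA = 0.07). The questionnaire and all of its items showed equality of measurement coefficients (χ² = 165.09, CFI = 0.94, NFI = 0.92, RFI = 0.94, IFI = 0.94, TLI = 0.94, RMSEA = 0.08), and the model showed good invariance across samples (boys vs. girls, paper vs. online). Cronbach's α was 0.91, and test-retest reliability after 1 month was 0.84. The questionnaire total score correlated -0.31 with the MMHI-60 total score and between -0.19 and -0.30 with the MMHI-60 dimensions (all P < 0.01). Conclusion: The questionnaire has good reliability and validity, and the paper-and-pencil and online versions are equivalent, but further testing in larger samples is still needed.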
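As a reminder of what the reported Cronbach's α summarizes, here is a minimal sketch of the standard formula, α = k/(k-1) · (1 - Σ item variances / variance of total scores). It is illustrative only: the function name and the input layout (one list of item scores per respondent) are assumptions, and the study's own data are not reproduced here.

```python
def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score table.

    scores: list of rows, one per respondent, each a list of k item scores.
    Uses sample variances (n - 1 in the denominator).
    """
    k = len(scores[0])

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    # Variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)
```

The function assumes complete data and a non-zero variance of total scores; a real analysis would also need to handle missing responses and reverse-keyed items.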

11.
ABSTRACT: BACKGROUND: Computerized adaptive testing (CAT) is being applied to health outcome measures developed as paper-and-pencil (P&P) instruments. Differences in how respondents answer items administered by CAT vs. P&P can increase error in CAT-estimated measures if not identified and corrected. METHOD: Two methods for detecting item-level mode effects are proposed using Bayesian estimation of posterior distributions of item parameters: (1) a modified robust Z (RZ) test, and (2) 95% credible intervals (CrI) for the CAT-P&P difference in item difficulty. A simulation study was conducted under the following conditions: (1) data-generating model (one- vs. two-parameter IRT model); (2) moderate vs. large DIF sizes; (3) percentage of DIF items (10% vs. 30%), and (4) mean difference in theta estimates across modes of 0 vs. 1 logits. This resulted in a total of 16 conditions with 10 generated datasets per condition. RESULTS: Both methods evidenced good to excellent false positive control, with RZ providing better control of false positives and with slightly higher power for CrI, irrespective of measurement model. False positives increased when items were very easy to endorse and when there were mode differences in mean trait level. True positives were predicted by CAT item usage, absolute item difficulty and item discrimination. RZ outperformed CrI, due to better control of false positive DIF. CONCLUSIONS: Whereas false positives were well controlled, particularly for RZ, power to detect DIF was suboptimal. Research is needed to examine the robustness of these methods under varying prior assumptions concerning the distribution of item and person parameters and when data fail to conform to prior assumptions. False identification of DIF when items were very easy to endorse is a problem warranting additional investigation.

12.
This article reports on the development of short forms from the Patient-Reported Outcomes Measurement Information System (PROMIS®) Sleep Disturbance (SD) and Sleep-Related Impairment (SRI) item banks. Results from post-hoc computerized adaptive testing (CAT) simulations, item discrimination parameters, item means, and clinical judgments were used to select the best-performing 8 items for SD and SRI. The final 8-item short forms provided less test information than the corresponding full banks, but correlated strongly with the longer forms. The short forms had greater measurement precision than the Pittsburgh Sleep Quality Index (PSQI) and the Epworth Sleepiness Scale (ESS), as indicated by larger test information values across the continuum of severity, despite having fewer total items--a major advantage for both research and clinical settings.

13.
BACKGROUND: Medical history taking as well as Chlamydia antibody titre (CAT) testing are currently used in the selection of patients for diagnostic laparoscopy with tubal patency testing. Most research has focused on the predictive value of CAT in isolation from medical history. We assessed therefore whether the combination of medical history and CAT improves the efficiency of selecting patients for laparoscopy as compared to the use of either medical history or CAT. METHODS: Data of 207 consecutive subfertile women were used to create multivariable logistic regression models for the prediction of tubal disease as diagnosed by diagnostic laparoscopy. RESULTS: The model with data of medical history only had an area under the receiver operating characteristic curve (AUC) of 0.65 (95% CI 0.56-0.74). Addition of CAT increased the AUC to 0.70 (95% CI 0.62-0.78) (P = 0.065). CAT was positive in 40 women and showed a sensitivity of 0.37 (95% CI 0.26-0.49) for a specificity of 0.88 (95% CI 0.82-0.93). In CAT positive women, a blank medical history did not decrease the probability of tubal disease. Of the 167 women tested CAT negative, 23 (14%) still had a high probability of disease due to their medical history and 11 of them (48%) showed tubal abnormalities on diagnostic laparoscopy. CONCLUSIONS: CAT testing adds valuable information to a woman's risk profile based on her medical history. The combination of medical history taking and CAT testing has a better yield for diagnosing tubal disease than either of these alone.

14.
BACKGROUND: For the evaluation of tubal function, Chlamydia antibody testing (CAT) has been introduced as a screening test. We compared six CAT screening strategies (five CAT tests and one combination of tests), with respect to their cost-effectiveness, by using IVF pregnancy rate as outcome measure. METHODS: A decision analytic model was developed based on a source population of 1715 subfertile women. The model incorporates hysterosalpingography (HSG), laparoscopy and IVF. IVF pregnancy rates, costs, effects, cost-effectiveness and incremental costs per effect were determined for the six CAT screening strategies. RESULTS: pELISA Medac turned out to be the most cost-effective CAT screening strategy (15 075 per IVF pregnancy), followed by MIF Anilabsystems (15 108). A combination of tests (pELISA Medac and MIF Anilabsystems; 15 127) did not improve the cost-effectiveness of the single strategies. Sensitivity analyses showed that the results are robust for changes in the baseline values of the model parameters. CONCLUSIONS: Only small differences were found between the screening strategies regarding the cost-effectiveness, although pELISA Medac was the most cost-effective strategy. Before introducing a particular CAT test into clinical practice, one should consider the effects and consequences of the entire screening strategy, instead of only the diagnostic accuracy of the test used.

15.
The BioVue column agglutination technology (CAT) was evaluated simultaneously with standard tube test (STT) methodology for use in indirect antiglobulin testing (IAT) and direct antiglobulin testing (DAT). One thousand thirty-five blood specimens were used for the IAT comparison, and 44 blood specimens were used for the DAT comparison. Both polyspecific antiglobulin and anti-IgG antiglobulin reagents were used in the tube testing and the CAT testing. For IAT, sensitivity was 100 percent for CAT and 99.6 percent for STT; sensitivity was 97.9 percent for CAT and 100 percent for STT. In addition, a 67 percent labor savings was realized with CAT versus STT. Specificity and sensitivity of both methodologies were 100 percent for the DAT. BioVue proved to be a reliable and efficient alternative to standard test tube methods for doing IATs and DATs.

16.

Background

The Internet is used increasingly for both suicide research and prevention. To optimize online assessment of suicidal patients, there is a need for short, good-quality tools to assess elevated risk of future suicidal behavior. Computer adaptive testing (CAT) can be used to reduce response burden and improve accuracy, and make the available pencil-and-paper tools more appropriate for online administration.

Objective

The aim was to test whether an item response–based computer adaptive simulation can be used to reduce the length of the Beck Scale for Suicide Ideation (BSS).

Methods

The data used for our simulation was obtained from a large multicenter trial from The Netherlands: the Professionals in Training to STOP suicide (PITSTOP suicide) study. We applied a principal components analysis (PCA), confirmatory factor analysis (CFA), a graded response model (GRM), and simulated a CAT.

Results

The scores of 505 patients were analyzed. Psychometric analyses showed the questionnaire to be unidimensional with good internal consistency. The computer adaptive simulation showed that for the estimation of elevation of risk of future suicidal behavior 4 items (instead of the full 19) were sufficient, on average.

Conclusions

This study demonstrated that CAT can be applied successfully to reduce the length of the Dutch version of the BSS. We argue that the use of CAT can improve accuracy and reduce response burden when assessing the risk of future suicidal behavior online. Because CAT can be daunting for clinicians and applied scientists, we offer a concrete example of our computer adaptive simulation of the Dutch version of the BSS at the end of the paper.

17.
Background

Quality of life (QoL) questionnaires are desirable for clinical practice but can be time-consuming to administer and interpret, making their widespread adoption difficult.

Objective

Our aim was to assess the performance of the World Health Organization Quality of Life (WHOQOL)-100 questionnaire as four item banks to facilitate adaptive testing using simulated computer adaptive tests (CATs) for physical, psychological, social, and environmental QoL.

Methods

We used data from the UK WHOQOL-100 questionnaire (N=320) to calibrate item banks using item response theory, which included psychometric assessments of differential item functioning, local dependency, unidimensionality, and reliability. We simulated CATs to assess the number of items administered before prespecified levels of reliability were met.

Results

The item banks (40 items) all displayed good model fit (P>.01) and were unidimensional (fewer than 5% of t tests significant), reliable (Person Separation Index>.70), and free from differential item functioning (no significant analysis of variance interaction) or local dependency (residual correlations < +.20). When matched for reliability, the item banks were between 45% and 75% shorter than paper-based WHOQOL measures. Across the four domains, a high standard of reliability (alpha>.90) could be gained with a median of 9 items.

Conclusions

Using CAT, simulated assessments were as reliable as paper-based forms of the WHOQOL with a fraction of the number of items. These properties suggest that these item banks are suitable for computerized adaptive assessment. These item banks have the potential for international development using existing alternative language versions of the WHOQOL items.

18.
Psychometric properties of the v1.0 Patient-Reported Outcomes Measurement Information System (PROMIS®) sleep disturbance (27 items) and sleep-related impairment (SRI; 16 items) item banks, short forms derived from the item banks, and simulated computerised adaptive tests (CAT) were assessed in a representative sample of 1,006 adults from the Dutch general population. For sleep disturbance all items fitted the item response theory model. Four items showed differential item functioning (i.e., lack of measurement invariance) for age and two for language but the impact on scores (expressed as T-scores) was small. Reliable scores (r > 0.90) were found for 92.2%–96.3% of respondents with the full bank, short forms with six and eight items, and CAT, but for only 25.6% with the four-item short form. For SRI two items did not fit the item response theory model. Four items showed differential item functioning for language but the impact on T-scores was small. Reliable scores were found for 82.1% with the full bank, for 47.8%–69.5% with short forms and CAT. T-scores of 49.7 and 49.3 represent the average score of the Dutch general population for sleep disturbance and SRI, respectively. In conclusion, sufficient structural validity, reliability, and cross-cultural validity were found for the full banks but short forms of four items are not reliable enough for clinical practice. For SRI we recommend the full item bank if this is the primary outcome.
