Similar Documents
20 similar documents retrieved.
1.
The performance of two binary diagnostic tests is traditionally compared by their respective sensitivities and specificities. Other measures to describe the performance of a binary diagnostic test are likelihood ratios, defined as the ratio between the likelihood of a diagnostic test result in a group of diseased patients and the likelihood of the same result in a group of non-diseased patients. In this study, we propose a method, based on the log-transformation of the ratio of the likelihood ratios, to compare the likelihood ratios of two binary diagnostic tests in paired designs. We derive hypothesis tests to compare the likelihood ratios and carry out simulation experiments to study their power and type I error. We also derive a joint hypothesis test to compare the likelihood ratios simultaneously. The procedure is extended to the situation in which more than two binary diagnostic tests are applied to the same sample, and to the situation in which two diagnostic tests with multilevel results are compared.
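For readers who want to see the quantities involved, here is a minimal Python sketch (not the paper's method) that computes LR+ for two tests from hypothetical 2x2 counts and tests the log of their ratio with a naive delta-method standard error. The naive standard error ignores the covariance induced by the paired design, which the proposed method accounts for.

```python
import numpy as np
from scipy import stats

def positive_lr(tp, fn, fp, tn):
    """Positive likelihood ratio LR+ = sensitivity / (1 - specificity)."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return sens / (1.0 - spec)

def log_lr_var(tp, fn, fp, tn):
    """Delta-method variance of ln(LR+) for a single test."""
    n1, n0 = tp + fn, fp + tn          # diseased / non-diseased sample sizes
    sens, spec = tp / n1, tn / n0
    return (1 - sens) / (n1 * sens) + spec / (n0 * (1 - spec))

# Hypothetical 2x2 counts for two tests applied to the same patients.
test1 = dict(tp=80, fn=20, fp=15, tn=185)
test2 = dict(tp=70, fn=30, fp=5,  tn=195)

log_ratio = np.log(positive_lr(**test1) / positive_lr(**test2))
# Naive SE that treats the two tests as independent samples.
se = np.sqrt(log_lr_var(**test1) + log_lr_var(**test2))
z = log_ratio / se
print(f"ln(LR1+/LR2+) = {log_ratio:.3f}, z = {z:.2f}, p = {2 * stats.norm.sf(abs(z)):.3f}")
```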

2.
BACKGROUND: To select a proper diagnostic test, it is recommended that the most specific test be used to confirm (rule in) a diagnosis, and the most sensitive test be used to establish that a disease is unlikely (rule out). These rule-in and rule-out concepts can also be characterized by the likelihood ratio (LR). However, previous papers discussed only the case of binary tests and assumed that test results were already known. METHODS: The author proposes using the 'Kullback-Leibler distance' as a new measure of rule-in/out potential. The Kullback-Leibler distance is an abstract concept arising from statistics and information theory. The author shows that it integrates in a proper way two sources of information--the distribution of test outcomes and the LR function. The index predicts the fate of an average subject before testing. RESULTS: Analysis of real and hypothetical data demonstrates its applications beyond binary tests. It works even when the conventional methods of dichotomization and ROC curve analysis fail. CONCLUSIONS: The Kullback-Leibler distance nicely characterizes the before-test rule-in/out potentials. It offers a new perspective from which to evaluate a diagnostic test.
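A minimal sketch of the underlying quantity, under the assumption that the rule-in and rule-out potentials are summarized by the two directed Kullback-Leibler distances between the test-result distributions in diseased and non-diseased subjects; the distributions below are hypothetical.

```python
import numpy as np

def kl_divergence(p, q):
    """Kullback-Leibler distance D(p || q) = sum_i p_i * log(p_i / q_i)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0                      # terms with p_i = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Hypothetical distribution of an ordinal test result (3 categories)
# among diseased (p1) and non-diseased (p0) subjects.
p1 = [0.60, 0.30, 0.10]
p0 = [0.10, 0.30, 0.60]

# D(p1 || p0) summarizes how strongly an average diseased subject's result
# speaks against the non-diseased distribution (rule-in direction),
# and D(p0 || p1) the reverse (rule-out direction).
print("rule-in  D(p1||p0):", round(kl_divergence(p1, p0), 3))
print("rule-out D(p0||p1):", round(kl_divergence(p0, p1), 3))
```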

3.
Objective: To explain which measures of accuracy and which statistical methods should be used in studies to assess the value of a new binary test as a replacement test, an add-on test, or a triage test. Study Design and Setting: Selection and explanation of statistical methods, illustrated with examples. Results: Statistical methods for comparative diagnostic accuracy studies are described that take into account the purpose of the new diagnostic test. Methods are described within a framework that defines the major purpose of test comparison: assessing the value of a new test as a replacement test, an add-on test, or a triage test. Methods appropriate for both unpaired and paired study designs for binary test data are given, including regression modeling of diagnostic test accuracy. Implications for efficient study designs are also discussed. Conclusions: Appropriate selection of existing statistical methods is necessary to address research questions about the comparative accuracy of new tests.

4.
One of the most basic biostatistical problems is the comparison of two binary diagnostic tests. Commonly, one test will have greater sensitivity, and the other greater specificity. In this case, the choice of the optimal test generally requires a qualitative judgment as to whether gains in sensitivity are offset by losses in specificity. Here, we propose a simple decision analytic solution in which sensitivity and specificity are weighted by an intuitive parameter, the threshold probability of disease at which a patient will opt for treatment. This gives a net benefit that can be used to determine which of two diagnostic tests will give better clinical results at a given threshold probability and whether either is superior to the strategy of assuming that all or no patients have disease. We derive a simple formula for the relative diagnostic value, which is the difference in sensitivities of two tests divided by the difference in the specificities. We show that multiplying relative diagnostic value by the odds at the prevalence gives the odds of the threshold probability below which the more sensitive test is preferable and above which the more specific test should be chosen. The methodology is easily extended to incorporate combinations of tests and the risk or side effects of a test. Copyright © 2012 John Wiley & Sons, Ltd.
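The break-even rule quoted in the abstract is easy to check numerically. The sketch below, with hypothetical sensitivities, specificities, and prevalence, computes the relative diagnostic value, converts it to the threshold probability at which the two tests have equal net benefit, and evaluates the net benefits on either side of that threshold.

```python
def net_benefit(sens, spec, prevalence, p_t):
    """Net benefit of acting on a positive test at threshold probability p_t."""
    w = p_t / (1 - p_t)               # weight placed on false positives
    return sens * prevalence - (1 - spec) * (1 - prevalence) * w

# Hypothetical tests: A is more sensitive, B is more specific.
sens_a, spec_a = 0.90, 0.70
sens_b, spec_b = 0.75, 0.85
prev = 0.20

# Relative diagnostic value: difference in sensitivities over difference in specificities.
rdv = (sens_a - sens_b) / (spec_b - spec_a)
# Multiplying by the prevalence odds gives the odds of the break-even threshold.
threshold_odds = rdv * prev / (1 - prev)
p_threshold = threshold_odds / (1 + threshold_odds)
print(f"break-even threshold probability = {p_threshold:.3f}")

# Below that threshold the more sensitive test A has higher net benefit, above it B does.
for p_t in (0.10, 0.30):
    print(f"p_t={p_t}: NB_A={net_benefit(sens_a, spec_a, prev, p_t):.3f}, "
          f"NB_B={net_benefit(sens_b, spec_b, prev, p_t):.3f}")
```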

5.
Sensitivity and specificity are classic parameters to assess and compare the performance of binary diagnostic tests versus a gold standard in a population. Another useful parameter to assess and compare the performance of binary tests is the weighted kappa coefficient, which is defined as a measure of the beyond-chance agreement between the diagnostic test and the gold standard. In this study, we deduce the maximum likelihood estimators of the weighted kappa coefficients of multiple binary tests and we propose an asymptotic method to compare the weighted kappa coefficients of multiple binary tests with regard to the same gold standard when all of the diagnostic tests are applied to the same sample of patients. We have carried out simulation experiments to study the type I error and the power of the proposed method when comparing three binary tests. We have applied the results obtained to the diagnosis of coronary disease. Copyright © 2010 John Wiley & Sons, Ltd.
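The paper's weighted kappa incorporates disease prevalence and a weighting index; as a simpler baseline, the sketch below computes the ordinary (unweighted) Cohen kappa between one binary test and the gold standard from hypothetical 2x2 counts.

```python
def cohen_kappa_2x2(tp, fp, fn, tn):
    """Chance-corrected agreement between a binary test and the gold standard
    (unweighted Cohen kappa from the 2x2 table)."""
    n = tp + fp + fn + tn
    po = (tp + tn) / n                                     # observed agreement
    p_test_pos, p_dis = (tp + fp) / n, (tp + fn) / n
    pe = p_test_pos * p_dis + (1 - p_test_pos) * (1 - p_dis)  # chance agreement
    return (po - pe) / (1 - pe)

print(round(cohen_kappa_2x2(tp=80, fp=15, fn=20, tn=185), 3))
```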

6.
The goal in diagnostic medicine is often to estimate the diagnostic accuracy of multiple experimental tests relative to a gold standard reference. When a gold standard reference is not available, investigators commonly use an imperfect reference standard. This paper proposes methodology for estimating the diagnostic accuracy of multiple binary tests with an imperfect reference standard when information about the diagnostic accuracy of the imperfect test is available from external data sources. We propose alternative joint models for characterizing the dependence between the experimental tests and discuss the use of these models for estimating individual-test sensitivity and specificity as well as prevalence and multivariate post-test probabilities (predictive values). We show using analytical and simulation techniques that, as long as the sensitivity and specificity of the imperfect test are high, inferences on diagnostic accuracy are robust to misspecification of the joint model. The methodology is demonstrated with a study examining the diagnostic accuracy of various HIV-antibody tests for HIV. Published in 2008 by John Wiley & Sons, Ltd.

7.
Tests for equivalence or non-inferiority for paired binary data.
Assessment of therapeutic equivalence or non-inferiority between two medical diagnostic procedures often involves comparisons of the response rates between paired binary endpoints. The commonly used and accepted approach to assessing equivalence is to compare the asymptotic confidence interval on the difference of two response rates with some clinically meaningful equivalence limits. This paper investigates two asymptotic test statistics, a Wald-type (sample-based) test statistic and a restricted maximum likelihood estimation (RMLE-based) test statistic, to assess equivalence or non-inferiority based on paired binary endpoints. The sample size and power functions of the two tests are derived. The actual type I error and power of the two tests are computed by enumerating the exact probabilities in the rejection region. The results show that the RMLE-based test controls type I error better than the sample-based test. To establish equivalence between two treatments with a symmetric equivalence limit of 0.15, a minimal sample size of 120 is needed. The RMLE-based test without the continuity correction performs well at the boundary point 0. A numerical example illustrates the proposed procedures.
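A minimal sketch of the sample-based (Wald-type) statistic for paired non-inferiority, with hypothetical discordant counts and margin; the RMLE-based statistic favored by the paper restricts the nuisance parameters under the null and is not reproduced here.

```python
import numpy as np
from scipy import stats

def paired_noninferiority_wald(n11, n10, n01, n00, delta=0.15):
    """Sample-based (Wald-type) non-inferiority test for paired binary endpoints.

    H0: p_new - p_std <= -delta  vs  H1: p_new - p_std > -delta,
    where the discordant counts n10 and n01 drive the estimated difference.
    """
    n = n11 + n10 + n01 + n00
    d_hat = (n10 - n01) / n                          # estimated difference in response rates
    var_d = (n10 + n01 - (n10 - n01) ** 2 / n) / n ** 2
    z = (d_hat + delta) / np.sqrt(var_d)
    return d_hat, z, stats.norm.sf(z)                # one-sided p-value

# Hypothetical paired counts (rows: new procedure +/-, columns: standard +/-).
print(paired_noninferiority_wald(n11=60, n10=18, n01=12, n00=40))
```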

8.
Objective: To investigate and compare methods for correcting sensitivity and specificity when verification bias is present in two-stage binary diagnostic tests. Methods: A real example and simulated data were used to illustrate the biased estimates of sensitivity and specificity produced by the conventional method, and maximum likelihood estimation and Bayesian estimation were applied to correct sensitivity and specificity. Results: The sensitivity and specificity calculated by the conventional method were biased. Both the maximum likelihood and Bayesian methods corrected the estimation bias, but the intervals obtained with the Bayesian method were narrower than those from maximum likelihood. Conclusions: When correcting for verification bias in a diagnostic test, the Bayesian method yields more precise results if good prior information is available.

9.
We present methods for binomial regression when the outcome is determined using the results of a single diagnostic test with imperfect sensitivity and specificity. We present our model, illustrate it with the analysis of real data, and provide an example of WinBUGS program code for performing such an analysis. Conditional means priors are used in order to allow for inclusion of prior data and expert opinion in the estimation of odds ratios, probabilities, risk ratios, risk differences, and diagnostic test sensitivity and specificity. A simple method of obtaining Bayes factors for link selection is presented. Methods are illustrated and compared with Bayesian ordinary binary regression using data from a study of the effectiveness of a smoking cessation program among pregnant women. Regression coefficient estimates are shown to change noticeably when expert prior knowledge and imperfect sensitivity and specificity are incorporated into the model.
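The paper fits a Bayesian model with conditional means priors in WinBUGS; as a frequentist analogue of the same misclassification correction, the sketch below maximizes a logistic likelihood in which the observed-outcome probability is sens·p(x) + (1 - spec)·(1 - p(x)). The data, sensitivity, and specificity are simulated and assumed, not the study's.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

# Simulated data: covariate x and an error-prone binary outcome y_star observed
# through a test with assumed sensitivity and specificity.
rng = np.random.default_rng(1)
n, sens, spec = 400, 0.85, 0.90
x = rng.normal(size=n)
p_true = expit(-0.5 + 1.0 * x)                       # true event probability
y_true = rng.binomial(1, p_true)
y_star = np.where(y_true == 1, rng.binomial(1, sens, n), rng.binomial(1, 1 - spec, n))

def neg_loglik(beta):
    """Logistic regression likelihood corrected for outcome misclassification:
    P(y*=1 | x) = sens * p(x) + (1 - spec) * (1 - p(x))."""
    p = expit(beta[0] + beta[1] * x)
    p_obs = sens * p + (1 - spec) * (1 - p)
    return -np.sum(y_star * np.log(p_obs) + (1 - y_star) * np.log(1 - p_obs))

fit = minimize(neg_loglik, x0=np.zeros(2), method="BFGS")
print("corrected coefficient estimates:", fit.x.round(2))
```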

10.
We develop a simulation-based procedure for determining the required sample size in binomial regression risk assessment studies when response data are subject to misclassification. A Bayesian average power criterion is used to determine a sample size that provides high probability, averaged over the distribution of potential future data sets, of correctly establishing the direction of association between predictor variables and the probability of event occurrence. The method is broadly applicable to any parametric binomial regression model including, but not limited to, the popular logistic, probit, and complementary log-log models. We detail a common medical scenario wherein ascertainment of true disease status is impractical or otherwise impeded, and in its place the outcome of a single binary diagnostic test is used as a surrogate. These methods are then extended to the two diagnostic test setting. We illustrate the method with categorical covariates using one example that involves screening for human papillomavirus. This example coupled with results from simulated data highlights the utility of our Bayesian sample size procedure with error prone measurements. Copyright © 2008 John Wiley & Sons, Ltd.

11.
A sequential design is proposed to test whether the accuracy of a binary diagnostic biomarker meets the minimal level of acceptance. The accuracy of a binary diagnostic biomarker is a linear combination of the marker's sensitivity and specificity. The objective of the sequential method is to minimize the maximum expected sample size under the null hypothesis that the marker's accuracy is below the minimal level of acceptance. The exact results of two-stage designs based on Youden's index and efficiency indicate that the maximum expected sample sizes are smaller than the sample sizes of the fixed designs. Exact methods are also developed for estimation, confidence interval and p-value concerning the proposed accuracy index upon termination of the sequential testing. Published 2016. This article is a U.S. Government work and is in the public domain in the USA.

12.
A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box-Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial.

13.
In non-randomized clinical studies, the regression phenomenon can confound interpretation of the effectiveness of an intervention. The regression effect arises due to daily variation and/or misclassification of the biologic marker used in selection as well as in the assessment of the intervention effect. We consider a scenario in which the selection criterion for a subject's participation in the study is such that he/she must have a positive diagnostic test at screening. The disease status is then reassessed at the end of intervention. Thus, two repeated measurements of a binary disease outcome are available, with only selected subjects having a second measurement upon follow-up. We propose methods for estimating the change in event probability resulting from implementing the intervention while adjusting for the misclassification that produces the regression effect. We extend this approach to estimation of both the placebo and intervention effects in placebo-controlled studies designed with a misclassified binary outcome. Analyses of two biomedical studies are used for illustration.

14.
From the patients' management perspective, a good diagnostic test should contribute to both reflecting the true disease status and improving clinical outcomes. The diagnostic randomized clinical trial is designed to combine both diagnostic tests and therapeutic interventions. Evaluation of diagnostic tests is carried out with therapeutic outcomes as the primary endpoint rather than test accuracy. We lay out the probability framework for evaluating such trials. We compare two commonly used designs—the two-arm design and the paired design—in a formal statistical hypothesis testing setup and identify the causal connection between the two tests. The paired design is shown to be more efficient than the two-arm design. The efficiency gains vary depending on the discordant rates of test results. We derive sample size formulas for both binary and continuous endpoints. We derive estimators of important quantities under the paired design and also conduct simulation studies to verify the theoretical results. We illustrate the method with an example of designing a randomized study on preoperative staging of bladder cancer. Copyright © 2012 John Wiley & Sons, Ltd.

15.
For the meta-analysis of controlled clinical trials with a binary outcome, a test statistic for an overall treatment effect is proposed that is based on a refined estimator of the variance of the pooled treatment effect estimator usually used in the random-effects model of meta-analysis. Simulation studies show that the proposed test keeps the prescribed significance level much better than the tests commonly used in the fixed-effects and random-effects models, respectively. Moreover, when using the test it is not necessary to choose between the fixed-effects and random-effects approaches in advance. The proposed method applies in the same way to the analysis of a controlled multi-centre study with a binary outcome, including a possible interaction between drugs and centres.
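The description resembles the modified (Hartung-Knapp-type) variance estimator for the random-effects pooled estimate; under that assumption, here is a sketch with hypothetical trial-level log odds ratios and within-trial variances.

```python
import numpy as np
from scipy import stats

def refined_variance_test(theta, var):
    """Random-effects meta-analysis test with a refined (Hartung-Knapp-style)
    variance estimator; theta are per-trial effects (e.g. log odds ratios),
    var their within-trial variances."""
    theta, var = np.asarray(theta, float), np.asarray(var, float)
    k = len(theta)
    # DerSimonian-Laird estimate of the between-trial variance tau^2.
    w_fixed = 1 / var
    theta_fixed = np.sum(w_fixed * theta) / np.sum(w_fixed)
    q_stat = np.sum(w_fixed * (theta - theta_fixed) ** 2)
    c = np.sum(w_fixed) - np.sum(w_fixed ** 2) / np.sum(w_fixed)
    tau2 = max(0.0, (q_stat - (k - 1)) / c)
    # Random-effects weights and pooled estimate.
    w = 1 / (var + tau2)
    pooled = np.sum(w * theta) / np.sum(w)
    # Refined variance of the pooled estimate; compare t against a t(k-1) reference.
    q = np.sum(w * (theta - pooled) ** 2) / ((k - 1) * np.sum(w))
    t = pooled / np.sqrt(q)
    return pooled, t, 2 * stats.t.sf(abs(t), df=k - 1)

# Hypothetical log odds ratios and variances from five trials.
print(refined_variance_test([0.3, 0.1, 0.5, -0.1, 0.4], [0.04, 0.06, 0.05, 0.08, 0.03]))
```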

16.
Applications of latent class analysis in diagnostic test studies have assumed that all tests are measuring a common binary latent variable, the true disease status. In this article we describe a new approach that recognizes that tests based on different biological phenomena measure different latent variables, which in turn measure the latent true disease status. This allows for adjustment of conditional dependence between tests within disease categories. The model further allows for the inclusion of measured covariates and unmeasured random effects affecting test performance within latent classes. We describe a Bayesian approach for model estimation and describe a new posterior predictive check for evaluating candidate models. The methods are motivated and illustrated by results from a study of diagnostic tests for Chlamydia trachomatis. Published in 2008 by John Wiley & Sons, Ltd.

17.
Binocular data typically arise in ophthalmology where pairs of eyes are evaluated, through some diagnostic procedure, for the presence of certain diseases or pathologies. Treating eyes as independent and adopting the usual approach in estimating the sensitivity and specificity of a diagnostic test ignores the correlation between eyes. This may consequently yield incorrect estimates, especially of the standard errors. The paper proposes a likelihood-based method of accounting for the correlations between eyes and estimating sensitivity and specificity using a model for binocular or paired binary outcomes. Estimation of model parameters via maximum likelihood is outlined and approximate tests are provided. The efficiency of the estimates is assessed in a simulation study. An extension of the methodology to the case of several diagnostic tests, or the same test measured on several occasions, which arises in multi-reader studies, is given. A further extension to the case of multiple diseases is outlined as well. Data from a study on diabetic retinopathy are analysed to illustrate the methodology.

18.
A method is described for modeling the sensitivity, specificity, and positive and negative predictive values of a diagnostic test. To model sensitivity and specificity, the dependent variable (Y) is defined to be the dichotomous results of the screening test, and the presence or absence of disease, as defined by the "gold standard", is included as a binary explanatory variable (X1), along with variables used to define the subgroups of interest. The sensitivity of the screening test may then be estimated using logistic regression procedures. Modeled estimates of the specificity and predictive values of the screening test may be similarly derived. Using data from a population-based study of peripheral arterial disease, the authors demonstrated empirically that this method may be useful for obtaining smoothed estimates of sensitivity, specificity, and predictive values. As an extension of this method, an approach to the modeling of the relative sensitivity of two screening tests is described, using data from a study of screening procedures for colorectal disease as an example.
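A sketch of the basic idea with simulated data: the screening result is regressed on disease status, a subgroup covariate, and their interaction, and the fitted probabilities at d = 1 and d = 0 give the modeled sensitivity and false-positive rate (1 - specificity) for each subgroup. Variable names, effect sizes, and data are hypothetical, not those of the peripheral arterial disease study.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data: screening-test result y, true disease status d (the "gold standard"),
# and a binary subgroup covariate (e.g. age group).
rng = np.random.default_rng(0)
n = 500
d = rng.binomial(1, 0.3, n)
age = rng.binomial(1, 0.5, n)                        # 0 = younger, 1 = older
p_pos = np.where(d == 1, 0.75 + 0.10 * age,          # sensitivity varies with age
                 0.10 + 0.05 * age)                  # false-positive rate varies with age
y = rng.binomial(1, p_pos)
df = pd.DataFrame(dict(y=y, d=d, age=age))

# Logistic model for the screening result with disease status and subgroup terms.
fit = smf.logit("y ~ d * age", data=df).fit(disp=0)

# Modeled sensitivity = P(y=1 | d=1); modeled false-positive rate = P(y=1 | d=0).
grid = pd.DataFrame(dict(d=[1, 1, 0, 0], age=[0, 1, 0, 1]))
grid["fitted_prob"] = fit.predict(grid)
print(grid)
```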

19.
Objective: To evaluate the diagnostic value of an oxidized low-density lipoprotein (ox-LDL) ELISA kit for the diagnosis of coronary heart disease, using methods for evaluating diagnostic tests when the gold standard is an ordinal variable. Methods: A total of 1190 subjects were enrolled and classified into three states according to the gold-standard results. Starting from the definition of the area under the ROC curve and using R software, nonparametric estimates of the area under the ROC curve for ox-LDL were obtained with the ordinal gold standard. Results: For both the domestically produced and the Swiss ox-LDL kits, ox-LDL was able to discriminate between the different disease or health states of coronary heart disease; the differences from AUC = 0.5 were statistically significant (P < 0.001), and the discriminative ability increased as the separation between states increased. The accuracies of the two kits relative to the gold standard were 0.8797 and 0.8883, respectively, both statistically significant (P < 0.001). Conclusions: This study provides a methodological reference for similar studies.

20.
Effectively combining many classification instruments or diagnostic measurements to improve the classification accuracy of individuals is a common idea in disease diagnosis or classification. These ensemble-type diagnostic methods can be constructed with respect to different kinds of performance criteria. Among them, the receiver operating characteristic (ROC) curve is the most popular criterion, which, together with some indexes derived from it, is commonly used to evaluate and summarize the performance of a classification instrument, such as a biomarker or a classifier. However, the usefulness of the ROC curve and its related indexes relies on the existence of a binary label for each individual subject. In many disease diagnosis situations, such a binary variable may not exist, and only a continuous measurement of the true disease status is available. This true disease status is often referred to as the 'gold standard'. The modified area under the ROC curve (AUC)-type measure defined by Obuchowski is a method proposed to accommodate such a situation. However, there is still no method for finding the optimal combination of diagnostic measurements, with respect to such an index, that has better diagnostic power than each individual measurement. In this paper, we propose an algorithm for finding the optimal combination with respect to such an extended AUC-type measure so that the combined measurement can have more diagnostic power. We illustrate the performance of our algorithm using synthesized data and a diabetes data set. Copyright © 2012 John Wiley & Sons, Ltd.
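A sketch in the spirit of the extended AUC-type measure: for every pair of subjects whose gold-standard values differ, score 1 if the marker orders them the same way and 1/2 for ties, then average. Obuchowski's exact weighting scheme is not reproduced here, and the marker values and gold-standard states are hypothetical.

```python
import numpy as np
from itertools import combinations

def pairwise_concordance(marker, truth):
    """Nonparametric AUC-type concordance for a continuous/ordinal gold standard:
    the probability that, for a random pair of subjects with different true status,
    the marker orders them the same way (ties in the marker count 1/2)."""
    marker, truth = np.asarray(marker, float), np.asarray(truth, float)
    num, den = 0.0, 0
    for i, j in combinations(range(len(marker)), 2):
        if truth[i] == truth[j]:
            continue                                 # only pairs that differ in truth
        den += 1
        if marker[i] == marker[j]:
            num += 0.5
        elif (marker[i] - marker[j]) * (truth[i] - truth[j]) > 0:
            num += 1.0
    return num / den

# Hypothetical marker values and an ordinal gold standard with three states.
marker = [0.2, 0.5, 0.4, 0.9, 0.7, 0.3, 0.8]
truth  = [0,   1,   0,   2,   2,   1,   1]
print(round(pairwise_concordance(marker, truth), 3))
```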
