Similar literature
Found 20 similar articles (search time: 62 ms)
1.
Receiver operating characteristic (ROC) curves can be used to assess the accuracy of tests measured on ordinal or continuous scales. The most commonly used measure for the overall diagnostic accuracy of diagnostic tests is the area under the ROC curve (AUC). A gold standard (GS) test on the true disease status is required to estimate the AUC. However, a GS test may sometimes be too expensive or infeasible. Therefore, in many medical research studies, the true disease status of the subjects may remain unknown. Under the normality assumption on test results from each disease group of subjects, using the expectation-maximization (EM) algorithm in conjunction with a bootstrap method, we propose a maximum likelihood-based procedure for the construction of confidence intervals for the difference in paired AUCs in the absence of a GS test. Simulation results show that the proposed interval estimation procedure yields satisfactory coverage probabilities and interval lengths. The proposed method is illustrated with two examples. Copyright © 2009 John Wiley & Sons, Ltd.
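
As a rough illustration of the quantity being estimated, the sketch below computes the AUC in two ways: the closed form that holds when test results are normally distributed in each disease group, and the empirical (Mann-Whitney) estimate when disease status is known. It is not the EM-plus-bootstrap procedure of the abstract. The function names are hypothetical, the group means and standard deviations are simply passed in as inputs, and larger test values are assumed to indicate disease.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def binormal_auc(mean_nondiseased, sd_nondiseased, mean_diseased, sd_diseased):
    """AUC when test results are normal in each disease group.

    Assumption: larger test values indicate disease.
    """
    return normal_cdf(
        (mean_diseased - mean_nondiseased)
        / sqrt(sd_nondiseased ** 2 + sd_diseased ** 2)
    )


def empirical_auc(nondiseased, diseased):
    """Mann-Whitney estimate of the AUC; requires known disease status.

    Each (diseased, nondiseased) pair scores 1 if the diseased value is
    larger, 0.5 for a tie, and 0 otherwise.
    """
    score = 0.0
    for x in diseased:
        for y in nondiseased:
            if x > y:
                score += 1.0
            elif x == y:
                score += 0.5
    return score / (len(diseased) * len(nondiseased))
```

A difference in paired AUCs would then be the difference between two such values computed for two tests applied to the same subjects.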

2.
Objectives: Studies to evaluate clinical screening tests often face the problem that the "gold standard" diagnostic approach is costly and/or invasive. It is therefore common to verify only a subset of negative screening tests using the gold standard method. However, undersampling the screen negatives can lead to substantial overestimation of the sensitivity and underestimation of the specificity of the diagnostic test. Our objective was to develop a simple and accurate statistical method to address this "verification bias." Study Design and Setting: We developed a weighted generalized estimating equation approach to estimate, in a single model, the accuracy (e.g., sensitivity/specificity) of multiple assays and simultaneously compare results between assays while addressing verification bias. This approach can be implemented using standard statistical software. Simulations were conducted to assess the proposed method. An example is provided using a cervical cancer screening trial that compared the accuracy of human papillomavirus and Pap tests, with histologic data as the gold standard. Results: The proposed approach performed well in estimating and comparing the accuracy of multiple assays in the presence of verification bias. Conclusion: The proposed approach is an easy-to-apply and accurate method for addressing verification bias in studies of multiple screening methods.
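
The direction of the bias described above can be seen with a simple inverse-probability-weighting sketch. This is a simpler stand-in for the weighted generalized estimating equation model of the abstract, not that model itself. The record layout and function names are hypothetical, and verification is assumed to depend only on the screening result.

```python
def naive_accuracy(records):
    """Sensitivity and specificity from verified subjects only (biased).

    records: iterable of (test_positive, verified, diseased) tuples;
    diseased may be None when the subject was not verified.
    """
    tp = fn = tn = fp = 0
    for test_positive, verified, diseased in records:
        if not verified:
            continue
        if diseased and test_positive:
            tp += 1
        elif diseased:
            fn += 1
        elif test_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def ipw_accuracy(records):
    """Sensitivity and specificity with inverse-probability weights.

    Each verified subject is weighted by 1 / P(verified | test result),
    estimated as (number screened) / (number verified) per test result.
    """
    records = list(records)
    screened = {True: 0, False: 0}
    verified_count = {True: 0, False: 0}
    for test_positive, verified, _ in records:
        screened[test_positive] += 1
        if verified:
            verified_count[test_positive] += 1
    weight = {
        result: screened[result] / verified_count[result]
        for result in (True, False)
        if verified_count[result] > 0
    }
    tp = fn = tn = fp = 0.0
    for test_positive, verified, diseased in records:
        if not verified:
            continue
        w = weight[test_positive]
        if diseased and test_positive:
            tp += w
        elif diseased:
            fn += w
        elif test_positive:
            fp += w
        else:
            tn += w
    return tp / (tp + fn), tn / (tn + fp)
```

With hypothetical counts of 100 screen positives (all verified, 80 diseased) and 900 screen negatives (90 verified, 2 diseased), the naive estimates are about 0.976 for sensitivity and 0.815 for specificity, while the weighted estimates are 0.80 and about 0.978.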

3.
Clinical studies of predictive diagnostic tests consider the evaluation of a single test and the comparison of two tests regarding their predictive accuracy of disease status. The positive predictive value (PPV) curve is used for assessing the probability of predicting the disease given a positive test result. The sequential property of one PPV curve has been studied. However, in later stages of diagnostic test development, it is more interesting to compare the predictive accuracy of two tests. In this article, we propose a group sequential test for the comparison of PPV curves for paired designs when both diagnostic tests are applied to the same subject. We first derive asymptotic properties of the sequential differences of two correlated empirical PPV curves under the common case-control sampling. We then apply these results to develop a group sequential test procedure. The asymptotic results are also critical for deriving both the optimal sample size ratio and minimal required sample sizes for the proposed procedure. Our simulation studies show that the proposed sequential testing maintains the nominal type I error rate in finite samples. The proposed design is illustrated in a hypothetical lung cancer predictive trial and in a cancer diagnostic trial.

4.
In this paper, we develop methods to combine multiple biomarker trajectories into a composite diagnostic marker using functional data analysis (FDA) to achieve better diagnostic accuracy in monitoring disease recurrence in the setting of a prospective cohort study. In such studies, the disease status is usually verified only for patients with a positive test result in any biomarker and is missing in patients with negative test results in all biomarkers. Thus, the test result will affect disease verification, which leads to verification bias if the analysis is restricted only to the verified cases. We treat verification bias as a missing data problem. Under both missing at random (MAR) and missing not at random (MNAR) assumptions, we derive the optimal classification rules using the Neyman-Pearson lemma based on the composite diagnostic marker. We estimate thresholds adjusted for verification bias to dichotomize patients as test positive or test negative, and we evaluate the diagnostic accuracy using the verification bias corrected area under the ROC curves (AUCs). We evaluate the performance and robustness of the FDA combination approach and assess the consistency of the approach through simulation studies. In addition, we perform a sensitivity analysis of the dependency between the verification process and disease status for the approach under the MNAR assumption. We apply the proposed method to data from the Religious Orders Study and from a non-small cell lung cancer trial.

5.
ROC curves and summary measures of accuracy derived from them, such as the area under the ROC curve, have become the standard for describing and comparing the accuracy of diagnostic tests. Methods for estimating ROC curves rely on the existence of a gold standard which dichotomizes patients into disease present or absent. There are, however, many examples of diagnostic tests whose gold standards are not binary-scale, but rather continuous-scale. Unnatural dichotomization of these gold standards leads to bias and inconsistency in estimates of diagnostic accuracy. In this paper, we propose a non-parametric estimator of diagnostic test accuracy which does not require dichotomization of the gold standard. This estimator has an interpretation analogous to the area under the ROC curve. We propose a confidence interval for test accuracy and a statistical test for comparing accuracies of tests from paired designs. We compare the performance (i.e. CI coverage, type I error rate, power) of the proposed methods with several alternatives. An example is presented where the accuracies of two quick blood tests for measuring serum iron concentrations are estimated and compared.

6.
The goal in diagnostic medicine is often to estimate the diagnostic accuracy of multiple experimental tests relative to a gold standard reference. When a gold standard reference is not available, investigators commonly use an imperfect reference standard. This paper proposes methodology for estimating the diagnostic accuracy of multiple binary tests with an imperfect reference standard when information about the diagnostic accuracy of the imperfect test is available from external data sources. We propose alternative joint models for characterizing the dependence between the experimental tests and discuss the use of these models for estimating individual-test sensitivity and specificity as well as prevalence and multivariate post-test probabilities (predictive values). We show using analytical and simulation techniques that, as long as the sensitivity and specificity of the imperfect test are high, inferences on diagnostic accuracy are robust to misspecification of the joint model. The methodology is demonstrated with a study examining the diagnostic accuracy of various HIV-antibody tests for HIV. Published in 2008 by John Wiley & Sons, Ltd.

7.
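Objective: To evaluate the diagnostic value of an oxidized low-density lipoprotein ELISA kit in the diagnosis of coronary heart disease, using an evaluation method for diagnostic tests whose gold standard is an ordinal variable. Methods: A total of 1190 subjects were enrolled and divided into three states according to the gold standard results. Starting from the definition of the area under the ROC curve, a nonparametric estimate of the area under the ROC curve for oxidized low-density lipoprotein with an ordinal gold standard was obtained using R software. Results: For both the Chinese-made and the Swiss-made oxidized low-density lipoprotein kits, the results indicate that oxidized low-density lipoprotein can discriminate between the different disease or health states of coronary heart disease. The differences from AUC = 0.5 were statistically significant (P<0.001), and the discriminating ability increased as the gap between states widened. The accuracies of the two kits relative to the gold standard were 0.8797 and 0.8883, respectively, both statistically significant (P<0.001). Conclusion: This study provides a methodological reference for similar studies.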

8.
Li CR, Liao CT, Liu JP. Statistics in Medicine 2008, 27(10): 1762-1776
Non-inferiority is a reasonable approach to assessing the diagnostic accuracy of a new diagnostic test if it provides an easier administration or reduces the cost. The area under the receiver operating characteristic (ROC) curve is one of the common measures for the overall diagnostic accuracy. However, it may not differentiate the various shapes of the ROC curves with different diagnostic significances. The partial area under the ROC curve (PAUROC) may present an alternative that can provide additional and complementary information for some diagnostic tests which require a false-positive rate that does not exceed a certain level. Non-parametric and maximum likelihood methods can be used for the non-inferiority tests based on the difference in paired PAUROCs. However, their performance has not been investigated in finite samples. We propose to use the concept of generalized p-value to construct a non-inferiority test for diagnostic accuracy based on the difference in paired PAUROCs. Simulation results show that the proposed non-inferiority test not only adequately controls the size at the nominal level but also is uniformly more powerful than the non-parametric methods. The proposed method is illustrated with a numerical example using published data.
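
To make the PAUROC concrete, the sketch below computes an empirical partial area by trapezoidal integration of the ROC curve over false-positive rates from 0 up to a chosen ceiling. It covers only the summary measure, not the generalized p-value test proposed in the abstract. The function names and the `max_fpr` parameter are hypothetical, and a subject is assumed to be called positive when the test value is at or above the cutoff.

```python
def roc_points(nondiseased, diseased):
    """Empirical ROC points (FPR, TPR), ordered from (0, 0) to (1, 1)."""
    cutoffs = sorted(set(nondiseased) | set(diseased), reverse=True)
    points = [(0.0, 0.0)]
    for cutoff in cutoffs:
        fpr = sum(y >= cutoff for y in nondiseased) / len(nondiseased)
        tpr = sum(x >= cutoff for x in diseased) / len(diseased)
        points.append((fpr, tpr))
    return points


def partial_auc(nondiseased, diseased, max_fpr):
    """Area under the empirical ROC curve for FPR in [0, max_fpr]."""
    points = roc_points(nondiseased, diseased)
    area = 0.0
    for (fpr0, tpr0), (fpr1, tpr1) in zip(points, points[1:]):
        if fpr0 >= max_fpr:
            break
        if fpr1 > max_fpr:
            # Cut the segment at max_fpr by linear interpolation.
            tpr1 = tpr0 + (tpr1 - tpr0) * (max_fpr - fpr0) / (fpr1 - fpr0)
            fpr1 = max_fpr
        area += (fpr1 - fpr0) * (tpr0 + tpr1) / 2.0
    return area
```

Setting `max_fpr` to 1.0 returns the full trapezoidal AUC, so the partial and full areas can be compared with the same function.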

9.
Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine, machine learning and credit scoring. The receiver operating characteristic (ROC) curve is a useful tool to assess the ability of a diagnostic test to discriminate between two classes or groups. In practice, multiple diagnostic tests or biomarkers are combined to improve diagnostic accuracy. Often, biomarker measurements are undetectable either below or above the so-called limits of detection (LoD). In this paper, nonparametric predictive inference (NPI) for the best linear combination of two or more biomarkers subject to limits of detection is presented. NPI is a frequentist statistical method that is explicitly aimed at using few modelling assumptions, enabled through the use of lower and upper probabilities to quantify uncertainty. The NPI lower and upper bounds for the ROC curve subject to limits of detection are derived, where the objective function to maximize is the area under the ROC curve. In addition, the paper discusses the effect of restrictions on the linear combination's coefficients on the analysis. Examples are provided to illustrate the proposed method. Copyright © 2017 John Wiley & Sons, Ltd.

10.
When multiple imperfect dichotomous diagnostic tests are applied to an individual, it is possible that some or all of their results remain dependent even after conditioning on the true disease status. The estimates could be biased if this conditional dependence is ignored when using the test results to infer about the prevalence of a disease or the accuracies of the diagnostic tests. However, statistical methods correcting for this bias by modelling higher-order conditional dependence terms between multiple diagnostic tests are not well addressed in the literature. This paper extends a Bayesian fixed effects model for 2 diagnostic tests with pairwise correlation to cases with 3 or more diagnostic tests with higher order correlations. Simulation results show that the proposed fixed effects model works well both in the case when the tests are highly correlated and in the case when the tests are truly conditionally independent, provided adequate external information is available in the form of fixed constraints or prior distributions. A data set on the diagnosis of childhood pulmonary tuberculosis is used to illustrate the proposed model.

11.
In many medical applications, combining information from multiple biomarkers could yield a better diagnosis than any single one on its own. When there is no gold standard, an algorithm for classifying subjects into case and non-case status is necessary for combining multiple markers. The aim of this paper is to develop a method to construct a composite test from multiple applicable tests and derive an optimal classification rule in the absence of a gold standard. Rather than combining the tests, we treat the tests as a sequence. This sequential composite test is based on a mixture of two multivariate normal latent models for the distribution of the test results in case and non-case groups, and the optimal classification rule is derived to return the greatest sensitivity at a given specificity. This method is applied to a real-data example, and simulation studies have been carried out to assess the statistical properties and predictive accuracy of the proposed composite test. The method can also be implemented nonparametrically. Copyright © 2015 John Wiley & Sons, Ltd.

12.
The area under the receiver operating characteristic (ROC) curve (AUC) is used as a performance metric for quantitative tests. Although multiple biomarkers may be available for diagnostic or screening purposes, diagnostic accuracy is often assessed individually rather than in combination. In this paper, we consider the interesting problem of combining multiple biomarkers for use in a single diagnostic criterion with the goal of improving the diagnostic accuracy above that of an individual biomarker. The diagnostic criterion created from multiple biomarkers is based on the predictive probability of disease, conditional on given multiple biomarker outcomes. If the computed predictive probability exceeds a specified cutoff, the corresponding subject is allocated as 'diseased'. This defines a standard diagnostic criterion that has its own ROC curve, namely, the combined ROC (cROC). The AUC metric for cROC, namely, the combined AUC (cAUC), is used to compare the predictive criterion based on multiple biomarkers to one based on fewer biomarkers. A multivariate random-effects model is proposed for modeling multiple normally distributed dependent scores. Bayesian methods for estimating ROC curves and corresponding (marginal) AUCs are developed when a perfect reference standard is not available. In addition, cAUCs are computed to compare the accuracy of different combinations of biomarkers for diagnosis. The methods are evaluated using simulations and are applied to data for Johne's disease (paratuberculosis) in cattle. Copyright © 2015 John Wiley & Sons, Ltd.

13.
We advocate that medical diagnostic tests should be evaluated at the subunit level instead of the patient level if a disease can occur in multiple parts/units within a patient, for example, vessels, segments, ears, eyes, etc. When a non-invasive test is compared to an invasive gold standard test, often not all of the subunits receive the gold standard test, and verification bias is present if the subunits without the gold standard test are discarded. Here we address estimation and inference issues in assessing the performance of medical diagnostic tests at the subunit level while accounting for verification bias and the correlation among subunits. We present a weighted least squares approach and demonstrate how the method can be implemented by using the procedure PROC CATMOD from the popular SAS software. A cardiology example is presented and we discuss application of the method to the case of multiple tests and a single gold standard test.

14.
In biomedical studies, there are multiple sources of information available, of which only a small number are associated with the disease. It is important to select and combine the factors that are associated with the disease in order to predict the disease status of a new subject. The receiver operating characteristic (ROC) technique has been widely used in disease classification, and the classification accuracy can be measured with the area under the ROC curve (AUC). In this article, we combine recent variable selection methods with AUC methods to optimize the diagnostic accuracy of multiple risk factors. We first describe one new and some recent AUC-based methods for effectively combining multiple risk factors for disease classification. We then apply them to analyze the data from a new clinical study, investigating whether a combination of traditional Chinese medicine symptoms and standard Western medicine risk factors can increase discriminative accuracy in diagnosing osteoporosis (OP). Based on the results, we conclude that we can make a better diagnosis of primary OP by combining traditional Chinese medicine symptoms with Western medicine risk factors.

15.
The diagnosis of Schistosoma haematobium disease on the basis of history and physical examination alone is often difficult. Tests have thus been developed to allow an early and more accurate diagnosis. However, these tests have substantial imperfections, and many different results obtained from these tests must be integrated into a diagnostic conclusion about the probability of disease in a given patient. Also, the accuracy of these tests in detecting S. haematobium disease is critically dependent not only on their sensitivity and specificity but also on the prevalence or pretest likelihood of disease in the population under study. The diagnostic accuracy of haematuria, proteinuria and the combined criteria tests in detecting S. haematobium eggs in schoolchildren is evaluated by calculating the sensitivity and specificity and also by the use of Bayes' theorem of conditional probability. The graphic relation between the predictive value of a given test result and the pretest risk of disease in the test subjects was obtained for each of these tests. This method reveals that the prevalence of S. haematobium disease is an important determinant of the predictive value of any test result in the individual patient.
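
The dependence of predictive values on prevalence follows directly from Bayes' theorem and can be sketched in a few lines. The function name is hypothetical, and the sensitivity, specificity, and prevalence values mentioned afterwards are illustrative rather than taken from the study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values from Bayes' theorem.

    PPV = P(disease | positive test), NPV = P(no disease | negative test).
    """
    p_positive = (
        sensitivity * prevalence + (1.0 - specificity) * (1.0 - prevalence)
    )
    p_negative = (
        (1.0 - sensitivity) * prevalence + specificity * (1.0 - prevalence)
    )
    ppv = sensitivity * prevalence / p_positive
    npv = specificity * (1.0 - prevalence) / p_negative
    return ppv, npv
```

For a test with sensitivity and specificity both equal to 0.9, the positive predictive value is 0.9 at a prevalence of 0.5 but only 0.5 at a prevalence of 0.1. Evaluating the function over a grid of prevalence values gives the kind of curve the abstract describes.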

16.
Composite reference standards (CRSs) have been advocated in diagnostic accuracy studies in the absence of a perfect reference standard. The rationale is that combining results of multiple imperfect tests leads to a more accurate reference than any one test in isolation. Focusing on a CRS that classifies subjects as disease positive if at least one component test is positive, we derive algebraic expressions for sensitivity and specificity of this CRS, sensitivity and specificity of a new (index) test compared with this CRS, as well as the CRS-based prevalence. We use as a motivating example the problem of evaluating a new test for Chlamydia trachomatis, an asymptomatic disease for which no gold-standard test exists. As the number of component tests increases, sensitivity of this CRS increases at the expense of specificity, unless all tests have perfect specificity. Therefore, such a CRS can lead to significantly biased accuracy estimates of the index test. The bias depends on disease prevalence and accuracy of the CRS. Further, conditional dependence between the CRS and index test can lead to over-estimation of index test accuracy estimates. This commonly used CRS combines results from multiple imperfect tests in a way that ignores information and therefore is not guaranteed to improve over a single imperfect reference unless each component test has perfect specificity, and the CRS is conditionally independent of the index test. When these conditions are not met, as in the case of C. trachomatis testing, more realistic statistical models should be researched instead of relying on such CRSs. Copyright © 2015 John Wiley & Sons, Ltd.
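
The trade-off between CRS sensitivity and specificity can be illustrated under one simplifying assumption that the abstract does not necessarily make: the component tests are conditionally independent given true disease status. In that case an "any positive" CRS misses a diseased subject only if every component misses, and clears a non-diseased subject only if every component is negative. The function name and the accuracy values in the follow-up note are hypothetical.

```python
from math import prod


def any_positive_crs_accuracy(sensitivities, specificities):
    """Sensitivity and specificity of an 'any positive' composite reference.

    Assumption: component tests are conditionally independent given the
    true disease status.
    """
    crs_sensitivity = 1.0 - prod(1.0 - se for se in sensitivities)
    crs_specificity = prod(specificities)
    return crs_sensitivity, crs_specificity
```

With two components of sensitivity 0.8 and 0.7 and specificity 0.95 and 0.9, the CRS has sensitivity 0.94 and specificity 0.855. Adding a further imperfect component raises the first figure and lowers the second, unless that component has perfect specificity.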

17.
Two common problems in assessing the accuracy of traditional Chinese medicine (TCM) doctors in detecting a particular symptom are the unknown true symptom status and the ordinal scale of the symptom status. Wang et al. (Biostatistics 2011; DOI: 10.1093/biostatistics/kxq075) proposed a nonparametric maximum likelihood method for estimating the accuracy of different TCM doctors without a gold standard when the true symptom status is measured on an ordinal scale. A key assumption of their work is that the diagnosis results are independent conditional on the gold standard. This assumption can be violated in many practical situations. In this paper, we propose a random effects modeling approach that extends their method to incorporate dependence structure among different tests or doctors. The proposed method is illustrated on a real data set from TCM, which contains the diagnostic results from five doctors for the same patients regarding symptoms related to Chills disease. The same data set was analyzed by Wang et al. under the conditional independence assumption. In addition, we also discuss an ad hoc test for the model fitting and a likelihood ratio test on the random effects.

18.
Objective: Most clinical research evaluates diagnostic tests by accuracy measures such as sensitivity and specificity. This is stimulated by the Standards for Reporting of Diagnostic Accuracy initiative, which focuses on accuracy in the widely accepted guidelines. Referring to the clinical consequences of diagnostic tests, many epidemiologists recognize the importance of patient outcome studies in addition to accuracy studies. However, there is a theoretical argument that stipulates the need for patient outcome studies, which has thus far been overlooked. Results: Using a philosophical argument, we show that the definition of disease necessarily involves the concept of function, which in turn is inextricably related to outcome. Consequently, diagnostic tests that establish the presence or absence of disease cannot be evaluated without measuring patient outcome. Patient outcome studies are therefore the definitive means to assess the merits of a diagnostic test. Conclusion: The need for patient outcome studies is not due to pragmatic reasons, as previous authors argued, but is based on a philosophical argument relating the definition of disease to the concept of function. We propose that authors justify the use of accuracy studies in papers reporting diagnostic test evaluation by describing the association between gold standard test and patient outcome.

19.
In diagnostic studies without a gold standard, the assumption on the dependence structure of the multiple tests or raters plays an important role in model performance. In case of binary disease status, both conditional independence and crossed random effects structure have been proposed and their performance investigated. Less attention has been paid to the situation where the true disease status is ordinal. In this paper, we propose crossed subject-specific and rater-specific random effects to account for the dependence structure and assess the robustness of the proposed model to misspecification in the random effects distributions. We applied the models to data from the Physician Reliability Study, which focuses on assessing the diagnostic accuracy in a population of raters for the staging of endometriosis, a gynecological disorder in women. Using this new methodology, we estimate the probability of a correct classification and show that regional experts can more easily classify the intermediate stage than resident physicians. Copyright © 2013 John Wiley & Sons, Ltd.

20.
In practice, multiple biomarkers are usually measured on the same subject for disease diagnosis. Combining these biomarkers into a single score could improve diagnostic accuracy. Many researchers have addressed the problem of finding the optimal linear combination based on maximizing the area under the ROC curve (AUC). However, such a combined score may have less-than-optimal properties at the diagnostic threshold. In this paper, we propose the idea of using the Youden index as an objective function for searching for the optimal linear combination. The combined score directly achieves the maximum overall correct classification rate at the diagnostic threshold corresponding to the Youden index; in other words, it is the optimal linear combination score for making the disease diagnosis. We present both empirical and numerical searching methods for the optimal linear combination. We carry out an extensive simulation study to investigate the performance of the proposed methods. Additionally, we empirically compare the optimal overall classification rates between the proposed combination based on the Youden index and the traditional one based on AUC and demonstrate a significant gain in diagnostic accuracy for the proposed combination. In the end, we apply the proposed methods to a real data set. Copyright © 2013 John Wiley & Sons, Ltd.
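
For reference, the Youden index of a single score is the maximum over cutoffs of sensitivity plus specificity minus one. The sketch below computes its empirical value and the corresponding cutoff for one score. It does not search over linear combinations as the abstract does. The function name is hypothetical, and a subject is assumed to be called positive when the score is at or above the cutoff.

```python
def youden_index(nondiseased, diseased):
    """Empirical Youden index J and a cutoff at which it is attained.

    J = max over cutoffs of (sensitivity + specificity - 1). When several
    cutoffs attain the maximum, the smallest one is returned.
    """
    best_j, best_cutoff = -1.0, None
    for cutoff in sorted(set(nondiseased) | set(diseased)):
        sensitivity = sum(x >= cutoff for x in diseased) / len(diseased)
        specificity = sum(y < cutoff for y in nondiseased) / len(nondiseased)
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_j, best_cutoff = j, cutoff
    return best_j, best_cutoff
```

A search for the best linear combination could, under these assumptions, evaluate this function on the combined score for each candidate coefficient vector and keep the vector with the largest J.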
