Similar articles
20 similar articles retrieved (search time: 15 ms)
1.
In the assessment of the accuracy of diagnostic tests for infectious diseases, the true disease status of the subjects is often unknown due to the lack of a gold standard test. Latent class models with two latent classes, representing diseased and non-diseased subjects, are often used to analyze this type of data. In its basic form, latent class analysis requires the observed outcomes to be statistically independent conditional on the disease status. In most diagnostic settings, this assumption is highly questionable. During the last decade, several methods have been proposed to estimate latent class models with conditional dependence between the test results. A class of flexible fixed- and random-effects models was described by Dendukuri and Joseph in a Bayesian framework. We illustrate these models using the analysis of a diagnostic study of three field tests and an imperfect reference test for the diagnosis of visceral leishmaniasis. We show that, as observed earlier by Albert and Dodd, different dependence models may fit the data similarly well while leading to different inferences. Given this problem, selection of appropriate latent class models should be based on substantive subject matter knowledge. If several clinically plausible models are supported by the data, a sensitivity analysis should be performed by describing the results obtained from different models and using different priors. Copyright © 2008 John Wiley & Sons, Ltd.
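For reference, the basic two-class model under conditional independence that these dependence models extend can be written as follows, where π is the prevalence and Se_t, Sp_t are the sensitivity and specificity of test t (standard notation, not taken from the paper itself):

$$P(Y_1=y_1,\dots,Y_T=y_T)=\pi\prod_{t=1}^{T}Se_t^{\,y_t}(1-Se_t)^{1-y_t}+(1-\pi)\prod_{t=1}^{T}(1-Sp_t)^{y_t}Sp_t^{\,1-y_t}$$

The conditional-dependence extensions modify the within-class products, for example by adding pairwise covariance terms (fixed effects) or subject-level random effects.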

2.
Evaluating the accuracy (i.e., estimating the sensitivity and specificity) of new diagnostic tests in the absence of a gold standard is of practical importance and has been the subject of intensive study for several decades. Existing methods use two or more diagnostic tests under several basic assumptions and then estimate the accuracy parameters via maximum likelihood estimation. One of the basic assumptions is the conditional independence of the tests given the disease status. This assumption is unrealistic in many real applications in veterinary research. Several methods have been proposed with various dependence models to relax this assumption. However, these methods impose subjective dependence structures, which may not be practical and may introduce additional nuisance parameters. In this article, we propose a simple method for addressing this problem without the conditional independence assumption, using an empirical conditioning approach. The proposed method reduces to the popular Hui‐Walter model in the case of conditional independence. Moreover, our likelihood function is a polynomial of order 2 in the parameters, whereas that of the Hui‐Walter model is of order 3. The reduced model complexity increases the stability of estimation. Simulation studies are conducted to evaluate the performance of the proposed method, which shows overall smaller biases in estimation and is more stable than the existing method, especially when tests are conditionally dependent. Two real data examples are used to illustrate the proposed method.
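As context for the Hui‐Walter model referred to here, the following is a minimal maximum-likelihood sketch of the classical two-test, two-population setup under conditional independence; the counts and starting values are invented for illustration, and this is not the authors' empirical-conditioning method.

import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # inverse logit keeps probabilities in (0, 1)

# Hypothetical counts of joint test patterns (1,1), (1,0), (0,1), (0,0)
# for two tests in two populations with different prevalences.
counts = np.array([[45, 10,  8, 137],   # population 1
                   [28, 12, 15, 245]])  # population 2

def neg_loglik(theta):
    pi1, pi2, se1, se2, sp1, sp2 = expit(theta)  # prevalences, sensitivities, specificities
    nll = 0.0
    for k, pi in enumerate((pi1, pi2)):
        for j, (y1, y2) in enumerate([(1, 1), (1, 0), (0, 1), (0, 0)]):
            # mixture over the latent disease status, tests independent within class
            p = (pi * se1**y1 * (1 - se1)**(1 - y1) * se2**y2 * (1 - se2)**(1 - y2)
                 + (1 - pi) * (1 - sp1)**y1 * sp1**(1 - y1) * (1 - sp2)**y2 * sp2**(1 - y2))
            nll -= counts[k, j] * np.log(p)
    return nll

fit = minimize(neg_loglik, x0=np.array([-1.0, -1.0, 1.0, 1.0, 1.0, 1.0]), method="Nelder-Mead")
print(dict(zip(["pi1", "pi2", "Se1", "Se2", "Sp1", "Sp2"], np.round(expit(fit.x), 3))))

Because the likelihood is invariant to swapping the "diseased" and "non-diseased" labels, the starting values assume Se + Sp > 1 so that the optimizer lands on the clinically meaningful solution.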

3.
Intermediate test results often occur with diagnostic tests. When assessing diagnostic accuracy, it is important to properly report and account for these results. In the literature, these results are commonly discarded prior to analysis or treated as either a positive or a negative result. Although such adjustments allow sensitivity and specificity to be computed in the standard way, these forced decisions limit the interpretability and usefulness of the results. Estimation of diagnostic accuracy is further complicated when tests are evaluated without a gold standard. Although traditional latent class modeling can be readily applied to analyze these data and account for intermediate results, these models assume that tests are independent conditional on the true disease status, which is rarely valid in practice. We extend both the log‐linear latent class model and the probit latent class model to accommodate the conditional dependence among tests while taking the intermediate results into consideration. We illustrate our methods using a simulation study and a published medical study on the detection of epileptiform activity in the brain. Copyright © 2012 John Wiley & Sons, Ltd.

4.
Latent class models (LCMs) can be used to assess diagnostic test performance when no reference test (a gold standard) is available, considering two latent classes representing disease or non-disease status. One of the basic assumptions in such models is that of local or conditional independence: all indicator variables (tests) are statistically independent within each latent class. However, in practice this assumption is often violated; hence, the two-class LCM fits the data poorly. In this paper, we propose the use of Biplot methods to identify the conditional dependence between pairs of manifest variables within each latent class. Additionally, we propose incorporating such dependence in the corresponding latent class using the log-linear formulation of the model.

5.
Objectives: The objective of this study was to evaluate the performance of goodness-of-fit testing to detect relevant violations of the assumptions underlying the criticized “standard” two-class latent class model. Often used to obtain sensitivity and specificity estimates for diagnostic tests in the absence of a gold reference standard, this model relies on assuming that diagnostic test errors are independent. When this assumption is violated, accuracy estimates may be biased: goodness-of-fit testing is often used to evaluate the assumption and prevent bias. Study Design and Setting: We investigate the performance of goodness-of-fit testing by Monte Carlo simulation. The simulation scenarios are based on three empirical examples. Results: Goodness-of-fit tests lack power to detect relevant misfit of the standard two-class latent class model at sample sizes that are typically found in empirical diagnostic studies. The goodness-of-fit tests that are based on asymptotic theory are not robust to the sparseness of data. A parametric bootstrap procedure improves the evaluation of goodness of fit in the case of sparse data. Conclusion: Our simulation study suggests that relevant violation of the local independence assumption underlying the standard two-class latent class model may remain undetected in empirical diagnostic studies, potentially leading to biased estimates of sensitivity and specificity.

6.
The goal in diagnostic medicine is often to estimate the diagnostic accuracy of multiple experimental tests relative to a gold standard reference. When a gold standard reference is not available, investigators commonly use an imperfect reference standard. This paper proposes methodology for estimating the diagnostic accuracy of multiple binary tests with an imperfect reference standard when information about the diagnostic accuracy of the imperfect test is available from external data sources. We propose alternative joint models for characterizing the dependence between the experimental tests and discuss the use of these models for estimating individual‐test sensitivity and specificity as well as prevalence and multivariate post‐test probabilities (predictive values). We show using analytical and simulation techniques that, as long as the sensitivity and specificity of the imperfect test are high, inferences on diagnostic accuracy are robust to misspecification of the joint model. The methodology is demonstrated with a study examining the diagnostic accuracy of various HIV‐antibody tests for HIV. Published in 2008 by John Wiley & Sons, Ltd.

7.
There is now a large literature on the analysis of diagnostic test data. In the absence of a gold standard test, latent class analysis is most often used to estimate the prevalence of the condition of interest and the properties of the diagnostic tests. When test results are measured on a continuous scale, both parametric and nonparametric models have been proposed. Parametric methods such as the commonly used bi-normal model may not fit the data well; nonparametric methods developed to date have been relatively complex to apply in practice, and their properties have not been carefully evaluated in the diagnostic testing context. In this paper, we propose a simple yet flexible Bayesian nonparametric model which approximates a Dirichlet process for continuous data. We compare results from the nonparametric model with those from the bi-normal model via simulations, investigating both how much is lost in using a nonparametric model when the bi-normal model is correct and how much can be gained in using a nonparametric model when normality does not hold. We also carefully investigate the trade-offs that occur between flexibility and identifiability of the model as different Dirichlet process prior distributions are used. Motivated by an application to tuberculosis clustering, we extend our nonparametric model to accommodate two additional dichotomous tests and proceed to analyze these data using both the continuous test alone as well as all three tests together.
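As background for the Dirichlet process approximation mentioned above, here is a minimal truncated stick-breaking sketch in Python; it is the standard construction rather than the authors' model, and the concentration parameter, base distribution, and truncation level are arbitrary illustrative choices.

import numpy as np

rng = np.random.default_rng(0)

def stick_breaking(alpha, base_draw, K=50):
    # Truncated stick-breaking approximation to a Dirichlet process:
    # weights come from Beta(1, alpha) sticks, atom locations from the base measure G0.
    v = rng.beta(1.0, alpha, size=K)
    weights = v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
    atoms = np.array([base_draw() for _ in range(K)])
    return weights / weights.sum(), atoms

# Base measure G0 = Normal(0, 1), e.g. a prior guess for a transformed biomarker.
weights, atoms = stick_breaking(alpha=2.0, base_draw=lambda: rng.normal(0.0, 1.0))
draws = rng.choice(atoms, size=1000, p=weights)  # sample from the realised random distribution

Larger K makes the truncation error negligible; in the diagnostic setting the base measure and concentration parameter play the role of the prior distributions whose influence on identifiability the paper investigates.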

8.
We focus on the efficient usage of specimen repositories for the evaluation of new diagnostic tests and for comparing new tests with existing tests. Typically, all pre-existing diagnostic tests will already have been conducted on all specimens. However, we propose retesting only a judicious subsample of the specimens by the new diagnostic test. Subsampling minimizes study costs and specimen consumption, yet estimates of agreement or diagnostic accuracy potentially retain adequate statistical efficiency. We introduce methods to estimate agreement statistics and conduct symmetry tests when the second test is conducted on only a subsample and no gold standard exists. The methods treat the subsample as a stratified two-phase sample and use inverse-probability weighting. Strata can be any information available on all specimens and can be used to oversample the most informative specimens. The verification bias framework applies if the test conducted on only the subsample is a gold standard. We also present inverse-probability-weighting-based estimators of diagnostic accuracy that take advantage of stratification. We present three examples demonstrating that adequate statistical efficiency can be achieved under subsampling while greatly reducing the number of specimens requiring retesting. Naively using standard estimators that ignore subsampling can lead to drastically misleading estimates. Through simulation, we assess the finite-sample properties of our estimators and consider other possible sampling designs for our examples that could have further improved statistical efficiency. To help promote subsampling designs, our R package CompareTests computes all of our agreement and diagnostic accuracy statistics.
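To illustrate the inverse-probability-weighting idea (not the CompareTests package itself, which is in R), a minimal Python sketch with invented data follows: the existing test defines the strata, the new test is run only on a stratified subsample, and each retested specimen is weighted by the inverse of its stratum's sampling probability.

import numpy as np
import pandas as pd

# Invented data: existing-test result known for all 1000 specimens,
# new test run only where sampled == 1 (test-positives deliberately oversampled).
df = pd.DataFrame({
    "old":     [1] * 200 + [0] * 800,
    "sampled": [1] * 150 + [0] * 50 + [1] * 100 + [0] * 700,
    "new":     [1] * 130 + [0] * 20 + [np.nan] * 50 + [1] * 2 + [0] * 98 + [np.nan] * 700,
})

# Sampling probability within each stratum (stratum = existing-test result).
p_sample = df.groupby("old")["sampled"].mean()
sub = df[df["sampled"] == 1].copy()
sub["w"] = 1.0 / sub["old"].map(p_sample)

# Inverse-probability-weighted estimate of overall agreement between the two tests.
agree = (sub["new"] == sub["old"]).astype(float)
print("IPW agreement:", round(np.average(agree, weights=sub["w"]), 3))
print("Naive (unweighted) agreement:", round(agree.mean(), 3))

Because the subsample over-represents the stratum where the tests agree less often, the naive estimate is biased while the weighted estimate recovers the population-level agreement.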

9.
Two key aims of diagnostic research are to accurately and precisely estimate disease prevalence and test sensitivity and specificity. Latent class models have been proposed that consider the correlation between subject measures determined by different tests in order to diagnose diseases for which gold standard tests are not available. In some clinical studies, several measures of the same subject are made with the same test under the same conditions (replicated measurements), and thus the replicated measurements for each subject are not independent. In the present study, we propose an extension of the Bayesian latent class Gaussian random effects model to fit binary-outcome data from tests with replicated subject measures. We describe an application using data from a hookworm infection survey carried out in the municipality of Presidente Figueiredo, Amazonas State, Brazil. In addition, the performance of the proposed model was compared with that of current models (the subject random effects model and the conditional (in)dependent model) through a simulation study. As expected, the proposed model presented better accuracy and precision in the estimation of prevalence, sensitivity and specificity. Copyright © 2017 John Wiley & Sons, Ltd.

10.
In many areas of medical research, 'gold standard' diagnostic tests do not exist and so evaluating the performance of standardized diagnostic criteria or algorithms is problematic. In this paper we propose an approach to evaluating the operating characteristics of diagnoses using a latent class model. By defining 'true disease' as our latent variable, we are able to estimate sensitivity, specificity and negative and positive predictive values of the diagnostic test. These methods are applied to diagnostic criteria for depression using Baltimore's Epidemiologic Catchment Area Study Wave 3 data.
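Once the latent class model yields estimates of prevalence, sensitivity, and specificity, the predictive values follow from Bayes' theorem; a minimal sketch with purely illustrative numbers:

def predictive_values(prev, se, sp):
    # Positive and negative predictive values from prevalence, sensitivity, specificity.
    ppv = prev * se / (prev * se + (1 - prev) * (1 - sp))
    npv = (1 - prev) * sp / ((1 - prev) * sp + prev * (1 - se))
    return ppv, npv

ppv, npv = predictive_values(prev=0.05, se=0.85, sp=0.95)
print(round(ppv, 3), round(npv, 3))  # illustrative values only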

11.
Sensitivity, specificity, positive and negative predictive value are typically used to quantify the accuracy of a binary screening test. In some studies, it may not be ethical or feasible to obtain definitive disease ascertainment for all subjects using a gold standard test. When a gold standard test cannot be used, an imperfect reference test that is less than 100 per cent sensitive and specific may be used instead. In breast cancer screening, for example, follow-up for cancer diagnosis is used as an imperfect reference test for women where it is not possible to obtain gold standard results. This incomplete ascertainment of true disease, or differential disease verification, can result in biased estimates of accuracy. In this paper, we derive the apparent accuracy values for studies subject to differential verification. We determine how the bias is affected by the accuracy of the imperfect reference test, the proportion of subjects who receive the imperfect reference test rather than the gold standard, the prevalence of the disease, and the correlation between the results for the screening test and the imperfect reference test. It is shown that designs with differential disease verification can yield biased estimates of accuracy. Estimates of sensitivity in cancer screening trials may be substantially biased. However, careful design decisions, including selection of the imperfect reference test, can help to minimize bias. A hypothetical breast cancer screening study is used to illustrate the problem.
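A simpler special case than the paper's differential-verification derivation conveys the flavour of "apparent" accuracy: if every subject were verified with an imperfect reference test, and the screening and reference tests were conditionally independent given disease, the apparent sensitivity and specificity would be the following functions of prevalence and the two tests' true accuracies (a standard result; the numbers are invented).

def apparent_accuracy(prev, se, sp, se_ref, sp_ref):
    # Apparent Se/Sp of a screening test judged against an imperfect reference,
    # assuming complete verification and conditional independence given disease.
    p_pos_and_refpos = prev * se * se_ref + (1 - prev) * (1 - sp) * (1 - sp_ref)
    p_refpos = prev * se_ref + (1 - prev) * (1 - sp_ref)
    p_neg_and_refneg = prev * (1 - se) * (1 - se_ref) + (1 - prev) * sp * sp_ref
    p_refneg = 1.0 - p_refpos
    return p_pos_and_refpos / p_refpos, p_neg_and_refneg / p_refneg

# A 90%/90% screening test looks far less sensitive against an 85%/95% reference.
print(apparent_accuracy(prev=0.10, se=0.90, sp=0.90, se_ref=0.85, sp_ref=0.95))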

12.
Standard diagnostic test procedures involve dichotomization of serologic test results. The critical value or cut-off is determined to optimize a trade-off between the sensitivity and specificity of the resulting test. When sampled units from a population are tested, they are classified as infected or not according to the test outcome. Units with values high above the cut-off are treated the same as units with values just barely above the cut-off, and similarly for values below the cut-off. There is an inherent information loss in dichotomization. We thus develop, within the Bayesian paradigm, a diagnostic screening method based on data that are not dichotomized. Our method determines the predictive probability of infection for each individual in a sample based on having observed a specific serologic test result and provides inferences about the prevalence of infection in the population sampled. Our fully Bayesian method is briefly compared with a previously developed frequentist method. We illustrate the methodology with serologic data that have been previously analysed in the veterinary literature, and also discuss applications to screening for disease in humans. The method applies more generally to a variation of the classic parametric two-population discriminant analysis problem. Here, in addition to training data, additional units are sampled and the goal is to determine their population status, and the prevalence(s) of the subpopulation(s) from which they were sampled.
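The core calculation, the predictive probability of infection given an undichotomized serologic value, is a Bayes-rule ratio of within-class densities weighted by prevalence. A minimal sketch, assuming normal within-class distributions and fixed (rather than posterior) parameter values, which is a simplification of the paper's fully Bayesian treatment:

from scipy.stats import norm

def prob_infected(x, prev, mu_inf, sd_inf, mu_non, sd_non):
    # P(infected | serologic value x) via Bayes' rule with normal class densities.
    f_inf = norm.pdf(x, loc=mu_inf, scale=sd_inf)
    f_non = norm.pdf(x, loc=mu_non, scale=sd_non)
    return prev * f_inf / (prev * f_inf + (1 - prev) * f_non)

# Illustrative parameters only: infected sera centred at 2.5, non-infected at 0.8.
for x in (0.5, 1.5, 2.5):
    print(x, round(prob_infected(x, prev=0.15, mu_inf=2.5, sd_inf=0.6, mu_non=0.8, sd_non=0.5), 3))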

13.
Medical diagnostic tests must have appropriate validity and high reliability in order to qualify as adequate assessment tools. Without a gold standard test, available medical diagnostic tests are not perfect; hence, the reliability of such tests must be evaluated precisely. Kappa coefficient statistics are often used to assess reliability when there are two or more medical diagnostic tests. However, these statistics are imprecise in the typical case where the prevalence of the target disease is unknown. Although latent class models could be used to assess reliability, such models cannot estimate reliability in the case of two tests, owing to unidentifiability or the lack of degrees of freedom. An alternative approach to assessing reliability in the two-test case is to stratify a two‐by‐two contingency table under the assumptions that the sensitivities and specificities of the two tests are equal across all strata and that the prevalence rates differ between strata. Because stratification is essentially a multi‐sample analysis, it should not be applied to situations where subsamples (i.e., centers) are randomly selected from a larger population. In this article, a type of mixed‐effect model is proposed to evaluate the reliability of two tests in trials conducted in randomly selected multiple centers. Several types of distributions for prevalence rates over subpopulations are considered. Simulation studies show that our proposed method performs well. Analysis of real data is also reported. Copyright © 2013 John Wiley & Sons, Ltd.
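For reference, the kappa statistic discussed here can be computed from the two tests' agreement table as follows; a minimal sketch with invented counts (the paper's point is that, without knowledge of prevalence, such estimates are imprecise):

import numpy as np

def cohens_kappa(table):
    # Chance-corrected agreement for a k x k cross-classification of two tests.
    table = np.asarray(table, dtype=float)
    n = table.sum()
    p_obs = np.trace(table) / n
    p_exp = (table.sum(axis=0) * table.sum(axis=1)).sum() / n**2
    return (p_obs - p_exp) / (1 - p_exp)

print(round(cohens_kappa([[40, 10], [8, 142]]), 3))  # illustrative 2x2 counts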

14.
Shu Y, Liu A, Li Z. Statistics in Medicine 2007, 26(24): 4416-4427
In a study evaluating a medical diagnostic test, human samples are valuable and often costly; it is therefore desirable to terminate the study early if the test is evidently inefficient (or efficient) in diagnosing disease, in order to keep the number of samples as low as possible. In this paper, we propose sequential designs to evaluate the sensitivity and specificity of a diagnostic test. One method allows early stopping if the sensitivity and specificity of a new medical test are both within the level of tolerance. Another method terminates the study if either the sensitivity or the specificity is below the minimally acceptable level. The latter method minimizes the expected sample size when the test does not meet performance expectations; various two-stage designs show a substantial advantage in expected sample size over single-stage designs when a diagnostic test is not promising.
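To see why futility stopping pays off, consider a toy two-stage rule (not the paper's design): enrol n1 diseased subjects, stop for futility unless at least r of them test positive, otherwise enrol n2 more. The expected number of diseased subjects is then n1 plus n2 times the probability of continuing; a short sketch with invented design parameters:

from scipy.stats import binom

def expected_n(n1, n2, r, se_true):
    # Expected diseased sample size for a toy two-stage futility rule:
    # continue to stage 2 only if >= r of the n1 stage-1 subjects test positive.
    p_continue = binom.sf(r - 1, n1, se_true)  # P(stage-1 positives >= r)
    return n1 + p_continue * n2

print(round(expected_n(n1=30, n2=70, r=24, se_true=0.60), 1))  # poor test: usually stops early
print(round(expected_n(n1=30, n2=70, r=24, se_true=0.90), 1))  # good test: usually continues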

15.
In a meta‐analysis of diagnostic accuracy studies, the sensitivities and specificities of a diagnostic test may depend on the disease prevalence since the severity and definition of disease may differ from study to study due to the design and the population considered. In this paper, we extend the bivariate nonlinear random effects model on sensitivities and specificities to jointly model the disease prevalence, sensitivities and specificities using trivariate nonlinear random‐effects models. Furthermore, as an alternative parameterization, we also propose jointly modeling the test prevalence and the predictive values, which reflect the clinical utility of a diagnostic test. These models allow investigators to study the complex relationship among the disease prevalence, sensitivities and specificities; or among test prevalence and the predictive values, which can reveal hidden information about test performance. We illustrate the proposed two approaches by reanalyzing the data from a meta‐analysis of radiological evaluation of lymph node metastases in patients with cervical cancer and a simulation study. The latter illustrates the importance of carefully choosing an appropriate normality assumption for the disease prevalence, sensitivities and specificities, or the test prevalence and the predictive values. In practice, it is recommended to use model selection techniques to identify a best‐fitting model for making statistical inference. In summary, the proposed trivariate random effects models are novel and can be very useful in practice for meta‐analysis of diagnostic accuracy studies. Copyright © 2009 John Wiley & Sons, Ltd.

16.
Breast cancer patients after breast conservation therapy often develop ipsilateral breast tumor relapse (IBTR), whose classification (true local recurrence versus new ipsilateral primary tumor) is subject to error, and there is no available gold standard. Some patients may die because of breast cancer before IBTR develops. Because this terminal event may be related to the individual patient's unobserved disease status and time to IBTR, the terminal mechanism is non‐ignorable. This article presents a joint analysis framework to model the binomial regression with misclassified binary outcome and the correlated time to IBTR, subject to a dependent terminal event and in the absence of a gold standard. Shared random effects are used to link the two survival times. The proposed approach is evaluated by a simulation study and is applied to a breast cancer data set consisting of 4477 breast cancer patients. The proposed joint model can be conveniently fit using the adaptive Gaussian quadrature tools implemented in the SAS 9.3 (SAS Institute Inc., Cary, NC, USA) procedure NLMIXED. Copyright © 2014 John Wiley & Sons, Ltd.

17.
We address the problem of joint analysis of more than one series of longitudinal measurements. The typical way of approaching this problem is as a joint mixed effects model for the two outcomes. Apart from the large number of parameters needed to specify such a model, perhaps the biggest drawback of this approach is the difficulty in interpreting the results of the model, particularly when the main interest is in the relation between the two longitudinal outcomes. Here we propose an alternative approach to this problem. We use a latent class joint model for the longitudinal outcomes in order to reduce the dimensionality of the problem. We then use a two-stage estimation procedure to estimate the parameters in this model. In the first stage, the latent classes, their probabilities and the mean and covariance structure are estimated based on the longitudinal data of the first outcome. In the second stage, we study the relation between the latent classes and patient characteristics and the other outcome(s). We apply the method to data from 195 consecutive lung cancer patients in two outpatient lung disease clinics in The Hague, and we study the relation between denial and longitudinal health measures. Our approach clearly revealed an interesting phenomenon: although no difference between classes could be detected for objective measures of health, patients in classes representing higher levels of denial consistently scored significantly higher on subjective measures of health. Copyright © 2008 John Wiley & Sons, Ltd.

18.
金辉, 刘沛. 环境与职业医学 (Journal of Environmental and Occupational Medicine) 2010, 27(12): 735-738
[Objective] To explore latent class methods for evaluating the accuracy of diagnostic tests when no gold standard is available. [Methods] The principles, study designs and evaluation approaches of latent class models for no-gold-standard diagnostic test evaluation are introduced, and a two-population, two-test example is used to illustrate the application of the latent class approach. [Results] For binary response variables, under the assumptions of conditional independence and constant test accuracy across populations, at least two populations with two tests, or one population with three tests, are required for the model to be identifiable and amenable to frequentist evaluation; Bayesian analysis does not require model identifiability, but it requires the introduction of prior distributions and is therefore prior-dependent. [Conclusion] Latent class methods can be used to evaluate diagnostic tests in the absence of a gold standard, provided that an appropriate study design and evaluation approach are chosen.
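The identifiability statement can be checked by a standard degrees-of-freedom count (not spelled out in the abstract): with two populations and two binary tests, the observed data supply 2 × (2² − 1) = 6 independent cell probabilities against 6 unknown parameters (2 prevalences, 2 sensitivities, 2 specificities); with one population and three tests, they supply 2³ − 1 = 7 degrees of freedom against 7 parameters (1 prevalence, 3 sensitivities, 3 specificities). Both designs are therefore just identified under conditional independence and constant accuracy, whereas any smaller design leaves more parameters than degrees of freedom.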

19.
Tests for disease often produce a continuous measure, such as the concentration of some biomarker in a blood sample. In clinical practice, a threshold C is selected such that results, say, greater than C are declared positive and those less than C negative. Measures of test accuracy such as sensitivity and specificity depend crucially on C, and the optimal value of this threshold is usually a key question for clinical practice. Standard methods for meta-analysis of test accuracy (i) do not provide summary estimates of accuracy at each threshold, precluding selection of the optimal threshold, and furthermore, (ii) do not make use of all available data. We describe a multinomial meta-analysis model that can take any number of pairs of sensitivity and specificity from each study and explicitly quantifies how accuracy depends on C. Our model assumes that some prespecified or Box-Cox transformation of test results in the diseased and disease-free populations has a logistic distribution. The Box-Cox transformation parameter can be estimated from the data, allowing for a flexible range of underlying distributions. We parameterise in terms of the means and scale parameters of the two logistic distributions. In addition to credible intervals for the pooled sensitivity and specificity across all thresholds, we produce prediction intervals, allowing for between-study heterogeneity in all parameters. We demonstrate the model using two case study meta-analyses, examining the accuracy of tests for acute heart failure and preeclampsia. We show how the model can be extended to explore reasons for heterogeneity using study-level covariates.
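Under the distributional assumption described above, sensitivity and specificity at any threshold C are simply tail probabilities of the two logistic distributions. A minimal sketch with the identity transformation and invented location/scale parameters (the paper estimates these, together with the Box-Cox parameter, within a Bayesian meta-analysis):

from scipy.stats import logistic

def accuracy_at_threshold(c, mu_d, s_d, mu_nd, s_nd):
    # Sensitivity = P(result > c | diseased); specificity = P(result <= c | disease-free),
    # with logistic(mu_d, s_d) and logistic(mu_nd, s_nd) distributions of (transformed) results.
    sens = logistic.sf(c, loc=mu_d, scale=s_d)
    spec = logistic.cdf(c, loc=mu_nd, scale=s_nd)
    return sens, spec

for c in (1.0, 1.5, 2.0):  # illustrative thresholds
    sens, spec = accuracy_at_threshold(c, mu_d=2.5, s_d=0.5, mu_nd=0.5, s_nd=0.4)
    print(c, round(sens, 3), round(spec, 3))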

20.
Current advances in technology provide less invasive or less expensive diagnostic tests for identifying disease status. When a diagnostic test is evaluated against an invasive or expensive gold standard test, one often finds that not all patients undergo the gold standard test. The sensitivity and specificity estimates based only on the patients with verified disease status are often biased. This bias is called verification bias. Many authors have examined the consequences of verification bias and have proposed bias correction methods based on the assumption of independence between disease status and selection for verification conditionally on the test result, or equivalently on the assumption that the disease status is missing at random in missing data terminology. This assumption may not be valid, and one may need to consider adjustment for a possible non-ignorable verification bias resulting from a non-ignorable missing data mechanism. Such an adjustment involves ultimately uncheckable assumptions and requires sensitivity analysis. The sensitivity analysis is most often accomplished by perturbing parameters in the chosen model for the missing data mechanism, and it has a local flavour because the perturbations are around the fitted model. In this paper we propose a global sensitivity analysis for assessing the performance of a diagnostic test in the presence of verification bias. We derive the region of all sensitivity and specificity values consistent with the observed data and call this region a test ignorance region (TIR). The term 'ignorance' refers to the lack of knowledge due to the missing disease status of the unverified patients. The methodology is illustrated with two clinical examples.
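A simplified, sensitivity-only version of the ignorance-region idea can be computed directly: with the unverified patients' disease status unknown, the extreme allocations of that status give the widest possible range for sensitivity. A minimal sketch with invented counts (the paper's TIR is the joint region for sensitivity and specificity):

def sensitivity_bounds(tp, fn, unverified_pos, unverified_neg):
    # tp, fn: verified diseased patients who tested positive / negative.
    # unverified_pos / unverified_neg: test-positive / test-negative patients never verified.
    # Lower bound: all unverified test-negatives are diseased, no unverified test-positives are.
    lower = tp / (tp + fn + unverified_neg)
    # Upper bound: all unverified test-positives are diseased, no unverified test-negatives are.
    upper = (tp + unverified_pos) / (tp + unverified_pos + fn)
    return lower, upper

print(sensitivity_bounds(tp=80, fn=10, unverified_pos=15, unverified_neg=60))  # illustrative counts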
