Similar Articles
1.
We consider the problem of estimating the prevalence of a disease under a group testing framework. Because assays are usually imperfect, misclassification of disease status is a major challenge in prevalence estimation. To account for possible misclassification, it is usually assumed that the sensitivity and specificity of the assay are known and independent of the group size. This assumption is often questionable, and substitution of incorrect values of an assay's sensitivity and specificity can result in a large bias in the prevalence estimate, which we refer to as the mis‐substitution bias. In this article, we propose simple designs and methods for prevalence estimation that do not require known values of assay sensitivity and specificity. If a gold standard test is available, it can be applied to a validation subsample to yield information on the imperfect assay's sensitivity and specificity. When a gold standard is unavailable, it is possible to estimate assay sensitivity and specificity, either as unknown constants or as specified functions of the group size, from group testing data with varying group size. We develop methods for estimating parameters and for finding or approximating optimal designs, and perform extensive simulation experiments to evaluate and compare the different designs. An example concerning human immunodeficiency virus infection is used to illustrate the validation subsample design. Copyright © 2014 John Wiley & Sons, Ltd.
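The pooled-testing model this abstract builds on can be made concrete. Under the usual assumption that sensitivity and specificity are known constants (the very assumption the paper relaxes), the probability that a pool of size k tests positive is Se·(1−(1−p)^k) + (1−Sp)·(1−p)^k, which can be inverted for p. A minimal sketch of that inversion (function name and numbers are illustrative, not from the paper):

```python
def prevalence_from_pools(n_pools, n_positive, k, se, sp):
    """Invert the pooled-testing model for prevalence p, assuming known
    assay sensitivity (se) and specificity (sp) and pools of size k:
    P(pool +) = se*(1-(1-p)^k) + (1-sp)*(1-p)^k."""
    q = n_positive / n_pools          # observed fraction of positive pools
    theta = (se - q) / (se + sp - 1)  # equals (1-p)^k under the model
    if not 0 < theta <= 1:
        raise ValueError("observed positive rate incompatible with se/sp")
    return 1 - theta ** (1 / k)
```

With a perfect assay and pools of size 1 this reduces to the sample proportion; substituting wrong values of se or sp into the inversion is exactly the mis‐substitution bias the paper studies.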

2.
In this work, we describe a two‐stage sampling design to estimate the infection prevalence in a population. In the first stage, an imperfect diagnostic test was performed on a random sample of the population. In the second stage, a different imperfect test was performed in a stratified random sample of the first sample. To estimate infection prevalence, we assumed conditional independence between the diagnostic tests and developed method-of-moments estimators based on expectations of the proportions of people with positive and negative results on both tests, which are functions of the tests' sensitivity, specificity, and the infection prevalence. A closed‐form solution of the estimating equations was obtained assuming a specificity of 100% for both tests. We applied our method to estimate the infection prevalence of visceral leishmaniasis according to two quantitative polymerase chain reaction tests performed on blood samples taken from 4756 patients in northern Ethiopia. The sensitivities of the tests were also estimated, as well as the standard errors of all estimates, using a parametric bootstrap. We also examined the impact of departures from our assumptions of 100% specificity and conditional independence on the estimated prevalence. Copyright © 2015 John Wiley & Sons, Ltd.
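When both tests have perfect specificity and are conditionally independent given infection, the moment equations have a simple closed form: with q1 = P(test 1 +) = p·Se1, q2 = p·Se2, and q12 = P(both +) = p·Se1·Se2, it follows that p = q1·q2/q12, Se1 = q12/q2, and Se2 = q12/q1. A sketch of that closed-form solution under those assumptions (the stratified second stage and the bootstrap standard errors are omitted):

```python
def moment_estimates(n, n1_pos, n2_pos, n12_pos):
    """Closed-form method-of-moments estimates of prevalence and the two
    sensitivities, assuming 100% specificity for both tests and
    conditional independence given infection status."""
    q1, q2, q12 = n1_pos / n, n2_pos / n, n12_pos / n
    prevalence = q1 * q2 / q12   # p = (p*Se1)(p*Se2) / (p*Se1*Se2)
    se1 = q12 / q2               # Se1 = (p*Se1*Se2) / (p*Se2)
    se2 = q12 / q1
    return prevalence, se1, se2
```

Note that the joint count n12_pos enters all three estimates, which is why a small second-stage sample can make the standard errors large.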

3.
Objective: Against the background of the third national epidemiological survey of schistosomiasis, part of the sampling process was simulated by computer. A negative binomial sampling method was used to obtain an unbiased estimate of the infection rate, which was compared with the traditional sampling method, and the advantages and disadvantages of the two sampling methods were evaluated comprehensively. Methods: Under both equal and unequal sample sizes, the absolute error, relative error, and accuracy rate of the infection rates estimated from the sampling results were described and analyzed statistically, and the methods were evaluated comprehensively. Results: With equal sample sizes, the P values for differences between the two sampling methods in absolute error, relative error, accuracy rate, and confidence interval width were all greater than 0.05 (when the infection rate was 0.6%, the P values for accuracy rate and confidence interval width approached 0.05). With unequal sample sizes, the difference in accuracy rate between the two methods was not statistically significant (all P values greater than 0.05), whereas the P values for differences in absolute error, relative error, and confidence interval were all less than 0.01; only at higher infection rates (above 10%) were these differences not statistically significant. Conclusion: With equal sample sizes, the two sampling methods have comparable estimation precision across the range of infection rates. When the true infection rate is low (e.g., below 1%), negative binomial sampling guarantees that enough cases are sampled; when the true infection rate is unknown and cannot be predicted, it also serves as an exploratory sampling method.
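The negative binomial (inverse) sampling idea can be sketched directly: keep examining subjects until a fixed number r of infected ones is found; if n is the total number examined, Haldane's estimator (r−1)/(n−1) is unbiased for the infection rate. A simulation sketch with illustrative parameters (not the survey's actual settings):

```python
import random

def inverse_sample(p, r, rng):
    """Examine subjects until r infected ones are found; return total n."""
    n = infected = 0
    while infected < r:
        n += 1
        if rng.random() < p:
            infected += 1
    return n

rng = random.Random(0)
p_true, r = 0.01, 10          # low infection rate, as in the survey's setting
estimates = []
for _ in range(500):
    n = inverse_sample(p_true, r, rng)
    estimates.append((r - 1) / (n - 1))   # Haldane's unbiased estimator
mean_est = sum(estimates) / len(estimates)
```

Because the stopping rule fixes the number of cases rather than the number of subjects, every replicate contains exactly r patients, which is the property the conclusion highlights for rare infections.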

4.
In this paper we propose a sample size calculation method for testing on a binomial proportion when binary observations are dependent within clusters. In estimating the binomial proportion in clustered binary data, two weighting systems have been popular: equal weights to clusters and equal weights to units within clusters. When the number of units varies cluster by cluster, the performance of these two weighting systems depends on the extent of correlation among units within each cluster. In addition to these, we also use an optimal weighting method that minimizes the variance of the estimator. A sample size formula is derived for each of the estimators with different weighting schemes. We apply these methods to the sample size calculation for the sensitivity of a periodontal diagnostic test. Simulation studies are conducted to evaluate the finite-sample performance of the three estimators. We also assess the influence of misspecified input parameter values on the calculated sample size. The optimal estimator requires equal or smaller sample sizes and is more robust to the misspecification of an input parameter than those assigning equal weights to units or clusters.
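The three weighting schemes can be written down directly: equal weights to clusters averages the per-cluster proportions; equal weights to units pools all observations; and the variance-minimizing weights are proportional to m_i/(1 + (m_i − 1)ρ), where m_i is the cluster size and ρ the intracluster correlation. A sketch of the three estimators (the sample size formulae themselves are not reproduced here):

```python
def weighted_proportions(sizes, successes, rho):
    """Three estimators of a binomial proportion from clustered binary data:
    equal weight per cluster, equal weight per unit, and optimal weights
    m_i / (1 + (m_i - 1) * rho), which minimize the estimator's variance."""
    props = [x / m for x, m in zip(successes, sizes)]
    p_cluster = sum(props) / len(props)      # equal weights to clusters
    p_unit = sum(successes) / sum(sizes)     # equal weights to units
    w = [m / (1 + (m - 1) * rho) for m in sizes]
    p_opt = sum(wi * pi for wi, pi in zip(w, props)) / sum(w)
    return p_cluster, p_unit, p_opt
```

The optimal weights interpolate between the two popular schemes: at ρ = 0 they reduce to equal weights per unit, and at ρ = 1 to equal weights per cluster, which is one way to see why the optimal estimator dominates both.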

5.
The design of epidemiologic studies for the validation of diagnostic tests necessitates accurate sample size calculations to allow for the estimation of diagnostic sensitivity and specificity within a specified level of precision and with the desired level of confidence. Confidence intervals based on the normal approximation to the binomial do not achieve the specified coverage when the proportion is close to 1. A sample size algorithm based on the exact mid-P method of confidence interval estimation was developed to address the limitations of normal approximation methods. This algorithm resulted in sample sizes that achieved the appropriate confidence interval width even in situations when normal approximation methods performed poorly.
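The mid-P interval at the heart of this algorithm can be computed by bisection on the binomial tail probabilities: each limit solves a tail equation that counts only half the probability of the observed outcome. A sketch of the interval itself (the sample size search wrapped around it is omitted):

```python
from math import comb

def pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def midp_interval(x, n, alpha=0.05, tol=1e-8):
    """Exact mid-P confidence interval for a binomial proportion x/n."""
    def solve(f, increasing):
        # Bisection for the p in (0, 1) where the monotone tail f(p) = alpha/2.
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if (f(mid) < alpha / 2) == increasing:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)
    # P(X >= x) - P(X = x)/2 increases with p: its root is the lower limit.
    lower = 0.0 if x == 0 else solve(
        lambda p: sum(pmf(k, n, p) for k in range(x, n + 1)) - 0.5 * pmf(x, n, p),
        True)
    # P(X <= x) - P(X = x)/2 decreases with p: its root is the upper limit.
    upper = 1.0 if x == n else solve(
        lambda p: sum(pmf(k, n, p) for k in range(0, x + 1)) - 0.5 * pmf(x, n, p),
        False)
    return lower, upper
```

For proportions near 1 (e.g. a sensitivity of 95/100) this interval stays inside [0, 1] and keeps close to nominal coverage, which is exactly where the normal approximation breaks down.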

6.
OBJECTIVE: Problems arising with the estimation of sensitivity and specificity when two imperfect diagnostic tests are applied are widely discussed. Effects on the estimation of prevalence may be of importance as well. Different methods of dealing with two or more imperfect tests and unknown reference standard are contrasted with regard to their implications on prevalence estimation: discrepant analysis, composite reference standards, and latent class models. STUDY DESIGN AND SETTING: Prospective epidemiological multicenter study to determine the prevalence of respiratory syncytial virus in children with lower respiratory tract infections. A subsample of 1,003 patients from a hospital population and from a practice population is considered. Virus isolation, polymerase chain reaction, and rapid antigen test had been applied. RESULTS: Prevalence estimates obtained under various assumptions ranged from 0.263 to 0.386 in the hospital population and from 0.214 to 0.277 in the practice population. CONCLUSION: Estimation procedures involving a resolver test applied to some but not all cells are at risk of introducing a serious bias in prevalence estimation as well as in the estimation of test accuracy parameters. Estimation via latent class modeling may be more useful, but care should be taken regarding the underlying assumptions.

7.
We consider efficient study designs to estimate sensitivity and specificity of a candidate diagnostic or screening test. Our focus is the setting in which the candidate test is inexpensive to administer compared to evaluation of disease status, and the test results, available in a large cohort, can be used as a basis for sampling subjects for verification of disease status. We examine designs in which disease status is verified in a sample chosen so as to optimize estimation of either sensitivity or specificity. We then propose a sequential design in which the first step of sampling is conducted to efficiently estimate specificity. If the candidate test is determined to be of sufficient specificity, then step two of sampling is conducted to estimate sensitivity. We propose estimators based on this sequential sampling scheme, and show that the performance of these estimators is excellent. We develop sample size calculations for the sequential design, and show that this design, in most situations, compares favourably in terms of expected sample size to a fixed size design.

8.
On the non-inferiority of a diagnostic test based on paired observations
Lu Y  Jin H  Genant HK 《Statistics in medicine》2003,22(19):3029-3044
Non-inferiority of a diagnostic test to the standard or the optimum test is a common issue in medical research. Often we want to determine if a new diagnostic test is as good as the standard reference test. Sometimes we are interested in an inexpensive test that may have an acceptably inferior sensitivity or specificity. While hypothesis testing procedures and sample size formulae for the equivalence of sensitivity or specificity alone have been proposed, very few studies have discussed simultaneous comparisons of both parameters. In this paper, we present three different testing procedures and sample size formulae for simultaneous comparison of sensitivity and specificity based on paired observations and with known disease status. These statistical procedures are then used to compare two classification rules that identify women for future osteoporotic fracture. Simulation experiments demonstrate that the new tests and sample size formulae give the appropriate type I and II error rates. Differences between our approach and the approach of Lui and Cumberland are discussed.

9.
To give a quantitative guide to sample size allocation for developing sampling designs for a food composition survey, we discuss sampling strategies that take into account the importance of each food (its consumption or production), the variability of its composition, and the restrictions imposed by the resources available for sample collection and analysis. Two strategies are considered: 'proportional' and 'Neyman' allocation. Both incorporate the consumed quantity of foods, and we review some available statistics for allocation issues. The Neyman optimal strategy allocates a smaller sample size to starch than the proportional strategy does, because the former incorporates variability in composition. Both strategies improved the accuracy of estimated dietary nutrient intake more than equal sample size allocation. These strategies will be useful, as we often face sample size allocation problems in which we must decide, for example, whether to sample 'five white potatoes and five taros, or nine white potatoes and one taro'. Allocating sufficient sample size to important foodstuffs is essential in assuring data quality. Nevertheless, the food composition table should be as comprehensive as possible.
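The two allocation rules are standard stratified-sampling results: proportional allocation sets n_h ∝ W_h (the consumption weight of food group h), while Neyman allocation sets n_h ∝ W_h·S_h, where S_h is the within-group standard deviation of composition. A sketch with illustrative weights and standard deviations (not the survey's actual figures):

```python
def proportional_allocation(n, weights):
    """Sample sizes proportional to each food group's consumption weight."""
    total = sum(weights)
    return [n * w / total for w in weights]

def neyman_allocation(n, weights, sds):
    """Neyman-optimal allocation: sample size proportional to weight * SD,
    so foods with variable composition get more analyses than stable ones."""
    products = [w * s for w, s in zip(weights, sds)]
    total = sum(products)
    return [n * x / total for x in products]
```

With a heavily consumed but compositionally stable food (low S_h, e.g. a starch staple), Neyman allocation shifts analyses away from it toward variable foods, which is the behaviour the abstract describes.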

11.
Pooling DNA samples can yield efficient estimates of the prevalence of genetic variants. We extend methods of analyzing pooled DNA samples to estimate the joint prevalence of variants at two or more loci. If one has a sample from the general population, one can adapt the method for joint prevalence estimation to estimate allele frequencies and D, the measure of linkage disequilibrium. The parameter D is fundamental in population genetics and in determining the power of association studies. In addition, joint allelic prevalences can be used in case-control studies to estimate the relative risks of disease from joint exposures to the genetic variants. Our methods allow for imperfect assay sensitivity and specificity. The expected savings in numbers of assays required when pooling is utilized compared to individual testing are quantified.
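Once joint and marginal prevalences have been estimated from the pools, the linkage disequilibrium parameter is simply D = p_AB − p_A·p_B, often reported in normalized form as D'. A sketch of that final step, taking the estimated prevalences as given (the pooled-assay estimation with imperfect sensitivity and specificity is the substance of the paper and is not reproduced here):

```python
def linkage_disequilibrium(p_a, p_b, p_ab):
    """D = joint prevalence minus the product of the marginals;
    D = 0 corresponds to independence of the variants at the two loci."""
    return p_ab - p_a * p_b

def d_prime(p_a, p_b, p_ab):
    """Normalized LD: D divided by its maximum attainable magnitude
    given the marginal frequencies, so |D'| <= 1."""
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return 0.0 if d_max == 0 else d / d_max
```

The abstract's point about power follows from this: the larger |D| is, the more information a marker locus carries about an unobserved causal locus.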

12.
Diagnostic tests rarely provide perfect results. The misclassification induced by imperfect sensitivities and specificities of diagnostic tests must be accounted for when planning prevalence studies or investigations into properties of new tests. Previous work has shown that applying a single imperfect test to estimate prevalence can often result in very large sample size requirements, and that sometimes even an infinite sample size is insufficient for precise estimation because the problem is non‐identifiable. Adding a second test can sometimes reduce the sample size substantially, but infinite sample sizes can still occur as the problem remains non‐identifiable. We investigate the further improvement possible when three diagnostic tests are to be applied. We first develop methods required for studies when three conditionally independent tests are available, using different Bayesian criteria. We then apply these criteria to prototypic scenarios, showing that large sample size reductions can occur compared to when only one or two tests are used. As the problem is now identifiable, infinite sample sizes cannot occur except in pathological situations. Finally, we relax the conditional independence assumption, demonstrating in this once again non‐identifiable situation that sample sizes may substantially grow and possibly be infinite. We apply our methods to the planning of two infectious disease studies, the first designed to estimate the prevalence of Strongyloides infection, and the second relating to estimating the sensitivity of a new test for tuberculosis transmission. The much smaller sample sizes that are typically required when three as compared to one or two tests are used should encourage researchers to plan their studies using more than two diagnostic tests whenever possible. User‐friendly software is available for both design and analysis stages, greatly facilitating the use of these methods. Copyright © 2010 John Wiley & Sons, Ltd.

13.
Often the performances of two binary diagnostic or screening tests are compared by applying them to the same set of subjects, some of whom are affected, some unaffected. The McNemar test, and corresponding interval estimation methods, may be used to compare the sensitivity of the two tests, but this disregards both any observed difference in specificity and its imprecision due to sampling variation. The suggested approach is to display point and interval estimates for a weighted mean f of the differences in sensitivity and specificity between the two tests. The mixing parameter lambda, which is allowed to range from 0 to 1, represents the prevalence in the population to which application is envisaged, together with the relative seriousness of false positives and false negatives. The confidence interval for f is obtained by a simple extension of a closed-form method for the paired difference of proportions, which has favourable coverage properties and is based on the Wilson single proportion score method. A plot of f against lambda is readily obtained using a Minitab macro.
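The quantity being displayed is f(λ) = λ·(Se1 − Se2) + (1 − λ)·(Sp1 − Sp2). A sketch of the point estimate, together with the Wilson score interval that serves as the building block of the closed-form interval (the paired-correlation adjustment of the full method is omitted, so this is a simplification, not the paper's complete procedure):

```python
from math import sqrt

def wilson_interval(x, n, z=1.96):
    """Wilson score interval for a single proportion x/n."""
    p = x / n
    centre = (p + z * z / (2 * n)) / (1 + z * z / n)
    half = (z / (1 + z * z / n)) * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return centre - half, centre + half

def f_lambda(se1, se2, sp1, sp2, lam):
    """Weighted mean of the sensitivity and specificity differences;
    lam encodes prevalence and the relative cost of the two error types."""
    return lam * (se1 - se2) + (1 - lam) * (sp1 - sp2)
```

Plotting f_lambda over a grid of λ from 0 to 1 reproduces the kind of display the abstract describes: λ = 1 recovers the pure sensitivity comparison, λ = 0 the pure specificity comparison.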

14.
We propose a new, less costly, design to test the equivalence of digital versus analogue mammography in terms of sensitivity and specificity. Because breast cancer is a rare event among asymptomatic women, the sample size for testing equivalence of sensitivity is larger than that for testing equivalence of specificity. Hence calculations of sample size are based on sensitivity. With the proposed design it is possible to achieve the same power as a completely paired design by increasing the number of less costly analogue mammograms and not giving the more expensive digital mammograms to some randomly selected subjects who are negative on the analogue mammogram. The key idea is that subjects who are negative on the analogue mammogram are unlikely to have cancer and hence contribute less information for estimating sensitivity than subjects who are positive on the analogue mammogram. To ascertain disease state among subjects not biopsied, we propose another analogue mammogram at a later time determined by a natural history model. The design differs from a double sampling design because it compares two imperfect tests instead of combining information from a perfect and imperfect test. © 1998 John Wiley & Sons, Ltd.

15.
BACKGROUND AND OBJECTIVE: Publication bias and other sample size effects are issues for meta-analyses of test accuracy, as for randomized trials. We investigate limitations of standard funnel plots and tests when applied to meta-analyses of test accuracy and look for improved methods. METHODS: Type I and type II error rates for existing and alternative tests of sample size effects were estimated and compared in simulated meta-analyses of test accuracy. RESULTS: Type I error rates for the Begg, Egger, and Macaskill tests are inflated for typical diagnostic odds ratios (DOR), when disease prevalence differs from 50% and when thresholds favor sensitivity over specificity or vice versa. Regression and correlation tests based on functions of effective sample size are valid, if occasionally conservative, tests for sample size effects. Empirical evidence suggests that they have adequate power to be useful tests. When DORs are heterogeneous, however, all tests of funnel plot asymmetry have low power. CONCLUSION: Existing tests that use standard errors of odds ratios are likely to be seriously misleading if applied to meta-analyses of test accuracy. The effective sample size funnel plot and associated regression test of asymmetry should be used to detect publication bias and other sample size related effects.
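The effective sample size of a two-by-two accuracy study with n1 diseased and n2 non-diseased subjects is ESS = 4·n1·n2/(n1 + n2), and the recommended asymmetry check regresses ln DOR on 1/√ESS with weight ESS. A sketch of the funnel coordinates and the weighted slope (the full test also requires a standard error for the slope, which is omitted here):

```python
from math import log, sqrt

def ess(n1, n2):
    """Effective sample size of a study with n1 diseased, n2 non-diseased."""
    return 4 * n1 * n2 / (n1 + n2)

def ess_funnel_slope(studies):
    """Weighted least-squares slope of ln(DOR) on 1/sqrt(ESS), weight = ESS.
    studies: list of (tp, fp, fn, tn) counts; 0.5 is added to every cell
    as a continuity correction."""
    xs, ys, ws = [], [], []
    for tp, fp, fn, tn in studies:
        tp, fp, fn, tn = tp + 0.5, fp + 0.5, fn + 0.5, tn + 0.5
        ldor = log(tp * tn / (fp * fn))
        e = ess(tp + fn, fp + tn)
        xs.append(1 / sqrt(e))
        ys.append(ldor)
        ws.append(e)
    total = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / total
    ybar = sum(w * y for w, y in zip(ws, ys)) / total
    num = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    den = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    return num / den
```

A slope far from zero flags a sample size effect: in the two-study check below, the small study's continuity-corrected DOR is lower than the large study's, producing a negative slope.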

16.
The cost efficiency of estimation of sensitivity, specificity and positive predictive value from two-stage sampling designs is considered, assuming a relatively cheap test classifies first-stage subjects into several categories and an expensive gold standard is applied at stage two. Simple variance formulae are derived and used to find optimal designs for a given cost ratio. The utility of two-stage designs is measured by the reduction in variances compared with one-stage simple random designs. Separate second-stage design is also compared with proportional allocation (PA). The maximum percentage reductions in variance from two-stage designs for sensitivity, specificity and positive predictive value estimation are P per cent, (1-P) per cent and W, respectively, where P is the population prevalence of disease and W the population percentage of test negatives. The optimum allocation of stage-two resources is not obvious: the optimum proportion of true cases at stage two may even be less than under PA. PA is near optimal for sensitivity estimation in most cases when prevalence is low, but inefficient compared with the optimal scheme for specificity.

17.
Timely and accurate information on disease load is essential for planning health programs. Unfortunately, complexity, cost and the need for skilled personnel limit the use of screening tools of high validity in developing countries. The disease load estimated with tools of low validity differs considerably from the true disease load, particularly for diseases at extreme levels of prevalence/incidence. A tool of 70% sensitivity and specificity may yield a prevalence/incidence rate of 34% (CI: 32.23-35.67%) for a disease whose true rate is only 10.0% (CI: 8.94-11.06%). We propose a procedure to derive the true estimate in such cases, based on the concepts of sensitivity and specificity of a diagnostic/screening test. It is applied to two sets of real data--one pertaining to the incidence rate of low birth weight (LBW) and the other to the prevalence rate of obesity--where multiple screening tests of varying validity were used to estimate the magnitude. Different screening tests yielded widely varying incidence/prevalence rates of LBW/obesity. The prevalence/incidence rates derived by using the proposed estimation procedure are similar and close to the true estimate obtained by screening tests considered as gold standard. Further, a sample size determined on the basis of the results of a tool of low validity may be either larger or smaller than the required sample size. Estimation of the true disease load enables determination of the correct sample size, thus improving the precision of the estimate and, in some instances, reducing the cost of investigation.
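The correction described here is the classical relationship between apparent and true prevalence: observed rate = Se·p + (1 − Sp)·(1 − p), which inverts to p = (observed + Sp − 1)/(Se + Sp − 1), the Rogan-Gladen estimator. A sketch, using the abstract's own numbers as a check:

```python
def true_prevalence(apparent, se, sp):
    """Correct an apparent prevalence for test misclassification
    (Rogan-Gladen estimator); requires se + sp > 1."""
    return (apparent + sp - 1) / (se + sp - 1)
```

With a 70%-sensitive, 70%-specific tool reporting 34%, the corrected rate is (0.34 + 0.7 − 1)/(0.7 + 0.7 − 1) = 0.04/0.4 = 10%, matching the abstract's example.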

18.
Lot quality assurance sampling (LQAS) has a long history of applications in industrial quality control. LQAS is frequently used for rapid surveillance in global health settings, with areas classified as poor or acceptable performance on the basis of the binary classification of an indicator. Historically, LQAS surveys have relied on simple random samples from the population; however, implementing two‐stage cluster designs for surveillance sampling is often more cost‐effective than simple random sampling. By applying survey sampling results to the binary classification procedure, we develop a simple and flexible nonparametric procedure to incorporate clustering effects into the LQAS sample design to appropriately inflate the sample size, accommodating finite numbers of clusters in the population when relevant. We use this framework to then discuss principled selection of survey design parameters in longitudinal surveillance programs. We apply this framework to design surveys to detect rises in malnutrition prevalence in nutrition surveillance programs in Kenya and South Sudan, accounting for clustering within villages. By combining historical information with data from previous surveys, we design surveys to detect spikes in the childhood malnutrition rate. Copyright © 2014 John Wiley & Sons, Ltd.
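The sample size inflation for clustering follows the usual design-effect form: multiply the simple-random-sample size by DEFF = 1 + (m − 1)·ICC for clusters of size m. A sketch of that inflation step (the paper's nonparametric classification procedure and finite-population adjustments are not reproduced; numbers below are illustrative):

```python
def clustered_sample_size(n_srs, cluster_size, icc):
    """Inflate an SRS sample size by the design effect DEFF = 1 + (m-1)*icc.
    Returns the design effect, required units, and required clusters
    (the last two should be rounded up in practice)."""
    deff = 1 + (cluster_size - 1) * icc
    n_units = n_srs * deff
    n_clusters = n_units / cluster_size
    return deff, n_units, n_clusters
```

Even a modest intracluster correlation inflates the sample substantially: with villages of 20 children and ICC = 0.1, a 200-child SRS design nearly triples.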

19.
An analysis of serologic tests for treponemal disease performed in Australian aboriginal communities is used to illustrate factors influencing the predictive value positive of serologic tests. Simple extrapolation of predictive value estimates, from prevalence, sensitivity, and specificity data, is complicated by variation of specificity between strata of a population, and the weighting of true positive and false positive results produced by the criteria for selecting individuals for testing. The predictive value positive of a particular test is greatest when it is used for incident cases suspected of having a disease, but lower when the same test is used to screen a whole population unless longitudinal data are available to exclude individuals with past disease. Preferential testing of individuals without active disease may produce very low predictive values. Empirical estimation of the predictive value of tests provides objective guidelines for decision making and enables increased predictability by modification of testing criteria.

20.
OBJECTIVE: Our goal was to evaluate whether screening patients with diabetes for microalbuminuria (MA) is effective according to the criteria developed by Frame and Carlson and those of the US Preventive Services Task Force. STUDY DESIGN: We searched the MEDLINE database (1966-present) and bibliographies of relevant articles. OUTCOMES MEASURED: We evaluated the impact of MA screening using published criteria for periodic health screening tests. The effect of the correlation between repeated tests on the accuracy of a currently recommended testing strategy was analyzed. RESULTS: Quantitative tests have reported sensitivities from 56% to 100% and specificities from 81% to 98%. Semiquantitative tests for MA have reported sensitivities from 51% to 100% and specificities from 21% to 100%. First morning, morning, or random urine sampling appear feasible. Assuming an individual test sensitivity of 90%, a specificity of 90%, and a 10% prevalence of MA, the correlation between tests would have to be lower than 0.1 to achieve a positive predictive value for repeated testing of 75%. CONCLUSIONS: Screening for MA meets only 4 of 6 Frame and Carlson criteria for evaluating screening tests. The recommended strategies to overcome diagnostic uncertainty by using repeated testing are based on expert opinion, are difficult to follow in primary care settings, do not improve diagnostic accuracy sufficiently, and have not been tested in a controlled trial. Although not advocated by the American Diabetes Association, semiquantitative MA screening tests using random urine sampling have acceptable accuracy but may not be reliable in all settings.
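The effect of between-test correlation on repeated testing can be sketched under one common model for correlated binary results: among the diseased, P(both tests +) = Se² + ρ·Se·(1 − Se), and among the non-diseased, P(both +) = (1 − Sp)² + ρ·Sp·(1 − Sp), where ρ is the within-person correlation given true status. This is an assumed parametrization, not necessarily the authors' exact model, so it need not reproduce their 0.1 threshold, but it shows the direction of the effect:

```python
def ppv_repeat(prev, se, sp, rho):
    """Positive predictive value of requiring two positive tests, allowing
    correlation rho between the repeated results given true status."""
    both_pos_diseased = se * se + rho * se * (1 - se)
    both_pos_healthy = (1 - sp) ** 2 + rho * sp * (1 - sp)
    num = prev * both_pos_diseased
    return num / (num + (1 - prev) * both_pos_healthy)
```

At Se = Sp = 0.9 and 10% prevalence, independent repeats (ρ = 0) give PPV = 0.9; as ρ grows, the false positives repeat along with the true positives and the PPV of the repeated-testing strategy falls, which is the abstract's point.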
