Similar Articles
1.
Lot quality assurance sampling (LQAS) has a long history of applications in industrial quality control. LQAS is frequently used for rapid surveillance in global health settings, with areas classified as poor or acceptable performance on the basis of the binary classification of an indicator. Historically, LQAS surveys have relied on simple random samples from the population; however, implementing two-stage cluster designs for surveillance sampling is often more cost-effective than simple random sampling. By applying survey sampling results to the binary classification procedure, we develop a simple and flexible nonparametric procedure that incorporates clustering effects into the LQAS sample design to appropriately inflate the sample size, accommodating finite numbers of clusters in the population when relevant. We then use this framework to discuss principled selection of survey design parameters in longitudinal surveillance programs. We apply this framework to design surveys to detect rises in malnutrition prevalence in nutrition surveillance programs in Kenya and South Sudan, accounting for clustering within villages. By combining historical information with data from previous surveys, we design surveys to detect spikes in the childhood malnutrition rate. Copyright © 2014 John Wiley & Sons, Ltd.
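The sample-size inflation described above can be sketched with the textbook design-effect formula; the function name, the example numbers, and the omission of any finite-cluster correction are illustrative assumptions, not the authors' exact procedure.

```python
import math

def lqas_cluster_sample_size(n_srs, cluster_size, icc):
    """Inflate an SRS-based LQAS sample size for a two-stage cluster
    design using the classical design effect deff = 1 + (m - 1) * rho,
    where m is the per-cluster take and rho the intracluster correlation."""
    deff = 1.0 + (cluster_size - 1) * icc
    return math.ceil(n_srs * deff)

# e.g. an SRS design of 191 children, taking 10 per village, ICC = 0.05:
print(lqas_cluster_sample_size(191, 10, 0.05))  # -> 277
```

With a cluster take of 1 the design effect is 1 and the SRS size is returned unchanged.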

2.
In population-based household surveys, for example, the National Health and Nutrition Examination Survey, households are often sampled by stratified multistage cluster sampling, and multiple individuals related by blood are often sampled within households. Therefore, genetic data collected from these population-based household surveys, called National Genetic Household Surveys, can be correlated because of two levels of correlation. One level of correlation is caused by the multistage geographical cluster sampling and the other is caused by biological inheritance among participants within the same sampled family. In this paper, we develop an efficient Hardy-Weinberg equilibrium (HWE) test utilizing pairwise composite likelihood methods that incorporate the sample weighting effect induced by the differential selection probabilities in complex sample designs, as well as the two-level clustering (correlation) effects described above. Monte Carlo simulation studies show that the proposed HWE test maintains the nominal levels, and is more powerful than existing methods (Li et al. 2011) under various (non)informative sample designs that depend on genotypes (explicitly or implicitly), family relationships or both, especially when within-household sampling depends on the genotypes. The developed tests are further evaluated using simulated genetic data based on the Hispanic Health and Nutrition Survey. Copyright © 2016 John Wiley & Sons, Ltd.
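For reference, the unweighted simple-random-sample test that such methods extend is the classical one-degree-of-freedom chi-square for a biallelic locus; this sketch is that baseline only, not the paper's weighted composite-likelihood test.

```python
import math

def hwe_chisq(n_aa, n_ab, n_bb):
    """Classical 1-df Hardy-Weinberg chi-square test for a biallelic
    locus, valid only for a simple random sample of unrelated people."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # allele A frequency
    exp = (n * p * p, 2 * n * p * (1 - p), n * (1 - p) ** 2)
    obs = (n_aa, n_ab, n_bb)
    x2 = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
    p_value = math.erfc(math.sqrt(x2 / 2))   # chi-square(1) upper tail
    return x2, p_value

# Genotype counts exactly at HWE give a statistic of 0:
print(hwe_chisq(25, 50, 25))  # -> (0.0, 1.0)
```

The chi-square(1) tail is computed via `erfc` since a 1-df chi-square variable is a squared standard normal.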

3.
Background: The Australian population that relies on mobile phones exclusively has increased from 5% in 2005 to 29% in 2014. Failing to include this mobile-only population leads to a potential bias in estimates from landline-based telephone surveys. This paper considers the impacts on selected health prevalence estimates with and without the mobile-only population. Methods: Using data from the Australian Health Survey, which for the first time included a question on telephone status, we examined demographic, geographic and health differences between the landline-accessible and mobile-only populations. These groups were also compared to the full population, controlling for the sampling design and differential non-response patterns in the observed sample through weighting and benchmarking. Results: The landline-accessible population differs from the mobile-only population for selected health measures, resulting in biased prevalence estimates for smoking, alcohol risk and private health insurance coverage in the full population. The differences remain even after adjusting for age and gender. Conclusions: Using landline telephones only for conducting population health surveys will have an impact on prevalence rate estimates of health risk factors due to the differing profiles of the mobile-only and landline-accessible populations.

4.
Analysis of population-based case–control studies with complex sampling designs is challenging because the sample selection probabilities (and, therefore, the sample weights) depend on the response variable and covariates. Commonly, the design-consistent (weighted) estimators of the parameters of the population regression model are obtained by solving (sample) weighted estimating equations. Weighted estimators, however, are known to be inefficient when the weights are highly variable as is typical for case–control designs. In this paper, we propose two alternative estimators that have higher efficiency and smaller finite sample bias compared with the weighted estimator. Both methods incorporate the information included in the sample weights by modeling the sample expectation of the weights conditional on design variables. We discuss benefits and limitations of each of the two proposed estimators emphasizing efficiency and robustness. We compare the finite sample properties of the two new estimators and traditionally used weighted estimators with the use of simulated data under various sampling scenarios. We apply the methods to the U.S. Kidney Cancer Case-Control Study to identify risk factors. Published 2012. This article is a US Government work and is in the public domain in the USA.

5.
Application of simple random sampling designs in community population surveys
Objective: To explore the feasibility of simple random sampling designs in community population surveys and the quality of the resulting samples. Methods: In the Xiacheng and Gongshu districts of Hangzhou, households were drawn by simple random sampling from electronic registries of community residents, and one individual aged 18-64 was selected within each household using the Kish method; each district was required to complete 500 interviews. Results: 950 households were sampled in Xiacheng, of which 511 (53.8%) completed the survey; 1,380 households were sampled in Gongshu, of which 506 (36.7%) completed it. Non-response caused by households with no age-eligible members, relocation of the original household, community-wide demolition, registry errors, and similar frame problems was 38.3% and 43.5% in the two districts, respectively; household (or selected-individual) non-response or refusal from all other causes was 8.0% and 19.9%, respectively. The age and sex composition of the completed sample did not differ from that of the randomly sampled households. The sex composition of the randomly sampled households did not differ from that of the urban Hangzhou population, although their age structure was older. Conclusions: In geographically compact communities, simple random sampling based on electronic resident registries is feasible; with appropriate scheduling of interviewers' household visits, the sample can remain representative of the sampling frame.
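The within-household step described in the Methods can be sketched as follows; drawing uniformly at random is a simplification of the original Kish selection tables, and the field names are illustrative, not from the study.

```python
import random

def select_one_adult(household, rng=random):
    """Select one age-eligible (18-64) member per household, as in
    Kish-type within-household sampling. The induced design weight is
    the number of eligible members, the inverse of the 1/k selection
    probability."""
    eligible = [m for m in household if 18 <= m["age"] <= 64]
    if not eligible:
        return None, 0          # frame-ineligible household
    person = rng.choice(eligible)
    return person, len(eligible)

household = [{"name": "A", "age": 45}, {"name": "B", "age": 17},
             {"name": "C", "age": 70}]
person, w = select_one_adult(household)
print(person["name"], w)  # -> A 1  (only one eligible member)
```

The weight is what makes person-level estimates unbiased despite sampling exactly one adult per household.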

6.
The receiver operating characteristic (ROC) curve can be utilized to evaluate the performance of diagnostic tests. The area under the ROC curve (AUC) is a widely used summary index for comparing multiple ROC curves. Both parametric and nonparametric methods have been developed to estimate and compare the AUCs. However, these methods are usually only applicable to data collected from simple random samples and not surveys and epidemiologic studies that use complex sample designs such as stratified and/or multistage cluster sampling with sample weighting. Such complex samples can inflate variances from intra-cluster correlation and alter the expectations of test statistics because of the use of sample weights that account for differential sampling rates. In this paper, we modify the nonparametric method to incorporate sampling weights to estimate the AUC and employ leave-one-out jackknife methods along with the balanced repeated replication method to account for the effects of the complex sampling in the variance estimation of our proposed estimators of the AUC. The finite sample properties of our methods are evaluated using simulations, and our methods are illustrated by comparing the estimated AUC for predicting overweight/obesity using different measures of body weight and adiposity among sampled children and adults in the US Hispanic Health and Nutrition Examination Survey. Copyright © 2014 John Wiley & Sons, Ltd.
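The weighted nonparametric AUC point estimate (before the jackknife/BRR variance step) reduces to a weighted Mann-Whitney statistic; this O(n_pos x n_neg) sketch is a generic implementation, not the authors' code.

```python
def weighted_auc(scores, labels, weights):
    """Sample-weighted Mann-Whitney estimate of the AUC: the weighted
    probability that a randomly chosen case scores above a randomly
    chosen non-case, counting ties as one half."""
    pos = [(s, w) for s, y, w in zip(scores, labels, weights) if y == 1]
    neg = [(s, w) for s, y, w in zip(scores, labels, weights) if y == 0]
    num = den = 0.0
    for sp, wp in pos:
        for sn, wn in neg:
            den += wp * wn
            if sp > sn:
                num += wp * wn
            elif sp == sn:
                num += 0.5 * wp * wn
    return num / den

# With equal weights this reduces to the usual empirical AUC:
print(weighted_auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0], [1, 1, 1, 1]))  # -> 1.0
```

Unequal weights simply rescale each case/non-case pair's contribution, which is how differential sampling rates enter the estimate.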

7.
The cost efficiency of estimation of sensitivity, specificity and positive predictive value from two-stage sampling designs is considered, assuming a relatively cheap test classifies first-stage subjects into several categories and an expensive gold standard is applied at stage two. Simple variance formulae are derived and used to find optimal designs for a given cost ratio. The utility of two-stage designs is measured by the reduction in variances compared with one-stage simple random designs. Separate second-stage design is also compared with proportional allocation (PA). The maximum percentage reductions in variance from two-stage designs for sensitivity, specificity and positive predictive value estimation are P per cent, (1-P) per cent and W per cent, respectively, where P is the population prevalence of disease and W the population percentage of test negatives. The optimum allocation of stage-two resources is not obvious: the optimum proportion of true cases at stage two may even be less than under PA. PA is near optimal for sensitivity estimation in most cases when prevalence is low, but inefficient compared with the optimal scheme for specificity.
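A minimal version of the two-stage point estimator (shown here for sensitivity) weights each second-stage gold-standard result by the inverse of its category's verification fraction; the data layout and field names are assumptions for illustration.

```python
def two_phase_sensitivity(strata):
    """Sensitivity from a two-phase design: each stratum is a first-stage
    test category with count N, a verified stage-two subsample of size m,
    and d gold-standard positives found among the verified.
    P(D+, category h) is estimated by (N_h / N_total) * (d_h / m_h);
    sensitivity is the test-positive share of total estimated disease."""
    total = sum(s["N"] for s in strata)
    disease = [s["N"] / total * s["d"] / s["m"] for s in strata]
    num = sum(p for s, p in zip(strata, disease) if s["test_pos"])
    return num / sum(disease)

strata = [
    {"N": 100, "m": 50, "d": 40, "test_pos": True},   # test-positive category
    {"N": 900, "m": 50, "d": 5,  "test_pos": False},  # test-negative category
]
print(round(two_phase_sensitivity(strata), 4))  # -> 0.4706
```

The same ratio-of-weighted-counts construction gives specificity and positive predictive value by changing which categories enter the numerator.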

8.
Recently, several study designs incorporating treatment effect assessment in biomarker-based subpopulations have been proposed. Most statistical methodologies for such designs focus on the control of type I error rate and power. In this paper, we have developed point estimators for clinical trials that use the two-stage adaptive enrichment threshold design. The design consists of two stages, where in stage 1, patients are recruited in the full population. Stage 1 outcome data are then used to perform an interim analysis to decide whether the trial continues to stage 2 with the full population or a subpopulation. The subpopulation is defined based on one of the candidate threshold values of a numerical predictive biomarker. To estimate treatment effect in the selected subpopulation, we have derived unbiased estimators, shrinkage estimators, and estimators that estimate bias and subtract it from the naive estimate. We have recommended one of the unbiased estimators. However, since none of the estimators dominated in all simulation scenarios based on both bias and mean squared error, an alternative strategy would be to use a hybrid estimator where the estimator used depends on the subpopulation selected. This would require a simulation study of plausible scenarios before the trial.

9.
Population studies often seek to examine phenomena in important population subgroups or to compare results among these and other subgroups. When subgroups of interest comprise a relatively small percentage of the population and acceptable subgroup member lists are not available to serve as sampling frames, it may be prohibitively expensive even by telephone to screen through a sample of the entire population. This paper considers some statistical effects of estimation from a class of two-stratum telephone sample designs where part of the frame with a higher subgroup concentration is disproportionately sampled compared to the rest of the frame. Using proportionate sampling as a reference, the relative impact of this disproportionate design is determined for nominal and effective sample sizes, where the latter are tied to the effect of variation in sample weights that occurs in disproportionately allocated samples. Findings are illustrated using two recent telephone surveys. Whereas nominal subgroup sample sizes may be improved by disproportionate sampling, we conclude that both the survey designer and analyst should use this type of design cautiously in telephone surveys.
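The "effective sample size" tied to weight variation is conventionally Kish's formula n_eff = (sum w)^2 / sum(w^2); the example weights below are made up to show the cost of a disproportionate allocation.

```python
def effective_sample_size(weights):
    """Kish's effective sample size n_eff = (sum w)^2 / sum(w^2).
    Equals n when all weights are equal and shrinks as weight
    variation grows under disproportionate allocation."""
    sw = sum(weights)
    return sw * sw / sum(w * w for w in weights)

# Equal weights: no loss relative to the nominal n of 100.
print(effective_sample_size([1.0] * 100))                      # -> 100.0
# Oversampling one stratum (weights 1 vs 4) costs effective size:
print(round(effective_sample_size([1.0] * 50 + [4.0] * 50), 1))  # -> 73.5
```

So a disproportionate design can raise a subgroup's nominal count while lowering the effective size of full-population estimates, which is exactly the trade-off the abstract warns about.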

10.
To compare the sampling errors and design effects of different cluster sampling designs and to evaluate unequal-probability sampling for cause-of-death surveillance, samples were drawn from a frame of 107 counties (cities, districts) of Shaanxi Province under equal-probability and unequal-probability cluster sampling schemes, and the sampling error and design effect of each scheme were computed using complex-survey methods. Different schemes produced different sampling error estimates: the standard error under stratified cluster sampling was smaller than under simple random cluster sampling, and although unequal-probability (πPS) sampling was slightly less design-efficient than equal-probability simple random cluster sampling, it broadened the surveillance coverage. Conclusion: For cluster-sample survey data with a well-defined sampling frame, statistical analysis should not depart from the pre-specified sampling design and design parameters. Adopting an unequal-probability design for cause-of-death surveillance allows sample weighting to be incorporated and improves the regional representativeness of mortality estimates.
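The unequal-probability (πPS) selection referred to above is commonly implemented as systematic PPS sampling; this sketch uses made-up size measures and shows one standard implementation (units larger than the sampling interval can be hit more than once, which is inherent to the method).

```python
import random

def systematic_pps(sizes, n, rng=random):
    """Systematic probability-proportional-to-size (pi-PS) selection of
    n clusters: cumulate the size measures, choose a random start within
    the first sampling interval, then step through at a fixed interval."""
    total = sum(sizes)
    interval = total / n
    start = rng.uniform(0, interval)
    points = [start + k * interval for k in range(n)]
    selected, cum, i = [], 0.0, 0
    for idx, size in enumerate(sizes):
        cum += size
        while i < n and points[i] < cum:
            selected.append(idx)   # point falls inside this unit's span
            i += 1
    return selected

sizes = [120, 430, 60, 210, 500, 90, 310, 280]  # hypothetical county sizes
print(systematic_pps(sizes, 3))
```

Each unit's inclusion probability is n times its share of the total size measure, which is what the design weights must then invert.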

11.
In this paper, we consider the design for comparing the performance of two binary classification rules, for example, two record linkage algorithms or two screening tests. Statistical methods are well developed for comparing these accuracy measures when the gold standard is available for every unit in the sample, or in a two-phase study when the gold standard is ascertained only in the second phase in a subsample using a fixed sampling scheme. However, these methods do not attempt to optimize the sampling scheme to minimize the variance of the estimators of interest. In comparing the performance of two classification rules, the parameters of primary interest are the difference in sensitivities, specificities, and positive predictive values. We derived the analytic variance formulas for these parameter estimates and used them to obtain the optimal sampling design. The efficiency of the optimal sampling design is evaluated through an empirical investigation that compares the optimal sampling with simple random sampling and with proportional allocation. Results of the empirical study show that the optimal sampling design is similar for estimating the difference in sensitivities and in specificities, and both achieve a substantial amount of variance reduction with an over-sample of subjects with discordant results and an under-sample of subjects with concordant results. A heuristic rule is recommended when there is no prior knowledge of individual sensitivities and specificities, or the prevalence of the true positive findings in the study population. The optimal sampling is applied to a real-world example in record linkage to evaluate the difference in classification accuracy of two matching algorithms. Copyright © 2013 John Wiley & Sons, Ltd.

12.
Many observational studies adopt what we call retrospective convenience sampling (RCS). With the sample size in each arm prespecified, RCS randomly selects subjects from the treatment-inclined subpopulation into the treatment arm and those from the control-inclined into the control arm. Samples in each arm are representative of the respective subpopulation, but the proportion of the 2 subpopulations is usually not preserved in the sample data. We show in this work that, under RCS, existing causal effect estimators actually estimate the treatment effect over the sample population instead of the underlying study population. We investigate how to correct existing methods for consistent estimation of the treatment effect over the underlying population. Although RCS is adopted in medical studies for ethical and cost-effective purposes, it also offers a notable advantage for statistical inference: when the tendency to receive treatment is low in a study population, treatment effect estimators under RCS, with proper correction, are more efficient than their counterparts under random sampling. These properties are investigated both theoretically and through numerical demonstration.

13.
In this tutorial, we describe regression-based methods for analysing multiple source data arising from complex sample survey designs. We use the term 'multiple-source' data to encompass all cases where data are simultaneously obtained from multiple informants, or raters (e.g. self-reports, family members, health care providers, administrators) or via different/parallel instruments, indicators or methods (e.g. symptom rating scales, standardized diagnostic interviews, or clinical diagnoses). We review regression models for analysing multiple source risk factors or multiple source outcomes and show that they can be considered special cases of generalized linear models, albeit with correlated outcomes. We show how these methods can be extended to handle the common survey features of stratification, clustering, and sampling weights. We describe how to fit regression models with multiple source reports derived from complex sample surveys using general purpose statistical software. Finally, the methods are illustrated using data from two studies: the Stirling County Study and the Eastern Connecticut Child Survey.

14.
Most studies that follow subjects over time are challenged by having some subjects who drop out. Double sampling is a design that selects and devotes resources to intensively pursue and find a subset of these dropouts, then uses data obtained from these to adjust naïve estimates, which are potentially biased by the dropout. Existing methods to estimate survival from double sampling assume a random sample. In limited-resource settings, however, generating accurate estimates using a minimum of resources is important. We propose using double-sampling designs that oversample certain profiles of dropouts as more efficient alternatives to random designs. First, we develop a framework to estimate the survival function under these profile double-sampling designs. We then derive the precision of these designs as a function of the rule for selecting different profiles, in order to identify more efficient designs. We illustrate using data from the United States President's Emergency Plan for AIDS Relief-funded HIV care and treatment program in western Kenya. Our results show why and how more efficient designs should oversample patients with shorter dropout times. Further, our work suggests generalizable practice for more efficient double-sampling designs, which can help maximize efficiency in resource-limited settings. Copyright © 2014 John Wiley & Sons, Ltd.

15.
In order to better inform study design decisions when sampling patients within and across health care providers, we develop a simulation-based approach for designing complex multi-stage samples. The approach explores the tradeoff between competing design goals such as precision of estimates, coverage of the target population and cost. We elicit a number of sensible candidate designs, evaluate these designs with respect to multiple sampling goals, investigate their tradeoffs, and identify the design that is the best compromise among all goals. This approach recognizes that, in the practice of sampling, precision of the estimates is not the only important goal, and that there are tradeoffs with coverage and cost that should be explicitly considered. One can easily add other goals. We construct a sample frame with all phase III clinical cancer treatment trials that were conducted by cooperative oncology groups of the National Cancer Institute from October 1, 1998 through December 31, 1999. Simulation results for our study suggest sampling a different number of trials and institutions than initially considered. Simulations of different study designs can uncover efficiency gains both in terms of improved precision of the estimates and in terms of improved coverage of the target population. Simulations enable us to explore the tradeoffs between competing sampling goals and to quantify these efficiency gains. This is true even for complex designs where the stages are not strictly nested in one another.

16.
In observational and experimental studies in the health sciences involving human populations, it is sometimes considered desirable to recruit subjects according to designs that specify a predetermined number of subjects in each of several mutually exclusive classes (generally but not necessarily demographic in nature). This type of adaptive sampling design, now generally referred to as multiple inverse sampling (MIS), has received recent attention, and estimation methods are now available for several sequential MIS designs. In this class of designs, subjects are sampled randomly and sequentially, usually one at a time, until all classes have the pre-specified number of subjects. In this paper, we extend MIS for finite population sampling to estimation of the parameters in multiple logistic regression under MIS. Using estimated logistic regression parameters and cost components obtained from the Isfahan Healthy Heart Program (IHHP), we report findings from a simulation experiment in which it appears that, at fixed cost, MIS at the last stage of sampling compares favourably to simple random sampling. The IHHP is a large community intervention study for prevention of cardiovascular disease being conducted in Isfahan, Iran and two other cities in Iran. The IHHP identified subjects through a multistage sample survey in which MIS was used at the final stage of sampling. MIS is one of several methods of adaptive sampling that are generating considerable interest and show promise of being useful in a wide variety of applications.
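A minimal sketch of the sequential MIS rule: draw one subject at a time without replacement until every class has met its quota. The exact stopping variant and the data layout here are assumptions for illustration.

```python
import random

def multiple_inverse_sample(population, quotas, rng=random):
    """Draw subjects sequentially, one at a time without replacement,
    until every class listed in `quotas` reaches its prespecified count
    (multiple inverse sampling); the whole accumulated draw is the sample."""
    pool = list(population)
    rng.shuffle(pool)
    counts = {c: 0 for c in quotas}
    sample = []
    for subject in pool:
        sample.append(subject)
        c = subject["class"]
        if c in counts:
            counts[c] += 1
        if all(counts[k] >= quotas[k] for k in quotas):
            break              # stop as soon as the last quota is met
    return sample

random.seed(3)
pop = [{"id": i, "class": "F" if i % 5 == 0 else "M"} for i in range(200)]
s = multiple_inverse_sample(pop, {"F": 10, "M": 10})
```

Because the total draw needed is random, the achieved sample size (here `len(s)`) is itself a random variable, which is what makes estimation under MIS non-standard.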

17.
This paper studies quantile regression analysis with maxima or minima nomination sampling designs. These designs are often used to obtain more representative samples from the tails of the underlying distribution using easy-to-access rank information during the sampling process. We propose new loss functions to incorporate the rank information of nominated samples in the estimation process. Also, we provide an alternative approach that translates estimation problems with nominated samples to corresponding problems under simple random sampling (SRS). Strategies are given to choose proper nomination sampling designs for a given population quantile. Numerical studies show that quantile regression models with maxima (or minima) nominated samples have higher relative efficiencies compared with their counterparts under SRS for analyzing the upper (or lower) tail quantiles of the distribution of the response variable. Results are then implemented on a large cohort study in the Canadian province of Manitoba to analyze quantiles of bone mineral density using available covariates. We show that in some cases, methods based on nomination sampling designs require about one-tenth of the sample used in SRS to estimate the lower or upper tail conditional quantiles with comparable mean squared errors. This is a dramatic reduction in time and cost compared with the usual SRS approach.
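Maxima nomination sampling itself is simple to simulate: from each random set of k units, only the maximum is nominated for measurement. In practice the ranking uses cheap auxiliary information; ranking by the value itself, as below, is an idealization.

```python
import random

def maxima_nominated_sample(population, set_size, n_sets, rng=random):
    """Maxima nomination sampling: draw n_sets simple random sets of
    set_size units each and nominate only the maximum of every set,
    concentrating measurement effort in the upper tail."""
    return [max(rng.sample(population, set_size)) for _ in range(n_sets)]

random.seed(0)
pop = list(range(1000))      # hypothetical measurement values
nominated = maxima_nominated_sample(pop, 5, 50)
```

Only 50 units are measured, yet their distribution sits in the upper tail of the population, which is why tail quantile estimates gain so much efficiency.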

18.
19.
Originally, 2-stage group testing was developed for efficiently screening individuals for a disease. In response to the HIV/AIDS epidemic, 1-stage group testing was adopted for estimating prevalences of a single or multiple traits from testing groups of size q, so individuals were not tested. This paper extends the methodology of 1-stage group testing to surveys with sample weighted complex multistage-cluster designs. Sample weighted-generalized estimating equations are used to estimate the prevalences of categorical traits while accounting for the error rates inherent in the tests. Two difficulties arise when using group testing in complex samples: (1) How does one weight the results of the test on each group, as the sample weights will differ among observations in the same group? Furthermore, if the sample weights are related to positivity of the diagnostic test, then group-level weighting is needed to reduce bias in the prevalence estimation. (2) How does one form groups that will allow accurate estimation of the standard errors of prevalence estimates under multistage-cluster sampling allowing for intracluster correlation of the test results? We study 5 different grouping methods to address the weighting and cluster sampling aspects of complex designed samples. Finite sample properties of the estimators of prevalences, variances, and confidence interval coverage for these grouping methods are studied using simulations. National Health and Nutrition Examination Survey data are used to illustrate the methods.
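Under a perfect test and equal weights, the 1-stage group-testing prevalence estimate has a closed form; this baseline omits the test error rates and sample weighting that are the paper's contribution.

```python
def group_testing_prevalence(n_positive_groups, n_groups, group_size):
    """MLE of trait prevalence from 1-stage group testing with a perfect
    test: a group of size q is positive iff it contains at least one
    positive member, so p = 1 - (1 - theta)^(1/q), where theta is the
    observed group-positive rate."""
    theta = n_positive_groups / n_groups
    return 1.0 - (1.0 - theta) ** (1.0 / group_size)

# 41 positive groups out of 100 groups of size 5:
print(round(group_testing_prevalence(41, 100, 5), 3))  # -> 0.1
```

Note that no individual is ever tested: only the 100 group results are needed to recover an individual-level prevalence.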

20.
Population size and density estimates are needed to plan resource requirements and plan health related interventions. Sampling frames are not always available, necessitating surveys using non-standard household sampling methods. These surveys are time-consuming and difficult to validate, and their implementation could be optimised. Here, we discuss an example of an optimisation procedure for rapid population estimation using T-Square sampling, which has been used recently to estimate population sizes in emergencies. A two-stage process was proposed to optimise the T-Square method wherein the first stage optimises the sample size and the second stage optimises the pathway connecting the sampling points. The proposed procedure yields an optimal solution if the distribution of households is described by a spatially homogeneous Poisson process and can be sub-optimal otherwise. This research provides the first step in exploring how optimisation techniques could be applied to survey designs, thereby providing more timely and accurate information for planning interventions.
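For concreteness, one commonly quoted form of the T-square density estimator (attributed to Byth, 1982) is sketched below; the exact constant is taken from the rapid-assessment literature, not from this paper, and should be treated as an assumption.

```python
import math

def t_square_density(point_dists, neighbour_dists):
    """T-square density estimate from n sampling points, where
    point_dists[i] is the distance from random point i to the nearest
    household and neighbour_dists[i] is the T-square distance from that
    household to its nearest neighbour in the half-plane away from the
    point. Byth-type form (an assumption quoted from the
    rapid-assessment literature):
        D = n^2 / (2 * sum(x) * sqrt(2) * sum(z))."""
    n = len(point_dists)
    return n * n / (2.0 * sum(point_dists)
                    * math.sqrt(2.0) * sum(neighbour_dists))

# 10 points with x and z distances of 1 unit each:
print(round(t_square_density([1.0] * 10, [1.0] * 10), 4))  # -> 0.3536
```

The estimator is unbiased-in-spirit only under the homogeneous Poisson assumption the abstract mentions; clustered settlement patterns bias it.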

