Similar Articles
20 similar articles found (search time: 46 ms).
1.
Analysis of population-based case-control studies with complex sampling designs is challenging because the sample selection probabilities (and, therefore, the sample weights) depend on the response variable and covariates. Commonly, the design-consistent (weighted) estimators of the parameters of the population regression model are obtained by solving (sample) weighted estimating equations. Weighted estimators, however, are known to be inefficient when the weights are highly variable, as is typical for case-control designs. In this paper, we propose two alternative estimators that have higher efficiency and smaller finite sample bias compared with the weighted estimator. Both methods incorporate the information included in the sample weights by modeling the sample expectation of the weights conditional on design variables. We discuss the benefits and limitations of each of the two proposed estimators, emphasizing efficiency and robustness. We compare the finite sample properties of the two new estimators and the traditionally used weighted estimator using simulated data under various sampling scenarios. We apply the methods to the U.S. Kidney Cancer Case-Control Study to identify risk factors. Published 2012. This article is a US Government work and is in the public domain in the USA.
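For readers who want the baseline mechanics, below is a minimal numpy sketch of the design-weighted (Horvitz-Thompson type) logistic estimating equations that papers like this take as the comparator; the data and weights are illustrative, not from the study.

```python
import numpy as np

def weighted_logistic(X, y, w, n_iter=25):
    """Solve the weighted score equations sum_i w_i x_i (y_i - expit(x_i'b)) = 0
    by Newton-Raphson. X includes an intercept column; w are design weights."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (w * (y - p))                     # weighted score vector
        info = X.T @ ((w * p * (1 - p))[:, None] * X)   # weighted information
        beta += np.linalg.solve(info, score)
    return beta

# toy data with outcome-dependent weights (illustrative inverse selection
# probabilities: controls under-sampled, hence up-weighted)
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-1.0 + x))))
w = np.where(y == 1, 1.0, 8.0)
X = np.column_stack([np.ones(n), x])
print(weighted_logistic(X, y, w))
```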

2.
Estimation of marginal causal effects from case-control data has two complications: (i) confounding, because the exposure under study is not randomized, and (ii) bias from the case-control sampling scheme. In this paper, we study estimators of the marginal causal odds ratio that address these issues for matched and unmatched case-control designs, utilizing knowledge of the prevalence of being a case. The estimators are implemented in simulations where their finite sample properties are studied, and approximations of their variances are derived with the delta method. We also illustrate the methods by analyzing the effect of low birth weight on the risk of type 1 diabetes mellitus using data from the Swedish Childhood Diabetes Register, a nationwide population-based incidence register. Copyright © 2013 John Wiley & Sons, Ltd.
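A hedged sketch of one standard way to use a known case prevalence with case-control data: reweight the sample to the population and standardize an outcome model over covariates. This is schematic, not necessarily the paper's matched/unmatched estimators, and the toy data encode no real exposure effect (the point is the mechanics).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n1 = n0 = 500                                   # sampled cases, controls
n = n1 + n0
p = 0.02                                        # known population prevalence
x = rng.normal(size=n)                          # covariate (illustrative)
a = rng.binomial(1, 1 / (1 + np.exp(-0.5 * x)))  # exposure depends on x
y = np.r_[np.ones(n1), np.zeros(n0)]            # cases first, then controls
# case-control weights: cases carry p / (n1/n), controls (1-p) / (n0/n)
w = np.where(y == 1, p / (n1 / n), (1 - p) / (n0 / n))

model = LogisticRegression(C=1e6).fit(np.column_stack([a, x]), y, sample_weight=w)

def std_risk(a_val):
    """Weighted standardization: average predicted risk with exposure set to a_val."""
    Xa = np.column_stack([np.full(n, a_val), x])
    return np.average(model.predict_proba(Xa)[:, 1], weights=w)

p1, p0 = std_risk(1), std_risk(0)
print("marginal causal OR:", (p1 / (1 - p1)) / (p0 / (1 - p0)))
```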

3.
An outcome-dependent sampling (ODS) scheme is a cost-effective design in which one observes the exposure with a probability that depends on the outcome. Well-known examples are the case-control design for a binary response, the case-cohort design for failure time data, and the general ODS design for a continuous response. While substantial work has been carried out for the univariate response case, statistical inference and design for ODS with multivariate responses remain under-developed. Motivated by the need in biological studies to take advantage of the available responses for subjects in a cluster, we propose a multivariate outcome-dependent sampling (multivariate-ODS) design that is based on a general selection of the continuous responses within a cluster. The proposed inference procedure for the multivariate-ODS design is semiparametric: all the underlying distributions of covariates are modeled nonparametrically using empirical likelihood methods. We show that the proposed estimator is consistent and establish its asymptotic normality. Simulation studies show that the proposed estimator is more efficient than the estimator obtained using only the simple-random-sample portion of the multivariate-ODS or the estimator from a simple random sample with the same sample size. The multivariate-ODS design, together with the proposed estimator, provides an approach to further improve study efficiency for a given fixed study budget. We illustrate the proposed design and estimator with an analysis of the association of polychlorinated biphenyl exposure with hearing loss in children from the Collaborative Perinatal Study. Copyright © 2016 John Wiley & Sons, Ltd.

4.
In survival analyses, inverse-probability-of-treatment (IPT) and inverse-probability-of-censoring (IPC) weighted estimators of parameters in marginal structural Cox models are often used to estimate treatment effects in the presence of time-dependent confounding and censoring. In most applications, a robust variance estimator of the IPT and IPC weighted estimator is calculated, leading to conservative confidence intervals. This estimator assumes that the weights are known rather than estimated from the data. Although a consistent estimator of the asymptotic variance of the IPT and IPC weighted estimator is generally available, applications, and thus information on the performance of the consistent estimator, are lacking. Reasons might be a cumbersome implementation in statistical software, which is further complicated by missing details on the variance formula. In this paper, we therefore provide a detailed derivation of the variance of the asymptotic distribution of the IPT and IPC weighted estimator and explicitly state the terms needed to calculate a consistent estimator of this variance. We compare the performance of the robust and consistent variance estimators in an application based on routine health care data and in a simulation study. The simulation reveals no substantial differences between the two estimators in medium and large data sets with no unmeasured confounding, but the consistent variance estimator performs poorly in small samples or under unmeasured confounding if the number of confounders is large. We thus conclude that the robust estimator is more appropriate for all practical purposes.
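The robust-versus-consistent distinction is easiest to see in code for a simple weighted score. The sketch below computes the "weights-known" sandwich variance for a weighted logistic fit; the paper's setting is a marginal structural Cox model, so treat this as an analogy, and note that a consistent estimator would add correction terms for the estimated weights.

```python
import numpy as np

def fit_weighted_logistic(X, y, w, n_iter=25):
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):                    # Newton-Raphson on the weighted score
        p = 1 / (1 + np.exp(-X @ beta))
        info = X.T @ ((w * p * (1 - p))[:, None] * X)
        beta += np.linalg.solve(info, X.T @ (w * (y - p)))
    return beta

def robust_sandwich(X, y, w, beta):
    """'Robust' variance treating the weights as fixed: bread^-1 @ meat @ bread^-1,
    where the meat is the sum of outer products of per-subject weighted scores."""
    p = 1 / (1 + np.exp(-X @ beta))
    U = X * (w * (y - p))[:, None]             # per-subject score contributions
    bread_inv = np.linalg.inv(X.T @ ((w * p * (1 - p))[:, None] * X))
    return bread_inv @ (U.T @ U) @ bread_inv.T

rng = np.random.default_rng(6)
n = 1500
x = rng.normal(size=n)
y = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + x))))
w = rng.uniform(0.5, 3.0, n)                   # stand-in for estimated IPT/IPC weights
X = np.column_stack([np.ones(n), x])
b = fit_weighted_logistic(X, y, w)
print("beta:", b, " robust SEs:", np.sqrt(np.diag(robust_sandwich(X, y, w, b))))
```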

5.
A wide variety of estimators of the between-study variance are available in random-effects meta-analysis. Many, but not all, of these estimators are based on the method of moments. The DerSimonian-Laird estimator is widely used in applications, but the Paule-Mandel estimator is an alternative that is now recommended. Recently, DerSimonian and Kacker have developed two-step moment-based estimators of the between-study variance. We extend these two-step estimators so that multiple (more than two) steps are used. We establish the surprising result that the multistep estimator tends towards the Paule-Mandel estimator as the number of steps becomes large. Hence, the iterative scheme underlying our new multistep estimator provides a hitherto unknown relationship between two-step estimators and the Paule-Mandel estimator. Our analysis suggests that two-step estimators are not necessarily distinct estimators in their own right; instead, they are quantities closely related to the usual iterative scheme used to calculate the Paule-Mandel estimate. The relationship we establish between the multistep and Paule-Mandel estimators is another justification for the use of the latter. Two-step and multistep estimators are perhaps best conceptualized as approximate Paule-Mandel estimators.
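The abstract's central claim is easy to reproduce numerically. In the sketch below, mm_step is the generalized method-of-moments update (one step from zero reproduces DerSimonian-Laird, a second step gives the DerSimonian-Kacker two-step estimator), and iterating approaches the Paule-Mandel root of Q_gen(tau^2) = k - 1; all numbers are illustrative.

```python
import numpy as np

def mm_step(y, v, tau2):
    """One method-of-moments update with weights a_i = 1/(v_i + tau2)."""
    a = 1.0 / (v + tau2)
    mu = np.sum(a * y) / np.sum(a)
    q = np.sum(a * (y - mu) ** 2)
    num = q - (np.sum(a * v) - np.sum(a ** 2 * v) / np.sum(a))
    den = np.sum(a) - np.sum(a ** 2) / np.sum(a)
    return max(0.0, num / den)

def multistep(y, v, steps):
    tau2 = 0.0                      # step 1 (a_i = 1/v_i) is DerSimonian-Laird
    for _ in range(steps):
        tau2 = mm_step(y, v, tau2)
    return tau2

def paule_mandel(y, v, hi=1e3, iters=200):
    """Paule-Mandel: solve sum a_i (y_i - mu_a)^2 = k - 1 by bisection."""
    def qgen(tau2):
        a = 1.0 / (v + tau2)
        mu = np.sum(a * y) / np.sum(a)
        return np.sum(a * (y - mu) ** 2)
    k = len(y)
    if qgen(0.0) <= k - 1:
        return 0.0
    lo = 0.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if qgen(mid) > k - 1 else (lo, mid)
    return (lo + hi) / 2

y = np.array([0.10, 0.30, 0.35, 0.65, 0.45, 0.15])     # illustrative effects
v = np.array([0.030, 0.010, 0.050, 0.010, 0.020, 0.040])
print("DL (1 step):  ", multistep(y, v, 1))
print("DK (2 steps): ", multistep(y, v, 2))
print("20 steps:     ", multistep(y, v, 20))
print("Paule-Mandel: ", paule_mandel(y, v))
```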

6.
The case-control study is a common design for assessing the association between genetic exposures and a disease phenotype. Though association with a given (case-control) phenotype is always of primary interest, there is often considerable interest in assessing relationships between genetic exposures and other (secondary) phenotypes. However, the case-control sample represents a biased sample from the general population. As a result, if this sampling framework is not correctly taken into account, analyses estimating the effect of exposures on secondary phenotypes can be biased, leading to incorrect inference. In this paper, we address this problem and propose a general approach for estimating and testing the population effect of a genetic variant on a secondary phenotype. Our approach is based on inverse probability weighted estimating equations, where the weights depend on genotype and the secondary phenotype. We show that, though slightly less efficient than a full likelihood-based analysis when the likelihood is correctly specified, it is substantially more robust to model misspecification and can outperform likelihood-based analysis, in terms of both validity and power, when the model is misspecified. We illustrate our approach with an application to a case-control study extracted from the Framingham Heart Study. Copyright © 2016 John Wiley & Sons, Ltd.
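A stripped-down illustration of the weighting idea: each sampled subject is weighted by the inverse of its selection probability, so the biased case-control sample is reweighted back to the population. Here the weights depend on case status only and are taken as known; the paper's weights also involve genotype and the secondary phenotype.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 200_000                                    # illustrative source population
g = rng.binomial(2, 0.3, N)                    # genotype, additive coding 0/1/2
t = 0.2 * g + rng.normal(size=N)               # secondary phenotype, true slope 0.2
d = rng.binomial(1, 1 / (1 + np.exp(4 - 0.5 * g - 0.5 * t)))   # disease status

pi = np.where(d == 1, 0.90, 0.005)             # selection: most cases, few controls
s = rng.binomial(1, pi).astype(bool)
w = 1.0 / pi[s]                                # inverse selection probabilities

# weighted estimating equation  sum_i w_i x_i (t_i - x_i' beta) = 0
X = np.column_stack([np.ones(s.sum()), g[s]])
beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * t[s]))
print("naive slope:", np.polyfit(g[s], t[s], 1)[0], " IPW slope:", beta[1])
```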

7.
Biomarkers are often measured over time in epidemiological studies and clinical trials for better understanding of the mechanism of diseases. In large cohort studies, case-cohort sampling provides a cost-effective method to collect expensive biomarker data for revealing the relationship between biomarker trajectories and time to event. However, biomarker measurements are often limited by the sensitivity and precision of a given assay, resulting in data that are censored at detection limits and prone to measurement errors. Additionally, the occurrence of an event of interest may preclude biomarkers from being further evaluated. Inappropriate handling of these types of data can lead to biased conclusions. Under a classical case-cohort design, we propose a modified likelihood-based approach to accommodate these special features of longitudinal biomarker measurements in accelerated failure time models. The maximum likelihood estimators based on the full likelihood function are obtained by the Gaussian quadrature method. We evaluate the performance of our case-cohort estimator and compare its relative efficiency to the full cohort estimator through simulation studies. The proposed method is further illustrated using data from a biomarker study of sepsis among patients with community-acquired pneumonia. Copyright © 2015 John Wiley & Sons, Ltd.

8.
Multiple papers have studied the use of gene-environment (GE) independence to enhance power for testing gene-environment interaction in case-control studies. However, studies that evaluate the role of GE independence in a meta-analysis framework are limited. In this paper, we extend the single-study empirical Bayes type shrinkage estimators proposed by Mukherjee and Chatterjee (2008) to a meta-analysis setting, adjusting for uncertainty regarding the assumption of GE independence across studies. We use the retrospective likelihood framework to derive an adaptive combination of estimators obtained under the constrained model (assuming GE independence) and the unconstrained model (without assumptions of GE independence), with weights determined by measures of GE association derived from multiple studies. Our simulation studies indicate that this newly proposed estimator has better average performance across different simulation scenarios than the standard alternative of using inverse variance (covariance) weighted estimators that combine study-specific constrained, unconstrained, or empirical Bayes estimators. The results are illustrated by meta-analyzing six studies of type 2 diabetes investigating interactions between genetic markers on the obesity-related FTO gene and the environmental factors body mass index and age.
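A heavily hedged sketch of the single-study shrinkage idea this paper extends: shrink the unconstrained interaction estimate toward the GE-independence-constrained one, with a data-adaptive weight that grows as the data contradict independence. The exact Mukherjee-Chatterjee weight and the meta-analytic combination involve more structure, so treat the weight below as schematic.

```python
import numpy as np

def eb_shrink(psi_u, var_u, psi_c):
    """Compromise between the unconstrained estimate (psi_u, variance var_u)
    and the constrained estimate psi_c; the shrinkage factor approaches 0
    when GE independence looks plausible and 1 when it is clearly violated."""
    d2 = (psi_u - psi_c) ** 2
    return psi_c + d2 / (d2 + var_u) * (psi_u - psi_c)

# illustrative per-study estimates (not from the diabetes application)
for psi_u, var_u, psi_c in [(0.30, 0.04, 0.25), (0.10, 0.02, 0.22)]:
    print(eb_shrink(psi_u, var_u, psi_c))
```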

9.
Case-control sampling is frequently used in genetic association studies to examine the relationship between disease and genetic exposures. Such designs usually collect extensive information on phenotypes beyond the primary disease, whose associations with the genetic exposures are also of great interest. Because the cases are over-sampled, appropriate analysis of secondary phenotypes should take into account this biased sampling design. We previously introduced a weighting-based estimator for appropriate secondary analysis, but have not thoroughly explored its statistical properties. In this article, we revisit our previous estimator to offer new insights and methodological extensions. Specifically, we extend our previous estimator and construct its more general form based on generalized least squares (GLS). Such an extension allows us to connect the GLS estimator with the generalized method of moments and motivates a new specification test designed to assess the adequacy of the disease model or the weights. The specification test statistic measures the weighted discrepancy between the case and control subsample estimators, and asymptotically follows a central Chi-squared distribution under correct disease model specification. We illustrate the GLS estimator and specification test using a case-control sample of peripheral arterial disease, and use simulations to further shed light on the operating characteristics of the specification test.
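The discrepancy-based specification test has a familiar generic shape: a quadratic form in the difference of two asymptotically normal estimators of the same parameter, referred to a chi-squared distribution. Below is a sketch of that generic (Hausman/Wald-type) form under the assumption that the case and control subsample estimators are independent; the paper's weighting is more specific than this.

```python
import numpy as np
from scipy.stats import chi2

def discrepancy_test(b1, V1, b2, V2):
    """Chi-squared test from the discrepancy of two estimators of one parameter;
    under independence, Var(b1 - b2) = V1 + V2, and the statistic has df = dim."""
    d = b1 - b2
    stat = d @ np.linalg.solve(V1 + V2, d)
    return stat, chi2.sf(stat, len(d))

b_case = np.array([0.50, -0.20]); V_case = np.diag([0.010, 0.020])   # illustrative
b_ctrl = np.array([0.42, -0.15]); V_ctrl = np.diag([0.015, 0.025])
print(discrepancy_test(b_case, V_case, b_ctrl, V_ctrl))
```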

10.
An outcome-dependent sampling (ODS) scheme is a cost-effective way to conduct a study. For a study with a continuous primary outcome, an ODS scheme can be implemented where the expensive exposure is measured only on a simple random sample plus supplemental samples selected from the two tails of the primary outcome variable. Given the tremendous cost invested in collecting the primary exposure information, investigators often would like to use the available data to study the relationship between a secondary outcome and the obtained exposure variable. This is referred to as secondary analysis. Secondary analysis in ODS designs can be tricky, as the ODS sample is not a random sample from the general population. In this article, we use inverse probability weighted and augmented inverse probability weighted estimating equations to analyze the secondary outcome for data obtained from the ODS design. We do not make any parametric assumptions on the primary and secondary outcomes and only specify the form of the regression mean models, thus allowing an arbitrary error distribution. Our approach is robust to second- and higher-order moment misspecification. It also leads to more precise estimates of the parameters by effectively using all the available participants. Through simulation studies, we show that the proposed estimator is consistent and asymptotically normal. Data from the Collaborative Perinatal Project are analyzed to illustrate our method.

11.
In this paper, we compare the robustness properties of a matching estimator with those of a doubly robust estimator. We describe the robustness properties of matching and subclassification estimators by showing how misspecification of the propensity score model can still result in consistent estimation of an average causal effect. The propensity score is a covariate score, a class of functions that remove bias due to all observed covariates. When matching on a parametric model (e.g., a propensity or a prognostic score), the matching estimator is robust to model misspecification if the misspecified model belongs to the class of covariate scores. The implication is that there are multiple possibilities for the matching estimator, in contrast to the doubly robust estimator, with which the researcher has two chances to make reliable inference. In simulations, we compare the finite sample properties of the matching estimator with those of a simple inverse probability weighting estimator and a doubly robust estimator. For the misspecifications in our study, the mean square error of the matching estimator is smaller than that of both the simple inverse probability weighting estimator and the doubly robust estimator.
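For contrast with the matching estimator, here is the textbook doubly robust (AIPW) estimator of the average treatment effect referenced above, in a small runnable sketch with simulated confounding; the sklearn model choices are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

def aipw_ate(X, z, y):
    """Doubly robust AIPW estimate of the average treatment effect: consistent
    if either the propensity model or the two outcome models are correct."""
    e = LogisticRegression(C=1e6).fit(X, z).predict_proba(X)[:, 1]
    m1 = LinearRegression().fit(X[z == 1], y[z == 1]).predict(X)
    m0 = LinearRegression().fit(X[z == 0], y[z == 0]).predict(X)
    return np.mean(z * (y - m1) / e + m1) - np.mean((1 - z) * (y - m0) / (1 - e) + m0)

rng = np.random.default_rng(3)
n = 5000
X = rng.normal(size=(n, 2))
z = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))            # confounded treatment
y = z + X @ np.array([0.5, -0.5]) + rng.normal(size=n)     # true ATE = 1
print(aipw_ate(X, z, y))
```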

12.
Many longitudinal databases record the occurrence of recurrent events over time. In this article, we propose a new method to estimate the average causal effect of a binary treatment for recurrent event data in the presence of confounders. We propose a doubly robust semiparametric estimator based on a weighted version of the Nelson-Aalen estimator and a conditional regression estimator under an assumed semiparametric multiplicative rate model for recurrent event data. We show that the proposed doubly robust estimator is consistent and asymptotically normal. In addition, a model diagnostic plot of residuals is presented to assess the adequacy of our proposed semiparametric model. We then evaluate the finite sample behavior of the proposed estimators under a number of simulation scenarios. Finally, we illustrate the proposed methodology via a database of circus artist injuries.
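The weighted Nelson-Aalen ingredient is simple to write down. The sketch below treats each row as one subject-time (a real recurrent-event analysis would sum over multiple events per subject) and takes the inverse-probability weights as given, so it illustrates only the weighting component of the doubly robust estimator.

```python
import numpy as np

def weighted_nelson_aalen(times, events, w):
    """Weighted Nelson-Aalen estimate of the cumulative rate: at each event
    time t, add (summed weights of events at t) / (summed weights at risk at t)."""
    grid = np.unique(times[events == 1])
    jumps = [w[(times == t) & (events == 1)].sum() / w[times >= t].sum()
             for t in grid]
    return grid, np.cumsum(jumps)

rng = np.random.default_rng(4)
times = rng.exponential(1.0, 300)              # illustrative event/censoring times
events = rng.binomial(1, 0.8, 300)             # 1 = event observed
w = rng.uniform(0.5, 2.0, 300)                 # stand-in inverse-probability weights
grid, H = weighted_nelson_aalen(times, events, w)
print(grid[:3], H[:3])
```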

13.
The weighted average treatment effect is a causal measure for the comparison of interventions in a specific target population, which may differ from the population from which the data are sampled. For instance, when the goal is to introduce a new treatment to a target population, the question is what efficacy (or effectiveness) can be gained by switching patients from a standard of care (control) to this new treatment, a question addressed by the average treatment effect for the control estimand. In this paper, we propose two estimators based on augmented inverse probability weighting to estimate the weighted average treatment effect for a well-defined target population (i.e., there exists a predefined target function of covariates that characterizes the population of interest, for example, a function of age to focus on elderly diabetic patients using samples from the US population). The first proposed estimator is doubly robust if the target function is known or can be correctly specified. The second proposed estimator is doubly robust if the target function has a linear dependence on the propensity score, which can be used to estimate the average treatment effect for the treated and the average treatment effect for the control. We demonstrate the properties of the proposed estimators through theoretical proofs and simulation studies. We also apply our proposed methods in a comparison of glucagon-like peptide-1 receptor agonist therapy and insulin therapy among patients with type 2 diabetes, using UK Clinical Practice Research Datalink data.
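A hedged sketch of the general weighted-ATE form: an AIPW-style influence term averaged under a target function h(x) of the fitted propensity score. This illustrates the estimand family (ATE/ATT/ATC); the paper's two proposed estimators differ in how the target function is specified and handled, so read this as the generic form rather than their method.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

def weighted_ate(X, z, y, tilt):
    """AIPW-style estimate of E[h(X)(Y1 - Y0)] / E[h(X)], where tilt maps the
    fitted propensity e(x) to h(x): h = 1 gives the ATE, h = e the effect in
    the treated (ATT), h = 1 - e the effect in the control (ATC)."""
    e = LogisticRegression(C=1e6).fit(X, z).predict_proba(X)[:, 1]
    m1 = LinearRegression().fit(X[z == 1], y[z == 1]).predict(X)
    m0 = LinearRegression().fit(X[z == 0], y[z == 0]).predict(X)
    h = tilt(e)
    infl = m1 - m0 + z * (y - m1) / e - (1 - z) * (y - m0) / (1 - e)
    return np.sum(h * infl) / np.sum(h)

rng = np.random.default_rng(7)
n = 5000
X = rng.normal(size=(n, 2))
z = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
y = z + X @ np.array([0.5, -0.5]) + rng.normal(size=n)
print("ATC:", weighted_ate(X, z, y, tilt=lambda e: 1 - e))
```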

14.
In randomized trials, pair-matching is an intuitive design strategy to protect study validity and potentially increase study power. In a common design, candidate units are identified, and their baseline characteristics are used to create the best n/2 matched pairs. Within the resulting pairs, the intervention is randomized, and the outcomes are measured at the end of follow-up. We consider this design to be adaptive, because the construction of the matched pairs depends on the baseline covariates of all candidate units. As a consequence, the observed data cannot be considered as n/2 independent, identically distributed pairs of units, as common practice assumes. Instead, the observed data consist of n dependent units. This paper explores the consequences of adaptive pair-matching in randomized trials for estimation of the average treatment effect, conditional on the baseline covariates of the n study units. By avoiding estimation of the covariate distribution, estimators of this conditional effect will often be more precise than estimators of the marginal effect. We contrast the unadjusted estimator with targeted minimum loss-based estimation and show substantial efficiency gains from matching and further gains with adjustment. This work is motivated by the Sustainable East Africa Research in Community Health study, an ongoing community randomized trial to evaluate the impact of immediate and streamlined antiretroviral therapy on HIV incidence in rural East Africa. Copyright © 2014 John Wiley & Sons, Ltd.

15.
Population prevalence rates of dementia under stratified sampling have previously been estimated using two methods: standard weighted estimates and a logistic model-based approach. An earlier study described this application of the model-based approach and reported a small computer simulation comparing the performance of this estimator to the standard weighted estimator. In this article, we use large-scale computer simulations, based on data from the recently completed Kame survey of prevalent dementia among Japanese-American residents of King County, Washington, to describe the performance of these estimators. We found that the standard weighted estimator was unbiased. This estimator performed well for a sample design with proportional allocation, but performed poorly for a sample design that included large strata that were lightly sampled. The logistic model-based estimator performed consistently well for all sample designs considered, in terms of the extent of variability in estimation, although some modest bias was observed.
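The standard weighted estimator in question is just a stratum-size-weighted average of stratum prevalences. A tiny sketch with illustrative counts (the logistic model-based alternative replaces the raw stratum proportions with model-smoothed probabilities):

```python
import numpy as np

# design-weighted (Horvitz-Thompson) prevalence under stratified sampling:
# p_hat = sum_h (N_h / N) * (d_h / n_h)
N_h = np.array([5000, 3000, 1500, 500])    # stratum population sizes (illustrative)
n_h = np.array([200, 150, 100, 90])        # sampled per stratum
d_h = np.array([6, 12, 18, 36])            # sampled subjects with dementia
p_hat = np.sum((N_h / N_h.sum()) * (d_h / n_h))
print(f"weighted prevalence estimate: {p_hat:.4f}")
```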

16.
This study challenges two core conventional meta-analysis methods: fixed effect and random effects. We show how, and explain why, an unrestricted weighted least squares estimator is superior to conventional random-effects meta-analysis when there is publication (or small-sample) bias, and better than a fixed-effect weighted average if there is heterogeneity. Statistical theory and simulations of effect sizes, log odds ratios, and regression coefficients demonstrate that this unrestricted weighted least squares estimator provides satisfactory estimates and confidence intervals that are comparable to random effects when there is no publication (or small-sample) bias, and identical to fixed-effect meta-analysis when there is no heterogeneity. When there is publication selection bias, the unrestricted weighted least squares approach dominates random effects; when there is excess heterogeneity, it is clearly superior to fixed-effect meta-analysis. In practical applications, an unrestricted weighted least squares weighted average will often provide superior estimates to both conventional fixed and random effects. Copyright © 2015 John Wiley & Sons, Ltd.
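The unrestricted WLS weighted average is easy to state: it has the same point estimate as fixed-effect meta-analysis, but the multiplicative error dispersion is freely estimated rather than fixed at 1. A minimal sketch (whether the dispersion is truncated below at 1 varies by implementation):

```python
import numpy as np

def wls_meta(y, v):
    """Unrestricted WLS weighted average: fixed-effect point estimate with the
    standard error rescaled by the estimated multiplicative dispersion."""
    w = 1.0 / v
    mu = np.sum(w * y) / np.sum(w)           # fixed-effect point estimate
    q = np.sum(w * (y - mu) ** 2)            # Cochran's Q
    phi = q / (len(y) - 1)                   # unrestricted dispersion estimate
    se = np.sqrt(phi / np.sum(w))            # = fixed-effect SE * sqrt(phi)
    return mu, se

y = np.array([0.11, 0.30, 0.05, 0.45, 0.21])       # illustrative effect sizes
v = np.array([0.010, 0.030, 0.015, 0.050, 0.020])  # their sampling variances
print(wls_meta(y, v))
```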

17.
18.
Lui KJ. Statistics in Medicine 2003, 22(15): 2443-2457.
The attributable risk (AR) is one of the most important and commonly used epidemiological indices for assessing the public health importance of an association between a risk factor and a disease. For an underlying risk factor with multiple exposure levels in the presence of confounders, we consider case-control studies in which random sampling is used to collect the cases and controls. We develop four asymptotic interval estimators for the AR: the interval estimator using Wald's statistic, the interval estimator using the logarithmic transformation, the interval estimator using the logit transformation, and the interval estimator derived from a quadratic equation. We apply Monte Carlo simulation to evaluate the finite-sample performance of these interval estimators in a variety of situations. We demonstrate that, given an adequately large sample size, all the estimators developed here can perform reasonably well. We note that the interval estimator using the logit transformation may be of limited use when the number of studied subjects is not large. We also note that the interval estimator using the logarithmic transformation can lose efficiency compared with the interval estimator using Wald's statistic or the interval estimator derived from a quadratic equation. Finally, we use data taken from a case-control study of oral contraceptive use in myocardial infarction patients with various smoking levels to illustrate the practical usefulness of these estimators.
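The Wald, log, and logit intervals are generic delta-method constructions for a parameter in (0, 1). The sketch below shows them for a single estimate and standard error; it is not the paper's exact variance derivation for multi-level exposures with confounders.

```python
import numpy as np
from scipy.stats import norm

def transformed_cis(est, se, level=0.95):
    """Three delta-method confidence intervals for a parameter in (0, 1), such
    as an attributable risk. The plain Wald interval can leave (0, 1); the
    log and logit versions respect the lower/both boundaries respectively."""
    z = norm.ppf(0.5 + level / 2)
    wald = (est - z * se, est + z * se)
    log_se = se / est                               # Var(log AR) ~ se^2 / AR^2
    log_ci = tuple(np.exp(np.log(est) + s * z * log_se) for s in (-1, 1))
    logit_se = se / (est * (1 - est))               # Var(logit AR) by delta method
    expit = lambda u: 1 / (1 + np.exp(-u))
    l0 = np.log(est / (1 - est))
    logit_ci = tuple(expit(l0 + s * z * logit_se) for s in (-1, 1))
    return wald, log_ci, logit_ci

print(transformed_cis(0.35, 0.08))                  # illustrative AR and SE
```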

19.
Lui KJ. Statistics in Medicine 2005, 24(8): 1275-1285.
Discussions of interval estimation of the proportion ratio (PR) of responses, or the relative risk (RR) of a disease, under multiple matching have generally focused on the odds ratio (OR), on the assumption that the latter approximates the former well. When the underlying proportion of outcomes is not rare, however, results for the OR are inadequate if the PR or RR is the parameter of interest. In this paper, we develop five asymptotic interval estimators of the common PR (or RR) for multiple matching. To evaluate and compare the finite-sample performance of these estimators, we apply Monte Carlo simulation to calculate the coverage probability and the average length of the resulting confidence intervals in a variety of situations. We note that when the number of matches per set is constant, the interval estimator using the logarithmic transformation of the Mantel-Haenszel estimator, the interval estimator derived from the quadratic inequality given in this paper, and the interval estimator using the logarithmic transformation of the ratio estimator all perform consistently well. When the number of matches varies between matched sets, we find that the interval estimator using the logarithmic transformation of the ratio estimator is probably the best of the five interval estimators considered here when the number of matched sets is small (=20). To illustrate the use of these interval estimators, we employ data from a study of supplemental ascorbate in the supportive treatment of terminal cancer patients.
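As a concrete instance of the first of the five intervals, here is a Mantel-Haenszel common ratio with a log-transformed interval using the Greenland-Robins variance, one standard choice for stratified/matched data; the paper's estimators are tailored to variable matching ratios, so treat this as the familiar baseline.

```python
import numpy as np
from scipy.stats import norm

def mh_ratio_ci(a, n1, b, n0, level=0.95):
    """Mantel-Haenszel common proportion/risk ratio across strata with a
    log-transformed (Greenland-Robins) confidence interval.
    a, b: responders among exposed/unexposed; n1, n0: group sizes per stratum."""
    n = n1 + n0
    R = a * n0 / n
    S = b * n1 / n
    rr = R.sum() / S.sum()
    P = (n1 * n0 * (a + b) - a * b * n) / n ** 2
    se_log = np.sqrt(P.sum() / (R.sum() * S.sum()))
    z = norm.ppf(0.5 + level / 2)
    return rr, (rr * np.exp(-z * se_log), rr * np.exp(z * se_log))

# two illustrative matched strata
a = np.array([8, 5]);  n1 = np.array([40, 30])     # exposed: events, sizes
b = np.array([4, 3]);  n0 = np.array([80, 90])     # unexposed: events, sizes
print(mh_ratio_ci(a, n1, b, n0))
```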

20.
The process by which patients experience a series of recurrent events, such as hospitalizations, may be subject to death. In cohort studies, one strategy for analyzing such data is to fit a joint frailty model for the intensities of the recurrent event and death, which estimates covariate effects on the two event types while accounting for their dependence. When certain covariates are difficult to obtain, however, researchers may only have the resources to subsample patients on whom to collect complete data: one way is using the nested case–control (NCC) design, in which risk set sampling is performed based on a single outcome. We develop a general framework for the design of NCC studies in the presence of recurrent and terminal events and propose estimation and inference for a joint frailty model for recurrence and death using data arising from such studies. We propose a maximum weighted penalized likelihood approach using flexible spline models for the baseline intensity functions. Two standard error estimators are proposed: a sandwich estimator and a perturbation resampling procedure. We investigate operating characteristics of our estimators as well as design considerations via a simulation study and illustrate our methods using two studies: one on recurrent cardiac hospitalizations in patients with heart failure and the other on local recurrence and metastasis in patients with breast cancer.
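Perturbation resampling generalizes readily: re-solve the estimation problem many times with i.i.d. mean-one exponential multipliers on the subjects' contributions and take the standard deviation of the perturbed estimates. A toy sketch with a weighted mean standing in for the joint frailty estimator:

```python
import numpy as np

def perturbation_se(estimate, data, B=500, rng=None):
    """Perturbation-resampling SE: estimate(data, xi) is re-solved B times with
    i.i.d. Exp(1) multipliers xi on each subject's contribution; the SD of the
    perturbed estimates approximates the sampling standard error."""
    rng = rng or np.random.default_rng(0)
    n = len(data)
    reps = [estimate(data, rng.exponential(1.0, n)) for _ in range(B)]
    return np.std(reps, ddof=1)

x = np.random.default_rng(5).normal(1.0, 2.0, 400)   # illustrative data
wmean = lambda d, xi: np.sum(xi * d) / np.sum(xi)    # perturbed weighted mean
print(perturbation_se(wmean, x), 2.0 / np.sqrt(400)) # compare with analytic SE
```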
