Similar Literature
20 similar records retrieved (search time: 31 ms).
1.
An outcome‐dependent sampling (ODS) scheme is a cost‐effective way to conduct a study. For a study with a continuous primary outcome, an ODS scheme can be implemented where the expensive exposure is measured only on a simple random sample plus supplemental samples selected from the two tails of the primary outcome variable. Given the tremendous cost invested in collecting the primary exposure information, investigators often would like to use the available data to study the relationship between a secondary outcome and the obtained exposure variable. This is referred to as secondary analysis. Secondary analysis in ODS designs can be tricky, as the ODS sample is not a random sample from the general population. In this article, we use inverse probability weighted and augmented inverse probability weighted estimating equations to analyze the secondary outcome for data obtained from the ODS design. We do not make any parametric assumptions on the primary and secondary outcomes and only specify the form of the regression mean models, thus allowing an arbitrary error distribution. Our approach is robust to second‐ and higher‐order moment misspecification. It also leads to more precise estimates of the parameters by effectively using all the available participants. Through simulation studies, we show that the proposed estimator is consistent and asymptotically normal. Data from the Collaborative Perinatal Project are analyzed to illustrate our method.
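A minimal sketch of the IPW side of this idea (not the authors' implementation; the function name, toy data, and the constant selection probability pi are illustrative assumptions — under a real ODS design pi would differ between the simple random sample and the tail-supplemented samples):

    import numpy as np

    def ipw_linear_fit(X, y, pi):
        # Solve the IPW estimating equation sum_i (1/pi_i) * x_i * (y_i - x_i' beta) = 0,
        # i.e., weighted least squares with inverse selection probabilities as weights.
        w = 1.0 / pi
        XtWX = X.T @ (w[:, None] * X)
        XtWy = X.T @ (w * y)
        return np.linalg.solve(XtWX, XtWy)

    rng = np.random.default_rng(0)
    n = 200
    x = rng.normal(size=n)
    y = 1.0 + 0.5 * x + rng.normal(size=n)      # secondary-outcome mean model
    X = np.column_stack([np.ones(n), x])
    pi = np.full(n, 0.3)                        # placeholder selection probabilities
    print(ipw_linear_fit(X, y, pi))             # ~ [1.0, 0.5]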

2.
When an initial case‐control study is performed, data can be used in a secondary analysis to evaluate the effect of the case‐defining event on later outcomes. In this paper, we study the example in which the role of the event is changed from a response variable to a treatment of interest. If the aim is to estimate marginal effects, such as average effects in the population, the sampling scheme needs to be adjusted for. We study estimators of the average effect of the treatment in a secondary analysis of matched and unmatched case‐control data where the probability of being a case is known. For a general class of estimators, we show the components of the bias resulting from ignoring the sampling scheme and demonstrate a design‐weighted matching estimator of the average causal effect. In simulations, the finite sample properties of the design‐weighted matching estimator are studied. Using a Swedish diabetes incidence register with a matched case‐control design, we study the effect of childhood onset diabetes on the use of antidepressant medication as an adult. Copyright © 2017 John Wiley & Sons, Ltd.
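A hedged sketch of the sampling-scheme adjustment only (the design-weight construction is the standard one for case-control oversampling with a known case probability; the matching step is replaced by a weighted regression here, and all names and numbers are ours):

    import numpy as np
    import statsmodels.api as sm

    def design_weights(case, p_case):
        # Undo case-control oversampling: reweight cases and controls back to
        # their population shares, with p_case known (as assumed in the paper).
        q = case.mean()
        return np.where(case == 1, p_case / q, (1 - p_case) / (1 - q))

    rng = np.random.default_rng(1)
    case = np.repeat([1, 0], [200, 200])        # 1:1 sample of a rare condition
    x = rng.normal(size=400) + 0.3 * case       # a confounder
    y = 2.0 + 0.5 * case + 0.8 * x + rng.normal(size=400)   # later outcome
    w = design_weights(case, p_case=0.01)
    fit = sm.WLS(y, sm.add_constant(np.column_stack([case, x])), weights=w).fit()
    print(fit.params)                           # coefficient on 'case' ~ 0.5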

3.
Cluster randomized trials evaluate the effect of a treatment on persons nested within clusters, where treatment is randomly assigned to clusters. Current equations for the optimal sample size at the cluster and person level assume that the outcome variances and/or the study costs are known and homogeneous between treatment arms. This paper presents efficient yet robust designs for cluster randomized trials with treatment‐dependent costs and treatment‐dependent unknown variances, and compares these with 2 practical designs. First, the maximin design (MMD) is derived, which maximizes the minimum efficiency (minimizes the maximum sampling variance) of the treatment effect estimator over a range of treatment‐to‐control variance ratios. The MMD is then compared with the optimal design for homogeneous variances and costs (balanced design), and with that for homogeneous variances and treatment‐dependent costs (cost‐considered design). The results show that the balanced design is the MMD if the treatment‐to‐control cost ratio is the same at both design levels (cluster, person) and within the range for the treatment‐to‐control variance ratio. It is still highly efficient and better than the cost‐considered design if the cost ratio is within the range for the squared variance ratio. Outside that range, the cost‐considered design is better and highly efficient, but it is not the MMD. An example shows sample size calculation for the MMD, and the computer code (SPSS and R) is provided as supplementary material. The MMD is recommended for trial planning if the study costs are treatment‐dependent and homogeneity of variances cannot be assumed.
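The maximin logic lends itself to a small grid search; a rough sketch under assumed costs and an assumed variance-ratio range (all numbers illustrative — the paper's supplementary SPSS and R code is the authoritative implementation):

    import numpy as np

    def trial_variance(kT, mT, kC, mC, sb2T, sw2T, sb2C, sw2C):
        # Sampling variance of the difference in arm means in a two-arm cluster
        # randomized trial: per arm, (between + within/cluster size) / clusters.
        return (sb2T + sw2T / mT) / kT + (sb2C + sw2C / mC) / kC

    budget, cT, cC, sT, sC = 1000.0, 10.0, 5.0, 1.0, 1.0   # assumed costs
    ratios = np.linspace(0.5, 2.0, 16)       # treatment-to-control variance ratios
    best = None
    for mT in range(5, 41, 5):               # candidate cluster sizes per arm
        for mC in range(5, 41, 5):
            for share in np.linspace(0.2, 0.8, 13):      # budget share for arm T
                kT = share * budget / (cT + mT * sT)     # clusters affordable
                kC = (1 - share) * budget / (cC + mC * sC)
                worst = max(trial_variance(kT, mT, kC, mC, r, r, 1.0, 1.0)
                            for r in ratios)  # worst case over the ratio range
                if best is None or worst < best[0]:
                    best = (worst, mT, mC, kT, kC)
    print(best)   # allocation minimizing the maximum sampling variance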

4.
We consider a latent variable hazard model for clustered survival data where clusters are a random sample from an underlying population. We allow interactions between the random cluster effect and covariates. We use a maximum pseudo-likelihood estimator to estimate the mean hazard ratio parameters. We propose a bootstrap sampling scheme to obtain an estimate of the variance of the proposed estimator. Application of this method in large multi-centre clinical trials allows one to assess the mean treatment effect, where we consider participating centres as a random sample from an underlying population. We evaluate properties of the proposed estimators via extensive simulation studies. A real data example from the Studies of Left Ventricular Dysfunction (SOLVD) Prevention Trial illustrates the method. © 1997 by John Wiley & Sons, Ltd.
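A generic sketch of the cluster-level bootstrap described here (the estimator callable stands in for the maximum pseudo-likelihood fit; resampling whole centres preserves within-cluster dependence):

    import numpy as np

    def cluster_bootstrap_se(clusters, estimator, B=500, seed=1):
        # Resample whole clusters with replacement, re-estimate on each
        # replicate, and take the SD over replicates as the standard error.
        rng = np.random.default_rng(seed)
        n = len(clusters)
        stats = []
        for _ in range(B):
            idx = rng.integers(0, n, size=n)
            stats.append(estimator([clusters[i] for i in idx]))
        return np.std(stats, ddof=1)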

5.
In multilevel populations, there are two types of population means of an outcome variable, ie, the average of all individual outcomes ignoring cluster membership and the average of cluster-specific means. To estimate the first mean, individuals can be sampled directly with simple random sampling or with two-stage sampling (TSS), that is, sampling clusters first, and then individuals within the sampled clusters. When cluster size varies in the population, three TSS schemes can be considered, ie, sampling clusters with probability proportional to cluster size and then sampling the same number of individuals per cluster; sampling clusters with equal probability and then sampling the same percentage of individuals per cluster; and sampling clusters with equal probability and then sampling the same number of individuals per cluster. Unbiased estimation of the average of all individual outcomes is discussed under each sampling scheme, assuming cluster size to be informative. Furthermore, the three TSS schemes are compared in terms of efficiency with each other and with simple random sampling under the constraint of a fixed total sample size. The relative efficiency of the sampling schemes is shown to vary across different cluster size distributions. However, sampling clusters with probability proportional to size is the most efficient TSS scheme for many cluster size distributions. Model-based and design-based inference are compared and are shown to give similar results. The results are applied to the distribution of high school size in Italy and the distribution of patient list size for general practices in England.
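For intuition, the classical argument for why sampling n clusters with probability proportional to size and then m individuals per cluster (by simple random sampling within clusters) yields an unbiased estimator of the individual-level mean — notation is ours, with \bar Y_k the mean of cluster k, which has N_k members out of N in total:

$$\hat\mu=\frac{1}{nm}\sum_{i=1}^{n}\sum_{j=1}^{m}y_{ij},\qquad E[\hat\mu]=E\!\left[\frac{1}{n}\sum_{i=1}^{n}\bar Y_{(i)}\right]=\sum_{k}\frac{N_k}{N}\,\bar Y_k=\frac{1}{N}\sum_{k}\sum_{j=1}^{N_k}y_{kj}=\mu.$$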

6.
In many settings, an analysis goal is the identification of a factor, or set of factors associated with an event or outcome. Often, these associations are then used for inference and prediction. Unfortunately, in the big data era, the model building and exploration phases of analysis can be time‐consuming, especially if constrained by computing power (ie, a typical corporate workstation). To speed up this model development, we propose a novel subsampling scheme to enable rapid model exploration of clustered binary data using flexible yet complex model set‐ups (GLMMs with additive smoothing splines). By reframing the binary response prospective cohort study into a case‐control–type design, and using our knowledge of sampling fractions, we show one can approximate the model estimates as would be calculated from a full cohort analysis. This idea is extended to derive cluster‐specific sampling fractions and thereby incorporate cluster variation into an analysis. Importantly, we demonstrate that previously computationally prohibitive analyses can be conducted in a timely manner on a typical workstation. The approach is applied to analysing risk factors associated with adverse reactions relating to blood donation.
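The sampling-fraction correction at the heart of this reframing is the classical case-control intercept shift; a minimal sketch on toy data (the paper extends this to GLMMs with smoothing splines and cluster-specific fractions):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    n = 100_000
    x = rng.normal(size=n)
    p = 1 / (1 + np.exp(-(-4.0 + 0.8 * x)))     # rare binary outcome in the cohort
    y = rng.binomial(1, p)
    f0 = 0.02                                    # known control sampling fraction
    keep = (y == 1) | (rng.random(n) < f0)       # all cases + a slice of controls

    fit = sm.Logit(y[keep], sm.add_constant(x[keep])).fit(disp=0)
    beta0, beta1 = fit.params
    print(beta1)                          # slope approximately unaffected (~0.8)
    print(beta0 - np.log(1.0 / f0))       # intercept corrected by log(f1/f0), f1 = 1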

7.
Impairment caused by Parkinson's disease (PD) is multidimensional (e.g., sensoria, functions, and cognition) and progressive. Its multidimensional nature precludes a single outcome to measure disease progression. Clinical trials of PD use multiple categorical and continuous longitudinal outcomes to assess the treatment effects on overall improvement. A terminal event such as death or dropout can stop the follow‐up process. Moreover, the time to the terminal event may be dependent on the multivariate longitudinal measurements. In this article, we consider a joint random‐effects model for the correlated outcomes. A multilevel item response theory model is used for the multivariate longitudinal outcomes, and a parametric accelerated failure time model is used for the failure time because of the violation of the proportional hazards assumption. These two models are linked via random effects. Bayesian inference via MCMC is implemented in the 'BUGS' language. Our proposed method is evaluated by a simulation study and is applied to the DATATOP study, a motivating clinical trial to determine whether deprenyl slows the progression of PD. © 2013 The authors. Statistics in Medicine published by John Wiley & Sons, Ltd.
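One standard way to write such a linkage — our notation, not necessarily the paper's exact specification: a two-parameter IRT measurement model for binary item k of subject i at visit j, a latent trait with random intercept and slope, and a log-scale (AFT) model for the terminal-event time sharing those random effects:

$$\operatorname{logit} P(y_{ijk}=1\mid\theta_{ij})=a_k(\theta_{ij}-b_k),\qquad \theta_{ij}=\beta_0+\beta_1 t_{ij}+u_{0i}+u_{1i}t_{ij},$$

$$\log T_i=\gamma^{\top}x_i+\alpha_1 u_{0i}+\alpha_2 u_{1i}+\sigma\varepsilon_i.$$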

8.
In clinical trials, it is often desirable to evaluate the effect of a prognostic factor such as a marker response on a survival outcome. However, the marker response and survival outcome are usually associated with some potentially unobservable factors. In this case, the conventional statistical methods that model these two outcomes separately may not be appropriate. In this paper, we propose a joint model for marker response and survival outcomes for clustered data, providing efficient statistical inference by considering these two outcomes simultaneously. We focus on a special type of marker response: a binary outcome, which is investigated together with survival data using a cluster-specific multivariate random effect variable. A multivariate penalized likelihood method is developed to make statistical inference for the joint model. However, the standard errors obtained from the penalized likelihood method are usually underestimated. This issue is addressed using a jackknife resampling method to obtain a consistent estimate of standard errors. We conduct extensive simulation studies to assess the finite sample performance of the proposed joint model and inference methods in different scenarios. The simulation studies show that the proposed joint model has excellent finite sample properties compared to the separate models when there exists an underlying association between the marker response and survival data. Finally, we apply the proposed method to a symptom control study conducted by Canadian Cancer Trials Group to explore the prognostic effect of covariates on pain control and overall survival.
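The jackknife correction mentioned here can be sketched generically (leave-one-cluster-out over the cluster list; the estimator callable stands in for the penalized likelihood fit):

    import numpy as np

    def cluster_jackknife_se(clusters, estimator):
        # Leave-one-cluster-out jackknife standard error for a clustered-data
        # estimator: refit n times, then scale the spread of the replicates.
        n = len(clusters)
        theta = np.array([estimator(clusters[:i] + clusters[i + 1:])
                          for i in range(n)])
        return np.sqrt((n - 1) / n * np.sum((theta - theta.mean()) ** 2))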

9.
The random effect Tobit model is a regression model that accommodates left‐ and/or right‐censoring and within‐cluster dependence of the outcome variable. Regression coefficients of random effect Tobit models have conditional interpretations on a constructed latent dependent variable and do not provide inference of overall exposure effects on the original outcome scale. The marginalized random effects model (MREM) permits likelihood‐based estimation of marginal mean parameters for clustered data. For random effect Tobit models, we extend the MREM to marginalize over both the random effects and the normal space and boundary components of the censored response to estimate overall exposure effects at the population level. We also extend the 'Average Predicted Value' method to estimate the model‐predicted marginal means for each person under different exposure status in a designated reference group, by integrating over the random effects, and then use the calculated difference to assess the overall exposure effect. Maximum likelihood estimation is proposed utilizing a quasi‐Newton optimization algorithm with Gauss–Hermite quadrature to approximate the integration over the random effects. We use these methods to carefully analyze two real datasets. Copyright © 2016 John Wiley & Sons, Ltd.
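The Gauss–Hermite step can be illustrated directly; a sketch with a simplified censored-mean integrand standing in for the paper's marginalized Tobit components:

    import numpy as np

    def gauss_hermite_expectation(g, sigma, order=20):
        # E[g(b)] for b ~ N(0, sigma^2) via Gauss-Hermite quadrature:
        # E[g(b)] ~= (1/sqrt(pi)) * sum_k w_k * g(sqrt(2) * sigma * t_k).
        t, w = np.polynomial.hermite.hermgauss(order)
        return (w * g(np.sqrt(2.0) * sigma * t)).sum() / np.sqrt(np.pi)

    mu, sigma = 0.5, 1.2   # cluster-specific mean and random-intercept SD (toy)
    print(gauss_hermite_expectation(lambda b: np.maximum(0.0, mu + b), sigma))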

10.
In this paper, we propose a class of Box–Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data in meta‐analysis. Our modeling formulation uses a multivariate normal response meta‐analysis model with multivariate random effects, in which each response is allowed to have its own Box–Cox transformation. Prior distributions are specified for the Box–Cox transformation parameters as well as the regression coefficients in this complex model, and the deviance information criterion is used to select the best transformation model. Because the model is quite complex, we develop a novel Markov chain Monte Carlo sampling scheme to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol‐lowering drugs where the goal is to jointly model the three‐dimensional response consisting of low density lipoprotein cholesterol (LDL‐C), high density lipoprotein cholesterol (HDL‐C), and triglycerides (TG). Because the joint distribution of (LDL‐C, HDL‐C, TG) is not multivariate normal and in fact quite skewed, a Box–Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately; however, a multivariate approach would be more appropriate because these variables are correlated with each other. We carry out a detailed analysis of these data by using the proposed methodology. Copyright © 2013 John Wiley & Sons, Ltd.
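The response-specific transformation referred to here is the usual Box–Cox family, with a separate \lambda for each of the three lipid outcomes:

$$y^{(\lambda)}=\begin{cases}\dfrac{y^{\lambda}-1}{\lambda}, & \lambda\neq 0,\\[4pt] \log y, & \lambda=0.\end{cases}$$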

11.
The use of outcome‐dependent sampling with longitudinal data analysis has previously been shown to improve efficiency in the estimation of regression parameters. The motivating scenario is when outcome data exist for all cohort members but key exposure variables will be gathered only on a subset. Inference with outcome‐dependent sampling designs that also incorporates incomplete information from those individuals who did not have their exposure ascertained has been investigated for univariate but not longitudinal outcomes. Therefore, with a continuous longitudinal outcome, we explore the relative contributions of various sources of information toward the estimation of key regression parameters using a likelihood framework. We evaluate the efficiency gains that alternative estimators might offer over random sampling, and we offer insight into their relative merits in select practical scenarios. Finally, we illustrate the potential impact of design and analysis choices using data from the Cystic Fibrosis Foundation Patient Registry.

12.
In observational studies, estimation of the average causal treatment effect on a patient's response should adjust for confounders that are associated with both treatment exposure and response. In addition, the response, such as medical cost, may have incomplete follow‐up. In this article, a double robust estimator is proposed for the average causal treatment effect with right‐censored medical cost data. The estimator is double robust in the sense that it remains consistent when either the model for the treatment assignment or the regression model for the response is correctly specified. Double robust estimators increase the likelihood that the results will represent a valid inference. Asymptotic normality is obtained for the proposed estimator, and an estimator for the asymptotic variance is also derived. Simulation studies show good finite sample performance of the proposed estimator, and a real data analysis using the proposed method is provided as illustration. Copyright © 2016 John Wiley & Sons, Ltd.
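The doubly robust construction described here has the familiar AIPW form; a sketch for the uncensored case (the censoring adjustment, the paper's focus, is omitted; ps, m1, m0 are fitted values from the two working models):

    import numpy as np

    def aipw_ate(y, a, ps, m1, m0):
        # Augmented IPW estimator of the average treatment effect: consistent
        # if either the propensity model (ps) or the outcome regressions
        # (m1, m0) are correctly specified.
        mu1 = np.mean(a * (y - m1) / ps + m1)
        mu0 = np.mean((1 - a) * (y - m0) / (1 - ps) + m0)
        return mu1 - mu0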

13.
In network meta‐analyses that synthesize direct and indirect comparison evidence concerning multiple treatments, multivariate random effects models have been routinely used for addressing between‐studies heterogeneities. Although their standard inference methods depend on large sample approximations (eg, restricted maximum likelihood estimation) for the number of trials synthesized, the numbers of trials are often moderate or small. In these situations, standard estimators cannot be expected to behave in accordance with asymptotic theory; in particular, confidence intervals cannot be assumed to exhibit their nominal coverage probabilities (similarly, the type I error probabilities of the corresponding tests cannot be retained). The invalidity issue may seriously influence the overall conclusions of network meta‐analyses. In this article, we develop several improved inference methods for network meta‐analyses to resolve these problems. We first introduce 2 efficient likelihood‐based inference methods, the likelihood ratio test–based and efficient score test–based methods, in a general framework of network meta‐analysis. Then, to improve the small‐sample inferences, we develop improved higher‐order asymptotic methods using Bartlett‐type corrections and bootstrap adjustment methods. The proposed methods adopt Monte Carlo approaches using parametric bootstraps to effectively circumvent complicated analytical calculations of case‐by‐case analyses and to permit flexible application to various statistical models for network meta‐analyses. These methods can also be straightforwardly applied to multivariate meta‐regression analyses and to tests for the evaluation of inconsistency. In numerical evaluations via simulations, the proposed methods generally performed well compared with the ordinary restricted maximum likelihood–based inference method. Applications to 2 network meta‐analysis datasets are provided.
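A generic skeleton of the parametric-bootstrap calibration (the callables abstract the model-specific pieces — null fit, data simulation, statistic recomputation — and the names are ours, not the paper's):

    import numpy as np

    def parametric_bootstrap_pvalue(stat_obs, fit_null, simulate, refit_stat, B=1000):
        # Simulate B datasets from the model fitted under the null, recompute
        # the test statistic on each, and compare with the observed value.
        theta0 = fit_null()
        stats = np.array([refit_stat(simulate(theta0)) for _ in range(B)])
        return (1 + (stats >= stat_obs).sum()) / (B + 1)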

14.
Adjustments of sample size formulas are given for varying cluster sizes in cluster randomized trials with a binary outcome when testing the treatment effect with mixed effects logistic regression using second‐order penalized quasi‐likelihood estimation (PQL). Starting from first‐order marginal quasi‐likelihood (MQL) estimation of the treatment effect, the asymptotic relative efficiency of unequal versus equal cluster sizes is derived. A Monte Carlo simulation study shows this asymptotic relative efficiency to be rather accurate for realistic sample sizes, when employing second‐order PQL. An approximate, simpler formula is presented to estimate the efficiency loss due to varying cluster sizes when planning a trial. In many cases sampling 14 per cent more clusters is sufficient to repair the efficiency loss due to varying cluster sizes. Since current closed‐form formulas for sample size calculation are based on first‐order MQL, planning a trial also requires a conversion factor to obtain the variance of the second‐order PQL estimator. In a second Monte Carlo study, this conversion factor turned out to be 1.25 at most. Copyright © 2010 John Wiley & Sons, Ltd.
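For planning intuition, a commonly used approximation of this kind of efficiency loss and the implied inflation of the number of clusters (a hedged stand-in; the paper's own PQL-specific formula and the 1.25 conversion factor should be taken from the article itself):

    def cluster_inflation(cv, icc, nbar):
        # RE ~= 1 - CV^2 * lam * (1 - lam), with lam the between-cluster share
        # of the variance of a cluster mean; inflate the number of clusters
        # by 1/RE to repair the loss from varying cluster sizes.
        lam = nbar * icc / (nbar * icc + 1 - icc)
        re = 1 - cv ** 2 * lam * (1 - lam)
        return 1 / re

    # worst case lam = 0.5 with a cluster-size CV of 0.7 gives RE ~= 0.88,
    # i.e., roughly 14 per cent more clusters, matching the rule of thumb above
    print(cluster_inflation(cv=0.7, icc=0.05, nbar=20))   # ~1.14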

15.
Analysis of population‐based case–control studies with complex sampling designs is challenging because the sample selection probabilities (and, therefore, the sample weights) depend on the response variable and covariates. Commonly, the design‐consistent (weighted) estimators of the parameters of the population regression model are obtained by solving (sample) weighted estimating equations. Weighted estimators, however, are known to be inefficient when the weights are highly variable as is typical for case–control designs. In this paper, we propose two alternative estimators that have higher efficiency and smaller finite sample bias compared with the weighted estimator. Both methods incorporate the information included in the sample weights by modeling the sample expectation of the weights conditional on design variables. We discuss benefits and limitations of each of the two proposed estimators emphasizing efficiency and robustness. We compare the finite sample properties of the two new estimators and traditionally used weighted estimators with the use of simulated data under various sampling scenarios. We apply the methods to the U.S. Kidney Cancer Case‐Control Study to identify risk factors. Published 2012. This article is a US Government work and is in the public domain in the USA.
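A sketch of the weight-smoothing idea (one of the two strategies described): replace the raw, highly variable weights by their estimated sample expectation given design variables, then solve the usual weighted estimating equations with the smoothed weights. All variable names are ours:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    n = 500
    design_vars = rng.normal(size=(n, 2))         # stand-ins for design variables
    w_raw = np.exp(1.0 + design_vars[:, 0] + 0.5 * rng.normal(size=n))

    Z = sm.add_constant(design_vars)
    w_smooth = sm.OLS(w_raw, Z).fit().predict(Z)  # E-hat[w | design variables]
    print(w_raw.std(), w_smooth.std())            # smoothed weights vary less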

16.
In cluster randomized trials, the study units usually are not a simple random sample from some clearly defined target population. Instead, the target population tends to be hypothetical or ill‐defined, and the selection of study units tends to be systematic, driven by logistical and practical considerations. As a result, the population average treatment effect (PATE) may be neither well defined nor easily interpretable. In contrast, the sample average treatment effect (SATE) is the mean difference in the counterfactual outcomes for the study units. The sample parameter is easily interpretable and arguably the most relevant when the study units are not sampled from some specific super‐population of interest. Furthermore, in most settings, the sample parameter will be estimated more efficiently than the population parameter. To the best of our knowledge, this is the first paper to propose using targeted maximum likelihood estimation (TMLE) for estimation and inference of the sample effect in trials with and without pair‐matching. We study the asymptotic and finite sample properties of the TMLE for the sample effect and provide a conservative variance estimator. Finite sample simulations illustrate the potential gains in precision and power from selecting the sample effect as the target of inference. This work is motivated by the Sustainable East Africa Research in Community Health (SEARCH) study, a pair‐matched, community randomized trial to estimate the effect of population‐based HIV testing and streamlined ART on the 5‐year cumulative HIV incidence (NCT01864603). The proposed methodology will be used in the primary analysis for the SEARCH trial. Copyright © 2016 John Wiley & Sons, Ltd.
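A minimal, unmatched TMLE sketch for a binary outcome with simple parametric working models (no pair-matching and no conservative variance estimator — those are this paper's contributions; all names here are ours):

    import numpy as np
    import statsmodels.api as sm

    def tmle_ate(y, a, X):
        # Initial outcome regression, clever-covariate fluctuation, then a
        # plug-in contrast of the updated predictions (one targeting step).
        def design(avec):
            return np.column_stack([np.ones(len(avec)), avec, X])
        q_fit = sm.GLM(y, design(a), family=sm.families.Binomial()).fit()
        q1 = q_fit.predict(design(np.ones_like(a)))
        q0 = q_fit.predict(design(np.zeros_like(a)))
        G = np.column_stack([np.ones(len(a)), X])
        g1 = sm.GLM(a, G, family=sm.families.Binomial()).fit().predict(G)
        h = a / g1 - (1 - a) / (1 - g1)                   # clever covariate
        qa = np.where(a == 1, q1, q0)
        eps = sm.GLM(y, h[:, None], family=sm.families.Binomial(),
                     offset=np.log(qa / (1 - qa))).fit().params[0]
        expit = lambda u: 1 / (1 + np.exp(-u))
        q1s = expit(np.log(q1 / (1 - q1)) + eps / g1)     # targeted predictions
        q0s = expit(np.log(q0 / (1 - q0)) - eps / (1 - g1))
        return np.mean(q1s - q0s)                         # sample-average effect

    rng = np.random.default_rng(4)
    n = 500
    X = rng.normal(size=(n, 2))
    a = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
    y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * a + X[:, 1]))))
    print(tmle_ate(y, a, X))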

17.
Joint effects of genetic and environmental factors have been increasingly recognized in the development of many complex human diseases. Despite the popularity of case‐control and case‐only designs, longitudinal cohort studies that can capture time‐varying outcome and exposure information have long been recommended for gene–environment (G × E) interactions. To date, literature on sampling designs for longitudinal studies of G × E interaction is quite limited. We therefore consider designs that can prioritize a subsample of the existing cohort for retrospective genotyping on the basis of currently available outcome, exposure, and covariate data. In this work, we propose stratified sampling based on summaries of individual exposures and outcome trajectories and develop a full conditional likelihood approach for estimation that adjusts for the biased sample. We compare the performance of our proposed design and analysis with combinations of different sampling designs and estimation approaches via simulation. We observe that the full conditional likelihood provides improved estimates for the G × E interaction and joint exposure effects over uncorrected complete‐case analysis, and the exposure enriched outcome trajectory dependent design outperforms other designs in terms of estimation efficiency and power for detection of the G × E interaction. We also illustrate our design and analysis using data from the Normative Aging Study, an ongoing longitudinal cohort study initiated by the Veterans Administration in 1963. Copyright © 2017 John Wiley & Sons, Ltd.
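A toy version of the design step only — oversampling subjects with extreme outcome trajectories (and, analogously, extreme exposures) for retrospective genotyping; cutoffs and fractions are illustrative, not the paper's:

    import numpy as np

    def trajectory_enriched_subsample(slope_summary, p_extreme=0.5, p_middle=0.1,
                                      seed=5):
        # Flag the tails of the per-subject outcome-slope distribution and
        # sample them at a higher rate; return flags plus the known sampling
        # probabilities needed later by the conditional likelihood.
        rng = np.random.default_rng(seed)
        lo, hi = np.quantile(slope_summary, [0.1, 0.9])
        extreme = (slope_summary < lo) | (slope_summary > hi)
        p = np.where(extreme, p_extreme, p_middle)
        return rng.random(len(p)) < p, p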

18.
The case–cohort study design has often been used in studies of a rare disease or for a common disease with some biospecimens needing to be preserved for future studies. A case–cohort study design consists of a random sample, called the subcohort, and all or a portion of the subjects with the disease of interest. One advantage of the case–cohort design is that the same subcohort can be used for studying multiple diseases. Stratified random sampling is often used for the subcohort. Additive hazards models are often preferred in studies where the risk difference, instead of relative risk, is of main interest. Existing methods do not use the available covariate information fully. We propose a more efficient estimator by making full use of available covariate information for the additive hazards model with data from a stratified case–cohort design with rare (the traditional situation) and non‐rare (the generalized situation) diseases. We propose an estimating equation approach with a new weight function. The proposed estimators are shown to be consistent and asymptotically normally distributed. Simulation studies show that the proposed method using all available information leads to efficiency gain and stratification of the subcohort improves efficiency when the strata are highly correlated with the covariates. Our proposed method is applied to data from the Atherosclerosis Risk in Communities study. Copyright © 2015 John Wiley & Sons, Ltd.
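For reference, a weighted Lin–Ying-type estimating equation of the kind this approach builds on, in standard notation (our transcription; the paper's contribution lies in the new weight function w_i(t) and the fuller use of covariate information):

$$U(\beta)=\sum_{i=1}^{n}\int_{0}^{\tau}w_i(t)\left\{Z_i-\bar Z_w(t)\right\}\left\{\mathrm{d}N_i(t)-Y_i(t)\,\beta^{\top}Z_i\,\mathrm{d}t\right\}=0,$$

which has the closed-form solution

$$\hat\beta=\left[\sum_{i}\int_{0}^{\tau}w_i(t)\,Y_i(t)\left\{Z_i-\bar Z_w(t)\right\}^{\otimes 2}\mathrm{d}t\right]^{-1}\sum_{i}\int_{0}^{\tau}w_i(t)\left\{Z_i-\bar Z_w(t)\right\}\mathrm{d}N_i(t),$$

where \bar Z_w(t) is the weighted average of the covariates among subjects at risk at time t.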

19.
In the past decade, many genome‐wide association studies (GWASs) have been conducted to explore association of single nucleotide polymorphisms (SNPs) with complex diseases using a case‐control design. These GWASs not only collect information on the disease status (primary phenotype, D) and the SNPs (genotypes, X), but also collect extensive data on several risk factors and traits. Recent literature and grant proposals point toward a trend in reusing existing large case‐control data for exploring genetic associations of some additional traits (secondary phenotypes, Y) collected during the study. These secondary phenotypes may be correlated, and a proper analysis warrants a multivariate approach. Commonly used multivariate methods are not equipped to properly account for the non‐random sampling scheme. Current ad hoc practices include analyses without any adjustment, and analyses with D adjusted as a covariate. Our theoretical and empirical studies suggest that the type I error for testing genetic association of secondary traits can be substantial when X as well as Y are associated with D, even when there is no association between X and Y in the underlying (target) population. Whether using D as a covariate helps maintain type I error depends heavily on the disease mechanism and the underlying causal structure (which is often unknown). To avoid grossly incorrect inference, we have proposed the proportional odds model adjusted for propensity score (POM‐PS). It uses a proportional odds logistic regression of X on Y and adjusts the estimated conditional probability of being diseased as a covariate. We demonstrate the validity and advantage of POM‐PS, and compare to some existing methods, in extensive simulation experiments mimicking plausible scenarios of dependency among Y, X, and D. Finally, we use POM‐PS to jointly analyze four adiposity traits using a type 2 diabetes (T2D) case‐control sample from the population‐based Metabolic Syndrome in Men (METSIM) study. Only POM‐PS analysis of the T2D case‐control sample seems to provide valid association signals.
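A loose sketch of the two POM-PS steps (heavily hedged: statsmodels' OrderedModel is used here as a generic proportional odds fitter, and the toy data and variable names are ours, not the paper's):

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.miscmodels.ordinal_model import OrderedModel

    rng = np.random.default_rng(6)
    n = 1000
    g = rng.integers(0, 3, size=n)               # genotype X coded 0/1/2
    ys = rng.normal(size=n)                      # secondary phenotype Y
    d = rng.binomial(1, 1 / (1 + np.exp(-(-2 + 0.5 * ys))))   # disease D

    # step 1: estimated conditional probability of being diseased
    Z = sm.add_constant(ys)
    pd_hat = sm.Logit(d, Z).fit(disp=0).predict(Z)

    # step 2: proportional odds regression of X on Y, adjusting for pd_hat
    fit = OrderedModel(g, np.column_stack([ys, pd_hat]), distr='logit').fit(
        method='bfgs', disp=0)
    print(fit.params)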

20.
In individually randomised controlled trials, adjustment for baseline characteristics is often undertaken to increase precision of the treatment effect estimate. This is usually performed using covariate adjustment in outcome regression models. An alternative method of adjustment is to use inverse probability‐of‐treatment weighting (IPTW), on the basis of estimated propensity scores. We calculate the large‐sample marginal variance of IPTW estimators of the mean difference for continuous outcomes, and risk difference, risk ratio or odds ratio for binary outcomes. We show that IPTW adjustment always increases the precision of the treatment effect estimate. For continuous outcomes, we demonstrate that the IPTW estimator has the same large‐sample marginal variance as the standard analysis of covariance estimator. However, ignoring the estimation of the propensity score in the calculation of the variance leads to the erroneous conclusion that the IPTW treatment effect estimator has the same variance as an unadjusted estimator; thus, it is important to use a variance estimator that correctly takes into account the estimation of the propensity score. The IPTW approach has particular advantages when estimating risk differences or risk ratios. In this case, non‐convergence of covariate‐adjusted outcome regression models frequently occurs. Such problems can be circumvented by using the IPTW adjustment approach. © 2013 The authors. Statistics in Medicine published by John Wiley & Sons, Ltd.
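A bare-bones IPTW sketch for a randomized trial (logistic propensity model on baseline covariates; the variance calculation that accounts for propensity score estimation — the paper's main point — is not shown):

    import numpy as np
    import statsmodels.api as sm

    def iptw_mean_difference(y, z, X):
        # Estimate propensity scores from baseline covariates, then contrast
        # inverse-probability-weighted outcome means between the two arms.
        Xc = sm.add_constant(X)
        ps = sm.Logit(z, Xc).fit(disp=0).predict(Xc)
        w = np.where(z == 1, 1.0 / ps, 1.0 / (1.0 - ps))
        mu1 = np.average(y[z == 1], weights=w[z == 1])
        mu0 = np.average(y[z == 0], weights=w[z == 0])
        return mu1 - mu0

Note that, per the abstract, a variance estimator that treats the propensity scores as known would overstate the variance of this estimator; the estimation step must enter the variance calculation.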

