Similar Articles (20 results)
1.
2.
Observational outcome analyses appear frequently in the health research literature. For such analyses, clinical registries are preferred to administrative databases. Missing data are a common problem in any clinical registry and pose a threat to the validity of observational outcome analyses. Faced with missing data in a new clinical registry, we compared three possible responses: exclude cases with missing data; assume that the missing data indicated absence of risk; or merge the clinical database with an existing administrative database. The predictive model derived using the merged data showed a higher C statistic (C = 0.770), better model goodness-of-fit as measured in a decile-of-risk analysis, the largest gradient of risk across deciles (46.3), and the largest decrease in deviance (-2 log likelihood = 406.2). The superior performance of the enhanced data model supports the use of this "enhancement" methodology and bears consideration when researchers are faced with nonrandom missing data.

3.
Missing outcome data is a crucial threat to the validity of treatment effect estimates from randomized trials. The outcome distributions of participants with missing and observed data are often different, which increases bias. Causal inference methods may aid in reducing the bias and improving efficiency by incorporating baseline variables into the analysis. In particular, doubly robust estimators incorporate 2 nuisance parameters: the outcome regression and the missingness mechanism (ie, the probability of missingness conditional on treatment assignment and baseline variables), to adjust for differences in the observed and unobserved groups that can be explained by observed covariates. To consistently estimate the treatment effect, one of these nuisance parameters must be consistently estimated. Traditionally, nuisance parameters are estimated using parametric models, which often precludes consistency, particularly in moderate to high dimensions. Recent research on missing data has focused on data‐adaptive estimation to help achieve consistency, but the large sample properties of such methods are poorly understood. In this article, we discuss a doubly robust estimator that is consistent and asymptotically normal under data‐adaptive estimation of the nuisance parameters. We provide a formula for an asymptotically exact confidence interval under minimal assumptions. We show that our proposed estimator has smaller finite‐sample bias compared to standard doubly robust estimators. We present a simulation study demonstrating the enhanced performance of our estimators in terms of bias, efficiency, and coverage of the confidence intervals. We present the results of an illustrative example: a randomized, double‐blind phase 2/3 trial of antiretroviral therapy in HIV‐infected persons.
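The doubly robust construction described in abstract 3 can be illustrated with the textbook augmented inverse-probability-weighted (AIPW) estimator of a mean under missing outcomes. This is a minimal sketch, not the authors' data-adaptive estimator; the simulated data, working models, and all variable names are illustrative assumptions.

```python
import math
import random

def aipw_mean(y, r, pi, m):
    """Augmented inverse-probability-weighted (AIPW) estimate of E[Y]
    when some outcomes are missing.

    y  : outcomes (None where missing)
    r  : 1 if the outcome was observed, 0 otherwise
    pi : working estimates of P(observed | covariates)
    m  : working outcome-regression estimates of E[Y | covariates]
    """
    total = 0.0
    for yi, ri, pii, mi in zip(y, r, pi, m):
        ipw = ri * yi / pii if ri else 0.0        # inverse-probability term
        aug = (ri - pii) / pii * mi               # augmentation term
        total += ipw - aug
    return total / len(r)

# Hypothetical simulation: Y = 2 + X + noise, so the truth is E[Y] = 2,
# and missingness depends on X (missing at random).
rng = random.Random(0)
n = 50_000
x = [rng.gauss(0, 1) for _ in range(n)]
y_full = [2.0 + xi + rng.gauss(0, 1) for xi in x]
p_obs = [1 / (1 + math.exp(-(0.5 + xi))) for xi in x]
r = [1 if rng.random() < p else 0 for p in p_obs]
y = [yi if ri else None for yi, ri in zip(y_full, r)]

# Double robustness: a correct outcome model rescues a deliberately wrong
# missingness model (constant 0.5 instead of the true p_obs).
est = aipw_mean(y, r, [0.5] * n, [2.0 + xi for xi in x])
```

With the roles reversed (correct missingness model, wrong outcome model) the estimate is likewise consistent, which is the double-robustness property the abstract builds on.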

4.
Zhou XH, Li SM. Statistics in Medicine 2006; 25(16): 2737-2761
In this paper, we consider a missing-outcome problem in causal inference for a randomized encouragement design study. We propose both moment and maximum likelihood estimators for the marginal distributions of the potential outcomes and for the local complier average causal effect (CACE) parameter. We illustrate our methods in a randomized encouragement design study on the effectiveness of flu shots.

5.
The problem of missing data is frequently encountered in observational studies. We compared approaches to dealing with missing data. Three multiple imputation methods were compared with a method of enhancing a clinical database through merging with administrative data. The clinical database used for comparison contained information collected from 6,065 cardiac care patients in 1995 in the province of Alberta, Canada. The effectiveness of the different strategies was evaluated using measures of discrimination and goodness of fit for the 1995 data. The strategies were further evaluated by examining how well the models predicted outcomes in data collected from patients in 1996. In general, the different methods produced similar results, with one of the multiple imputation methods demonstrating a slight advantage. It is concluded that the choice of missing data strategy should be guided by statistical expertise and data resources.
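Whatever imputation engine is used, multiple-imputation analyses like those compared in abstract 5 combine the per-imputation results with Rubin's rules. A generic sketch (the function and its inputs are hypothetical, not the paper's procedure):

```python
import statistics

def pool_rubin(estimates, variances):
    """Pool a scalar estimate across m imputed data sets with Rubin's rules.

    estimates : point estimate from each completed data set
    variances : squared standard error from each completed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    q_bar = statistics.mean(estimates)   # pooled point estimate
    w_bar = statistics.mean(variances)   # average within-imputation variance
    b = statistics.variance(estimates)   # between-imputation variance
    return q_bar, w_bar + (1 + 1 / m) * b

# Toy numbers: three imputations, identical within-imputation variances.
est, var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The `(1 + 1/m)` factor inflates the total variance to reflect the extra uncertainty from imputing rather than observing the missing values.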

6.
We extend the pattern‐mixture approach to handle missing continuous outcome data in longitudinal cluster randomized trials, which randomize groups of individuals to treatment arms, rather than the individuals themselves. Individuals who drop out at the same time point are grouped into the same dropout pattern. We approach extrapolation of the pattern‐mixture model by applying multilevel multiple imputation, which imputes missing values while appropriately accounting for the hierarchical data structure found in cluster randomized trials. To assess parameters of interest under various missing data assumptions, imputed values are multiplied by a sensitivity parameter, k, which increases or decreases imputed values. Using simulated data, we show that estimates of parameters of interest can vary widely under differing missing data assumptions. We conduct a sensitivity analysis using real data from a cluster randomized trial by increasing k until the treatment effect inference changes. By performing a sensitivity analysis for missing data, researchers can assess whether certain missing data assumptions are reasonable for their cluster randomized trial.
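The sensitivity-parameter idea in abstract 6, multiplying imputed values by k and re-estimating until the inference changes, can be sketched as follows. This toy version uses single imputation and ignores the multilevel structure the paper handles; the simulated data and all names are assumptions.

```python
import random
import statistics

def estimate_effect(obs_trt, obs_ctl, n_miss_trt, n_miss_ctl, k, rng):
    """Treatment-effect estimate after imputing missing outcomes and scaling
    the imputed values by a sensitivity parameter k (k = 1 keeps the
    benchmark imputations; k != 1 probes departures from that assumption)."""
    def impute(obs, n_miss):
        mu, sd = statistics.mean(obs), statistics.stdev(obs)
        return [k * rng.gauss(mu, sd) for _ in range(n_miss)]
    trt = obs_trt + impute(obs_trt, n_miss_trt)
    ctl = obs_ctl + impute(obs_ctl, n_miss_ctl)
    return statistics.mean(trt) - statistics.mean(ctl)

rng = random.Random(1)
obs_trt = [rng.gauss(1.0, 1.0) for _ in range(80)]   # observed treated outcomes
obs_ctl = [rng.gauss(0.0, 1.0) for _ in range(80)]   # observed control outcomes

# Tip k away from 1 and watch the estimated effect (true value 1.0) move.
effects = {k: estimate_effect(obs_trt, obs_ctl, 20, 20, k, rng)
           for k in (1.0, 0.5, 0.0)}
```

In a real sensitivity analysis one would repeat this over multiple imputations, pool the results, and report the smallest k at which the treatment-effect inference changes.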

7.
8.
We explore the potential of Bayesian hierarchical modelling for the analysis of cluster randomized trials with binary outcome data, and apply the methods to a trial randomized by general practice. An approximate relationship is derived between the intracluster correlation coefficient (ICC) and the between-cluster variance used in a hierarchical logistic regression model. By constructing an informative prior for the ICC on the basis of available information, we are thus able implicitly to specify an informative prior for the between-cluster variance. The approach also provides us with a credible interval for the ICC for binary outcome data. Several approaches to constructing informative priors from empirical ICC values are described. We investigate the sensitivity of results to the prior specified and find that the estimate of intervention effect changes very little in this data set, while its interval estimate is more sensitive. The Bayesian approach allows us to assume distributions other than normality for the random effects used to model the clustering. This enables us to gain insight into the robustness of our parameter estimates to the classical normality assumption. In a model with a more complex variance structure, Bayesian methods can provide credible intervals for a difference between two variance components, in order for example to investigate whether the effect of intervention varies across clusters. We compare our results with those obtained from classical estimation, discuss the relative merits of the Bayesian framework, and conclude that the flexibility of the Bayesian approach offers some substantial advantages, although selection of prior distributions is not straightforward.
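One common way to link the ICC to the between-cluster variance in a hierarchical logistic model (abstract 8) is the latent-variable approximation, which fixes the level-1 residual variance at π²/3, the variance of the standard logistic distribution. The paper derives its own approximate relationship, which need not coincide with this sketch:

```python
import math

def icc_from_between_var(sigma2_b):
    """Latent-scale ICC for a hierarchical logistic model: the level-1
    residual variance is fixed at pi^2 / 3 (standard logistic)."""
    return sigma2_b / (sigma2_b + math.pi ** 2 / 3)

def between_var_from_icc(icc):
    """Inverse mapping: turn an informative prior value for the ICC into
    the implied between-cluster variance on the log-odds scale."""
    return icc * (math.pi ** 2 / 3) / (1 - icc)

# A typical empirical ICC for primary-care trials, and the variance it implies.
prior_icc = 0.05
sigma2 = between_var_from_icc(prior_icc)
```

This inverse mapping is what makes it possible to specify a prior on the familiar ICC scale and translate it implicitly into a prior on the between-cluster variance.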

9.
Bayesian approaches to inference in cluster randomized trials have been investigated for normally distributed and binary outcome measures. However, relatively little attention has been paid to outcome measures which are counts of events. We discuss an extension of previously published Bayesian hierarchical models to count data, which usually can be assumed to be distributed according to a Poisson distribution. We develop two models, one based on the traditional rate ratio, and one based on the rate difference which may often be more intuitively interpreted for clinical trials, and is needed for economic evaluation of interventions. We examine the relationship between the intracluster correlation coefficient (ICC) and the between‐cluster variance for each of these two models. In practice, this allows one to use the previously published evidence on ICCs to derive an informative prior distribution which can then be used to increase the precision of the posterior distribution of the ICC. We demonstrate our models using a previously published trial assessing the effectiveness of an educational intervention and a prior distribution previously derived. We assess the robustness of the posterior distribution for effectiveness to departures from a normal distribution of the random effects. Copyright © 2009 John Wiley & Sons, Ltd.
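The two effect scales contrasted in abstract 9 can be computed from crude arm-level event counts and person-time; a minimal illustration with made-up numbers:

```python
def poisson_rates(events_a, persontime_a, events_b, persontime_b):
    """Crude event rates for two arms, returned on the two scales the
    abstract's models are built around: rate ratio and rate difference."""
    rate_a = events_a / persontime_a
    rate_b = events_b / persontime_b
    return rate_a / rate_b, rate_a - rate_b

# Hypothetical arms: 30 events over 1000 person-years vs 15 over 1000.
rr, rd = poisson_rates(30, 1000.0, 15, 1000.0)
```

The rate ratio (here 2.0) is multiplicative and unit-free, while the rate difference (here 0.015 events per person-year) keeps the absolute scale needed for economic evaluation.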

10.
OBJECTIVE: To explore statistical methods suited to repeated-measures data under a non-inferiority trial design. METHODS: Using baseline parameters from a clinical study, Monte Carlo simulation was used to generate complete repeated-measures data sets as well as data sets with different proportions of missingness under different missingness mechanisms. The simulated data were then analysed with a range of statistical methods, whose performance was compared in terms of power, type I error, and related criteria. RESULTS: With complete data, most of the statistical methods achieved, or nearly achieved, satisfactory power and type I error. When data were missing, the mixed-effect model for repeated measures (MMRM) with an unstructured covariance matrix and generalized estimating equations (GEE) with various covariance structures were stable, controlling the type I error while preserving adequate power. Under the non-inferiority design, simulations at every missingness proportion showed that last observation carried forward (LOCF) underestimated the between-group difference and inflated the type I error. CONCLUSIONS: For repeated-measures data in non-inferiority trials with missing values, LOCF underestimates between-group differences and inflates the type I error, and is no longer a conservative method for handling missing...
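The LOCF rule criticized in the abstract above simply carries the last observed value forward over subsequent missing visits; a minimal sketch of the mechanics:

```python
def locf(series):
    """Last observation carried forward: replace each missing value (None)
    with the most recent observed value; values before the first
    observation remain missing."""
    filled, last = [], None
    for v in series:
        if v is not None:
            last = v
        filled.append(last)
    return filled

# A subject observed at visits 1 and 3 who then drops out.
completed = locf([5.0, None, 7.0, None, None])
```

Because dropouts keep their last value forever, both arms are pulled toward their early trajectories, which is exactly the shrinkage of the between-group difference that makes LOCF anti-conservative in the non-inferiority setting.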

11.
With new treatments and novel technology available, personalized medicine has become an important piece of the new era of medical product development. Traditional statistical methods for personalized medicine and subgroup identification primarily focus on single-treatment or two-arm randomized controlled trials. Motivated by the recent development of the outcome weighted learning framework, we propose an alternative algorithm to search for treatment assignments, which connects with subgroup identification problems. Our method focuses on applications from clinical trials and generates easy-to-interpret results. The framework can handle two or more treatments from both randomized controlled trials and observational studies. We implement our algorithm in C++ and connect it with R. Its performance is evaluated by simulations, and we apply our method to a dataset from a diabetes study. Copyright © 2016 John Wiley & Sons, Ltd.

12.
This paper discusses several assumptions commonly invoked in observational epidemiological studies. It introduces Simpson's paradox and Lord's paradox, the reverse-regression problem, and the mutually contradictory results that can arise under different assumptions. It then examines the assumptions about the treatment-assignment mechanism in causal inference for epidemiological studies, the model assumptions for the control group when evaluating exposure effects, and the assumptions about the missing-data mechanism in the analysis of incomplete data.

13.
Because of current techniques of determining gene mutation, investigators are now interested in estimating the odds ratio between genetic status (mutation, no mutation) and an outcome variable such as disease cell type (A, B). In this paper we consider the mutation of the RAS genetic family. To determine if the genes have mutated, investigators look at five specific locations on the RAS gene. RAS mutated is a mutation in at least one of the five gene locations and RAS non-mutated is no mutation in any of the five locations. Owing to limited time and financial resources, one cannot obtain a complete genetic evaluation of all five locations on the gene for all patients. We propose the use of maximum likelihood (ML) with a 2^6 multinomial distribution formed by cross-classifying the binary mutation status at the five locations by the binary disease cell type. This ML method includes all patients regardless of completeness of data, treats the locations not evaluated as missing data, and uses the EM algorithm to estimate the odds ratio between genetic mutation status and the disease type. We compare the ML method to complete case estimates, and a method used by clinical investigators, which excludes patients with data on less than five locations who have no mutations on these sites.

14.
BACKGROUND: Many cohort studies and clinical trials use repeated measurements of laboratory markers to track disease progression and to evaluate new therapies. A major problem in the analysis of such studies is that marker data are censored in some patients owing to withdrawal, loss to follow-up, or death. The objective of this paper is to evaluate the impact of selective dropouts attributable to death or disease progression on the estimates of marker change among different groups. METHODS: Data on CD4 cell count in human immunodeficiency virus 1-infected individuals from a clinical trial and a cohort study are used to illustrate this problem and a possible solution. Simulation studies are also presented. RESULTS: When the rate of dropout is greater in subjects whose marker status is declining rapidly, commonly used methods, like random effects models, that ignore informative dropouts lead to overoptimistic statements about the marker trends in all compared groups, because subjects with steeper marker drops tend to have shorter follow-up times and hence are weighted less in the estimation of the group rate of the average marker decline. CONCLUSIONS: The potential biases attributable to incomplete data require greater recognition in longitudinal studies. Sensitivity analyses to assess the effect of dropouts are important.

15.
Estimating causal effects in psychiatric clinical trials is often complicated by treatment non-compliance and missing outcomes. While new estimators have recently been proposed to address these problems, they do not allow for inclusion of continuous covariates. We propose estimators that adjust for continuous covariates in addition to non-compliance and missing data. Using simulations, we compare mean squared errors for the new estimators with those of previously established estimators. We then illustrate our findings in a study examining the efficacy of clozapine versus haloperidol in the treatment of refractory schizophrenia. For data with continuous or binary outcomes in the presence of non-compliance, non-ignorable missing data, and a covariate effect, the new estimators generally performed better than the previously established estimators. In the clozapine trial, the new estimators gave point and interval estimates similar to established estimators. We recommend the new estimators as they are unbiased even when outcomes are not missing at random and they are more efficient than established estimators in the presence of covariate effects under the widest variety of circumstances.

16.
Adjustment for baseline variables in a randomized trial can increase power to detect a treatment effect. However, when baseline data are partly missing, analysis of complete cases is inefficient. We consider various possible improvements in the case of normally distributed baseline and outcome variables. Joint modelling of baseline and outcome is the most efficient method. Mean imputation is an excellent alternative, subject to three conditions. Firstly, if baseline and outcome are correlated more than about 0.6 then weighting should be used to allow for the greater information from complete cases. Secondly, imputation should be carried out in a deterministic way, using other baseline variables if possible, but not using randomized arm or outcome. Thirdly, if baselines are not missing completely at random, then a dummy variable for missingness should be included as a covariate (the missing indicator method). The methods are illustrated in a randomized trial in community psychiatry.
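The mean-imputation recipe in abstract 16, a deterministic fill-in plus a missingness dummy entered as a covariate, can be sketched as follows (an illustrative helper, not the paper's code; the simplest variant, imputing the overall mean rather than predicting from other baselines):

```python
def mean_impute_with_indicator(baseline):
    """Deterministic mean imputation of a baseline covariate plus the
    'missing indicator' dummy, both to be entered into the analysis model.
    Deliberately ignores randomized arm and outcome, per the abstract."""
    observed = [v for v in baseline if v is not None]
    fill = sum(observed) / len(observed)
    imputed = [v if v is not None else fill for v in baseline]
    indicator = [1 if v is None else 0 for v in baseline]
    return imputed, indicator

# One baseline covariate with a single missing value.
values, miss = mean_impute_with_indicator([2.0, None, 4.0, 6.0])
```

The analysis model would then adjust for both `values` and `miss`, so that any systematic difference among subjects with missing baselines is absorbed by the indicator.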

17.
One difficulty in performing meta‐analyses of observational cohort studies is that the availability of confounders may vary between cohorts, so that some cohorts provide fully adjusted analyses while others only provide partially adjusted analyses. Commonly, analyses of the association between an exposure and disease either are restricted to cohorts with full confounder information, or use all cohorts but do not fully adjust for confounding. We propose using a bivariate random‐effects meta‐analysis model to use information from all available cohorts while still adjusting for all the potential confounders. Our method uses both the fully adjusted and the partially adjusted estimated effects in the cohorts with full confounder information, together with an estimate of their within‐cohort correlation. The method is applied to estimate the association between fibrinogen level and coronary heart disease incidence using data from 154 012 participants in 31 cohorts. (One hundred and ninety‐nine participants from the original 154 211 withdrew their consent and have been removed from this analysis.) Copyright © 2009 John Wiley & Sons, Ltd.

18.
Selection criteria are specified in clinical trials to define the study population from which the sample will be obtained. It is common for one of these criteria to be based on historical or baseline measurements of the clinical sign or symptom that will serve as the response variable in the trial. The effect of such selection criteria has been studied extensively for normally distributed responses, but less is known about the situation in which the response is a count or a possibly recurrent event. In this paper we examine the bias and relative efficiency of some common methods of analysis for count data in the presence of selection criteria. The investigation is carried out using asymptotic theory pertaining to misspecified models and by simulation. Applications involving data from an epilepsy trial and a study of transient myocardial ischaemia illustrate the effect of ignoring the selection mechanism.

19.
BACKGROUND: The E3N Study, 'Etude Epidémiologique auprès de femmes de la Mutuelle Générale de l'Education Nationale', is a cohort study aiming to study cancer risk factors in 100,000 women. Even if the incidence of problematic (missing, incoherent, etc.) data is low, any multivariate analysis based only on complete subjects would rely on too small a sample, which would not necessarily be representative of the studied population; results could thus be biased. METHODS: Our handling of problematic data includes, among other approaches, cold-deck imputation and deductive correction. RESULTS: We looked at the number of individuals on which an analysis of 19 variables could be undertaken. The management of missing data made an additional quarter of the cohort exploitable: 74.6% of individuals instead of 50.5%. Moreover, for 89.0% of subjects at most one variable (out of the 19 studied) had a missing datum. CONCLUSIONS: The main difficulty lies not so much in the choice and implementation of methods for dealing with problematic data as in the identification of their process of existence. Most of what was gained was due to the simplest methods: cold-deck and the deductive method.

20.
When studies in meta‐analysis include different sets of confounders, simple analyses can cause a bias (omitting confounders that are missing in certain studies) or precision loss (omitting studies with incomplete confounders, i.e. a complete‐case meta‐analysis). To overcome these types of issues, a previous study proposed modelling the high correlation between partially and fully adjusted regression coefficient estimates in a bivariate meta‐analysis. When multiple differently adjusted regression coefficient estimates are available, we propose exploiting such correlations in a graphical model. Compared with a previously suggested bivariate meta‐analysis method, such a graphical model approach is likely to reduce the number of parameters in complex missing data settings by omitting the direct relationships between some of the estimates. We propose a structure‐learning rule whose justification relies on the missingness pattern being monotone. This rule was tested using epidemiological data from a multi‐centre survey. In the analysis of risk factors for early retirement, the method showed a smaller difference from a complete data odds ratio and greater precision than a commonly used complete‐case meta‐analysis. Three real‐world applications with monotone missing patterns are provided, namely, the association between (1) the fibrinogen level and coronary heart disease, (2) the intima media thickness and vascular risk and (3) allergic asthma and depressive episodes. The proposed method allows for the inclusion of published summary data, which makes it particularly suitable for applications involving both microdata and summary data. Copyright © 2016 John Wiley & Sons, Ltd.
