1.
We propose a three-step multiple imputation method, implemented with a Gibbs sampler, for estimating parameters in non-linear mixed-effects models with missing covariates. Estimates obtained by the proposed multiple imputation method are compared to those obtained by the mean-value imputation method and the complete-case method through simulations. We find that the proposed multiple imputation method offers smaller biases and smaller mean-squared errors for the estimates of covariate coefficients compared to the other two methods. We apply the three missing data methods to modelling HIV viral dynamics from an AIDS clinical trial. We believe that the results from the proposed multiple imputation method are more reliable than those from the other two commonly used methods.
2.
A normal copula-based selection model is proposed for continuous longitudinal data with a non-ignorable non-monotone missing-data process. The normal copula is used to combine the distribution of the outcome of interest and that of the missing-data indicators given the covariates. Parameters in the model are estimated by a pseudo-likelihood method. We first use the GEE with a logistic link to estimate the parameters associated with the marginal distribution of the missing-data indicator given the covariates, assuming that covariates are always observed. Then we estimate other parameters by inserting the estimates from the first step into the full likelihood function. A simulation study is conducted to assess the robustness of the assumed model under different missing-data processes. The proposed method is then applied to one example from a community cohort study to demonstrate its capability to reduce bias.
3.
Jos Twisk Michiel de Boer Wieke de Vente Martijn Heymans 《Journal of clinical epidemiology》2013,66(9):1022-1028
Background and Objectives: As a result of the development of sophisticated techniques, such as multiple imputation, the interest in handling missing data in longitudinal studies has increased enormously in past years. Within the field of longitudinal data analysis, there is a current debate on whether it is necessary to use multiple imputations before performing a mixed-model analysis to analyze the longitudinal data. In the current study this necessity is evaluated. Study Design and Setting: The results of mixed-model analyses with and without multiple imputation were compared with each other. Four data sets with missing values were created: one data set with missing completely at random, two data sets with missing at random, and one data set with missing not at random. In all data sets, the relationships between a continuous outcome variable and two different covariates were analyzed: a time-independent dichotomous covariate and a time-dependent continuous covariate. Results: Although for all types of missing data, the results of the mixed-model analysis with or without multiple imputations were slightly different, they were not in favor of one of the two approaches. In addition, repeating the multiple imputations 100 times showed that the results of the mixed-model analysis with multiple imputation were quite unstable. Conclusion: It is not necessary to handle missing data using multiple imputations before performing a mixed-model analysis on longitudinal data.
4.
Most currently available methods for detecting discordant subjects and observations in linear mixed effects model fits adapt existing methods for single‐level regression data. The most common methods are generalizations of deletion‐based approaches, primarily Cook's distance. This article describes the limitations of modifications to Cook's distance and local influence, and suggests a new nondeletion subject‐level method, studentized residual sum of squares (TRSS) plots. We also suggest a new observation‐level deletion method that detects discordant observations as an application of TRSS plots. The proposed method provides greater information on repeated measurements by utilizing revised residuals and efficiently evaluating the effect of discordant subjects and observations on the estimation of parameters including variance components. We compare the performance of the proposed methods with current methods by using the orthodontic growth data: a longitudinal dataset with 27 subjects each observed four times. TRSS plots successfully identified discordant subjects that were missed by modified Cook's distance methods and the local influence approach. Extensions of TRSS plots are also described. Copyright © 2012 John Wiley & Sons, Ltd.
5.
Partial imputation approach to analysis of repeated measurements with dependent drop-outs (Cited by: 1; self-citations: 0; citations by others: 1)
In clinical trials repeated measurements of a response variable are usually taken at prespecified time-points to compare the treatment effects. However, the comparison of treatment effects is often complicated by missing data caused by the withdrawal of some patients before the end of the study (that is, drop-outs). When the drop-out process depends on the response variable of interest, ignoring missing data may lead to biased comparison of the treatment effect. In this paper, conditions for ignoring the dependent missingness are investigated and a new approach using the usual testing procedure based on data with partial carrying-forward imputation is proposed. The proposed approach is conceptually and practically simple, and is motivated by making incremental improvement on the familiar 'all available data' (AAD) approach and the 'last value carrying forward' (LVCF) approach, which are commonly used in data analysis with drop-outs by practitioners. It also compares favourably with the mixed-effect model approach with dependent drop-outs. Simulations and real data are used to evaluate and illustrate statistical properties of the proposed approach. The principle of the proposed approach can also be extended to using other imputation methods such as the multiple imputation.
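To make the two baseline approaches named in this abstract concrete, the sketch below contrasts 'all available data' (AAD) with 'last value carrying forward' (LVCF) on a tiny made-up data set. It does not implement the paper's partial carrying-forward method, whose details are not given in the abstract; the function names and the patient values are illustrative assumptions only.

```python
from typing import List, Optional


def lvcf(series: List[Optional[float]]) -> List[Optional[float]]:
    """Fill each missing visit (None) with the last observed value before it."""
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)  # stays None until a first value has been observed
    return filled


def available_mean(values: List[Optional[float]]) -> Optional[float]:
    """Mean over the observed (non-None) values only."""
    observed = [v for v in values if v is not None]
    return sum(observed) / len(observed) if observed else None


# Hypothetical responses at four visits; the last two patients drop out early.
patients = [
    [10.0, 8.0, 6.0, 4.0],
    [10.0, 9.0, None, None],
    [12.0, None, None, None],
]

# AAD: the final-visit mean uses only patients still in the study.
final_aad = available_mean([p[-1] for p in patients])

# LVCF: drop-outs contribute their last observed value to the final visit.
lvcf_patients = [lvcf(p) for p in patients]
final_lvcf = available_mean([p[-1] for p in lvcf_patients])

print(final_aad, final_lvcf)
```

In this toy example the two final-visit means differ markedly because the patients who dropped out had higher last-observed responses, which is the kind of dependence between drop-out and response that the abstract is concerned with.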
6.
Lindsey JK 《Statistics in medicine》2000,19(6):801-809
The most commonly used models for categorical repeated measurement data are log-linear models. Not only are they easy to fit with standard software but they include such useful models as Markov chains and graphical models. However, these are conditional models and one often also requires the marginal probabilities of responses, for example, at each time point in a longitudinal study. Here a simple method of matrix manipulation is used to derive the maximum likelihood estimates of the marginal probabilities from any such conditional categorical repeated measures model. The technique is applied to the classical Muscatine data set, taking into account the dependence of missingness on previous observed values, as well as serial dependence and a random effect.
7.
Longitudinal studies with repeated measures are often subject to non-response. Methods currently employed to alleviate the difficulties caused by missing data are typically unsatisfactory, especially when the cause of the missingness is related to the outcomes. We present an approach for incomplete categorical data in the repeated measures setting that allows missing data to depend on other observed outcomes for a study subject. The proposed methodology also allows a broader examination of study findings through interpretation of results in the framework of the set of all possible test statistics that might have been observed had no data been missing. The proposed approach consists of the following general steps. First, we generate all possible sets of missing values and form a set of possible complete data sets. We then weight each data set according to clearly defined assumptions and apply an appropriate statistical test procedure to each data set, combining the results to give an overall indication of significance. We make use of the EM algorithm and a Bayesian prior in this approach. While not restricted to the one-sample case, the proposed methodology is illustrated for one-sample data and compared to the common complete-case and available-case analysis methods.
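The general steps listed in this abstract (enumerate every possible completion of the missing values, compute a statistic on each completed data set, then combine with weights) can be sketched as follows. This is only a minimal illustration under loud assumptions: the outcome is binary, the statistic is a simple proportion, and the weights are uniform placeholders. The paper derives its weights with the EM algorithm and a Bayesian prior, which this sketch does not attempt; the names `completions` and `observed` are hypothetical.

```python
from itertools import product
from typing import Iterator, List, Optional, Sequence


def completions(
    data: Sequence[Optional[int]], categories: Sequence[int] = (0, 1)
) -> Iterator[List[int]]:
    """Yield every complete data set obtained by filling the None entries."""
    missing = [i for i, v in enumerate(data) if v is None]
    for fill in product(categories, repeat=len(missing)):
        completed = list(data)
        for idx, val in zip(missing, fill):
            completed[idx] = val
        yield completed


# Hypothetical one-sample binary responses; None marks a missing response.
observed = [1, 0, None, 1, None]

# Statistic on each completed data set: the proportion of positive responses.
stats = [sum(c) / len(c) for c in completions(observed)]

# Placeholder weights (assumption): every completion is equally plausible.
weights = [1.0 / len(stats)] * len(stats)
weighted_stat = sum(w * s for w, s in zip(weights, stats))

# The range shows how far the result could move had no data been missing.
print(min(stats), max(stats), weighted_stat)
```

With k missing cells and c categories there are c**k completions, so exhaustive enumeration of this kind is only practical for small amounts of missing data.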
8.
This paper studies a non-response problem in survival analysis where the occurrence of missing data in the risk factor is related to mortality. In a study to determine the influence of blood pressure on survival in the very old (85+ years), blood pressure measurements are missing in about 12.5 per cent of the sample. The available data suggest that the process that created the missing data depends jointly on survival and the unknown blood pressure, thereby distorting the relation of interest. Multiple imputation is used to impute missing blood pressure and then analyse the data under a variety of non-response models. One special modelling problem is treated in detail: the construction of a predictive model for drawing imputations if the number of variables is large. Risk estimates for these data appear robust to even large departures from the simplest non-response model, and are similar to those derived under deletion of the incomplete records.
9.
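Generalized linear mixed-effects model analysis of ordinal multi-category repeated measures data (Cited by: 1; self-citations: 0; citations by others: 1)
Objective: To explore the application of generalized linear mixed-effects models to the analysis of ordinal multi-category repeated measures data, and their implementation with the GLIMMIX and NLMIXED procedures in SAS 9.1. Methods: To evaluate the clinical efficacy of a new drug for diabetic neuropathy, a randomized, double-blind, placebo-controlled clinical trial was conducted. At each follow-up visit, each subject's total score for subjective neuropathy symptoms was recorded, and efficacy was rated according to the score reduction rate. A generalized linear mixed-effects model was built, and its parameters were estimated by both a linearization method and a numerical integration approximation method, implemented with the GLIMMIX and NLMIXED procedures in SAS. Results: The two estimation methods gave very similar results. The between-group difference in efficacy was statistically significant (P < 0.0001), with the treatment group superior to the placebo group; the differences in efficacy between treatment courses were statistically significant (P < 0.0001), with longer treatment giving better efficacy; the pre-treatment total score for subjective neuropathy symptoms influenced efficacy (P = 0.0613, close to the significance level): the higher the score, the more likely a cure, suggesting that patients with severe disease responded better than patients with mild disease. In addition, the numerical integration approximation method provided statistical significance tests for the random intercept and random slope. Conclusion: Using generalized linear mixed-effects models to analyze ordinal multi-category repeated measures clinical data allows a more objective evaluation of drug efficacy.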
10.
Zosia Beckles Sarah Glover Joanna Ashe Sarah Stockton Janette Boynton Rosalind Lai Philip Alderson 《Journal of clinical epidemiology》2013,66(9):1051-1057
Objectives: This study aims to quantify the unique useful yield from the Cumulative Index to Nursing and Allied Health Literature (CINAHL) database to National Institute for Health and Clinical Excellence (NICE) clinical guidelines. A secondary objective is to investigate the relationship between this yield and different clinical question types. It is hypothesized that the unique useful yield from CINAHL is low, and this database can therefore be relegated to selective rather than routine searching. Study Design and Setting: A retrospective sample of 15 NICE guidelines published between 2005 and 2009 was taken. Information on clinical review question type, number of references, and reference source was extracted. Results: Only 0.33% (95% confidence interval: 0.01–0.64%) of references per guideline were unique to CINAHL. Nursing- or allied health (AH)–related questions were nearly three times as likely to have references unique to CINAHL as non–nursing- or AH-related questions (14.89% vs. 5.11%), and this relationship was found to be significant (P < 0.05). No significant relationship was found between question type and unique CINAHL yield for drug-related questions. Conclusions: The very low proportion of references unique to CINAHL strongly suggests that this database can be safely relegated to selective rather than routine searching. Nursing- and AH-related questions would benefit from selective searching of CINAHL.
11.
Multiple imputation versus data enhancement for dealing with missing data in observational health care outcome analyses. (Cited by: 3; self-citations: 0; citations by others: 3)
Peter D Faris William A Ghali Rollin Brant Colleen M Norris P Diane Galbraith Merril L Knudtson 《Journal of clinical epidemiology》2002,55(2):184-191
The problem of missing data is frequently encountered in observational studies. We compared approaches to dealing with missing data. Three multiple imputation methods were compared with a method of enhancing a clinical database through merging with administrative data. The clinical database used for comparison contained information collected from 6,065 cardiac care patients in 1995 in the province of Alberta, Canada. The effectiveness of the different strategies was evaluated using measures of discrimination and goodness of fit for the 1995 data. The strategies were further evaluated by examining how well the models predicted outcomes in data collected from patients in 1996. In general, the different methods produced similar results, with one of the multiple imputation methods demonstrating a slight advantage. It is concluded that the choice of missing data strategy should be guided by statistical expertise and data resources.
12.
BACKGROUND AND OBJECTIVE: Epidemiologic studies commonly estimate associations between predictors (risk factors) and outcome. Most software automatically excludes subjects with missing values. This commonly causes bias because missing values seldom occur completely at random (MCAR) but rather selectively based on other (observed) variables, missing at random (MAR). Multiple imputation (MI) of missing predictor values using all observed information including outcome is advocated to deal with selective missing values. This seems a self-fulfilling prophecy. METHODS: We tested this hypothesis using data from a study on diagnosis of pulmonary embolism. We selected five predictors of pulmonary embolism without missing values. Their regression coefficients and standard errors (SEs) estimated from the original sample were considered as "true" values. We assigned missing values to these predictors (both MCAR and MAR) and repeated this 1,000 times using simulations. Per simulation we multiply imputed the missing values without and with the outcome, and compared the regression coefficients and SEs to the truth. RESULTS: Regression coefficients based on MI including outcome were close to the truth. MI without outcome yielded very biased (underestimated) coefficients. SEs and coverage of the 90% confidence intervals were not different between MI with and without outcome. Results were the same for MCAR and MAR. CONCLUSION: For all types of missing values, imputation of missing predictor values using the outcome is preferred over imputation without outcome and is no self-fulfilling prophecy.
13.
Lauren J. Beesley Jonathan W. Bartlett Gregory T. Wolf Jeremy M. G. Taylor 《Statistics in medicine》2016,35(26):4701-4717
We explore several approaches for imputing partially observed covariates when the outcome of interest is a censored event time and when there is an underlying subset of the population that will never experience the event of interest. We call these subjects ‘cured’, and we consider the case where the data are modeled using a Cox proportional hazards (CPH) mixture cure model. We study covariate imputation approaches using fully conditional specification. We derive the exact conditional distribution and suggest a sampling scheme for imputing partially observed covariates in the CPH cure model setting. We also propose several approximations to the exact distribution that are simpler and more convenient to use for imputation. A simulation study demonstrates that the proposed imputation approaches outperform existing imputation approaches for survival data without a cure fraction in terms of bias in estimating CPH cure model parameters. We apply our multiple imputation techniques to a study of patients with head and neck cancer. Copyright © 2016 John Wiley & Sons, Ltd.
14.
Multiple imputation to account for missing data in a survey: estimating the prevalence of osteoporosis (Cited by: 6; self-citations: 0; citations by others: 6)
BACKGROUND: Nonresponse bias is a concern in any epidemiologic survey in which a subset of selected individuals declines to participate. METHODS: We reviewed multiple imputation, a widely applicable and easy to implement Bayesian methodology to adjust for nonresponse bias. To illustrate the method, we used data from the Canadian Multicentre Osteoporosis Study, a large cohort study of 9423 randomly selected Canadians, designed in part to estimate the prevalence of osteoporosis. Although subjects were randomly selected, only 42% of individuals who were contacted agreed to participate fully in the study. The study design included a brief questionnaire for those invitees who declined further participation in order to collect information on the major risk factors for osteoporosis. These risk factors (which included age, sex, previous fractures, family history of osteoporosis, and current smoking status) were then used to estimate the missing osteoporosis status for nonparticipants using multiple imputation. Both ignorable and nonignorable imputation models are considered. RESULTS: Our results suggest that selection bias in the study is of concern, but only slightly, in very elderly (age 80+ years), both women and men. CONCLUSIONS: Epidemiologists should consider using multiple imputation more often than is current practice.
15.
Andreea Monica Rawlings Yingying Sang Albert Richey Sharrett Josef Coresh Michael Griswold Anna Maria Kucharska-Newton Priya Palta Lisa Miller Wruck Alden Lawrence Gross Jennifer Anne Deal Melinda Carolyn Power Karen Jean Bandeen-Roche 《European journal of epidemiology》2017,32(1):55-66
Longitudinal studies of cognitive performance are sensitive to dropout, as participants experiencing cognitive deficits are less likely to attend study visits, which may bias estimated associations between exposures of interest and cognitive decline. Multiple imputation is a powerful tool for handling missing data; however, its use for missing cognitive outcome measures in longitudinal analyses remains limited. We use multiple imputation by chained equations (MICE) to impute cognitive performance scores of participants who did not attend the 2011–2013 exam of the Atherosclerosis Risk in Communities Study. We examined the validity of imputed scores using observed and simulated data under varying assumptions. We examined differences in the estimated association between diabetes at baseline and 20-year cognitive decline with and without imputed values. Lastly, we discuss how different analytic methods (mixed models and models fit using generalized estimating equations) and choice of for whom to impute result in different estimands. Validation using observed data showed MICE produced unbiased imputations. Simulations showed a substantial reduction in the bias of the 20-year association between diabetes and cognitive decline comparing MICE (3–4 % bias) to analyses of available data only (16–23 % bias) in a construct where missingness was strongly informative but realistic. Associations between diabetes and 20-year cognitive decline were substantially stronger with MICE than in available-case analyses. Our study suggests when informative data are available for non-examined participants, MICE can be an effective tool for imputing cognitive performance and improving assessment of cognitive decline, though careful thought should be given to target imputation population and analytic model chosen, as they may yield different estimands.
16.
We propose a nonparametric approach for cumulative incidence estimation when causes of failure are unknown or missing for some subjects. Under the missing at random assumption, we estimate the cumulative incidence function using multiple imputation methods. We develop asymptotic theory for the cumulative incidence estimators obtained from multiple imputation methods. We also discuss how to construct confidence intervals for the cumulative incidence function and perform a test for comparing the cumulative incidence functions in two samples with missing cause of failure. Through simulation studies, we show that the proposed methods perform well. The methods are illustrated with data from a randomized clinical trial in early stage breast cancer. Copyright © 2014 John Wiley & Sons, Ltd.
17.
Benmei Liu Mandi Yu Barry I. Graubard Richard P. Troiano Nathaniel Schenker 《Statistics in medicine》2016,35(28):5170-5188
The Physical Activity Monitor component was introduced into the 2003–2004 National Health and Nutrition Examination Survey (NHANES) to collect objective information on physical activity including both movement intensity counts and ambulatory steps. Because of an error in the accelerometer device initialization process, the steps data were missing for all participants in several primary sampling units, typically a single county or group of contiguous counties, who had intensity count data from their accelerometers. To avoid potential bias and loss in efficiency in estimation and inference involving the steps data, we considered methods to accurately impute the missing values for steps collected in the 2003–2004 NHANES. The objective was to come up with an efficient imputation method that minimized model‐based assumptions. We adopted a multiple imputation approach based on additive regression, bootstrapping and predictive mean matching methods. This method fits alternative conditional expectation (ace) models, which use an automated procedure to estimate optimal transformations for both the predictor and response variables. This paper describes the approaches used in this imputation and evaluates the methods by comparing the distributions of the original and the imputed data. A simulation study using the observed data is also conducted as part of the model diagnostics. Finally, some real data analyses are performed to compare the before and after imputation results. Published 2016. This article is a U.S. Government work and is in the public domain in the USA.
18.
Review: a gentle introduction to imputation of missing values (Cited by: 1; self-citations: 0; citations by others: 1)
Donders AR van der Heijden GJ Stijnen T Moons KG 《Journal of clinical epidemiology》2006,59(10):1087-1091
In most situations, simple techniques for handling missing data (such as complete case analysis, overall mean imputation, and the missing-indicator method) produce biased results, whereas imputation techniques yield valid results without complicating the analysis once the imputations are carried out. Imputation techniques are based on the idea that any subject in a study sample can be replaced by a new randomly chosen subject from the same source population. Imputation of missing data on a variable is replacing that missing value with a value that is drawn from an estimate of the distribution of this variable. In single imputation, only one estimate is used. In multiple imputation, various estimates are used, reflecting the uncertainty in the estimation of this distribution. Under the general conditions of so-called missing at random and missing completely at random, both single and multiple imputations result in unbiased estimates of study associations. But single imputation results in too small estimated standard errors, whereas multiple imputation results in correctly estimated standard errors and confidence intervals. In this article we explain why all this is the case, and use a simple simulation study to demonstrate our explanations. We also explain and illustrate why two frequently used methods to handle missing data, i.e., overall mean imputation and the missing-indicator method, almost always result in biased estimates.
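The contrast this abstract draws between single and multiple imputation standard errors can be illustrated with Rubin's rules, the standard way of pooling estimates across imputed data sets. The abstract itself does not name the pooling formula, so treating Rubin's rules as the mechanism is an assumption here, and the five estimates and variances below are invented for illustration.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(
    estimates: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float]:
    """Pool per-imputation estimates and squared standard errors.

    Returns (pooled estimate, pooled standard error) using
    total variance = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)
    within = mean(variances)        # average sampling variance
    between = variance(estimates)   # sample variance across imputations (m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical regression coefficient from m = 5 imputed data sets,
# each with standard error 0.2 (variance 0.04).
estimates = [1.0, 1.2, 0.8, 1.1, 0.9]
variances = [0.04] * 5

pooled_estimate, pooled_se = pool_rubin(estimates, variances)
single_imputation_se = math.sqrt(variances[0])

print(pooled_estimate, pooled_se, single_imputation_se)
```

The pooled standard error exceeds the standard error from any single imputed data set because the between-imputation term carries the uncertainty about the imputed values, which a single imputation ignores.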
19.
Schenker N Borrud LG Burt VL Curtin LR Flegal KM Hughes J Johnson CL Looker AC Mirel L 《Statistics in medicine》2011,30(3):260-276
In 1999, dual-energy x-ray absorptiometry (DXA) scans were added to the National Health and Nutrition Examination Survey (NHANES) to provide information on soft tissue composition and bone mineral content. However, in 1999-2004, DXA data were missing in whole or in part for about 21 per cent of the NHANES participants eligible for the DXA examination; and the missingness is associated with important characteristics such as body mass index and age. To handle this missing-data problem, multiple imputation of the missing DXA data was performed. Several features made the project interesting and challenging statistically, including the relationship between missingness on the DXA measures and the values of other variables; the highly multivariate nature of the variables being imputed; the need to transform the DXA variables during the imputation process; the desire to use a large number of non-DXA predictors, many of which had small amounts of missing data themselves, in the imputation models; the use of lower bounds in the imputation procedure; and relationships between the DXA variables and other variables, which helped both in creating and evaluating the imputations. This paper describes the imputation models, methods, and evaluations for this publicly available data resource and demonstrates properties of the imputations via examples of analyses of the data. The analyses suggest that imputation helps to correct biases that occur in estimates based on the data without imputation, and that it helps to increase the precision of estimates as well. Moreover, multiple imputation usually yields larger estimated standard errors than those obtained with single imputation.
20.
Matthieu Resche‐Rigon Ian R. White Jonathan W. Bartlett Sanne A.E. Peters Simon G. Thompson 《Statistics in medicine》2013,32(28):4890-4905
A variable is ‘systematically missing’ if it is missing for all individuals within particular studies in an individual participant data meta‐analysis. When a systematically missing variable is a potential confounder in observational epidemiology, standard methods either fail to adjust the exposure–disease association for the potential confounder or exclude studies where it is missing. We propose a new approach to adjust for systematically missing confounders based on multiple imputation by chained equations. Systematically missing data are imputed via multilevel regression models that allow for heterogeneity between studies. A simulation study compares various choices of imputation model. An illustration is given using data from eight studies estimating the association between carotid intima media thickness and subsequent risk of cardiovascular events. Results are compared with standard methods and also with an extension of a published method that exploits the relationship between fully adjusted and partially adjusted estimated effects through a multivariate random effects meta‐analysis model. We conclude that multiple imputation provides a practicable approach that can handle arbitrary patterns of systematic missingness. Bias is reduced by including sufficient between‐study random effects in the imputation model. Copyright © 2013 John Wiley & Sons, Ltd.