Similar Literature
1.
BACKGROUND: Using an application and a simulation study, we show the bias induced by missing outcome data in longitudinal studies and discuss suitable statistical methods according to the type of missing response when the variable under study is Gaussian. METHODS: The model used for the analysis of Gaussian longitudinal data is the linear mixed effects model. When the probability of response depends neither on the missing values of the outcome nor on the parameters of the linear model, the missing data are ignorable, and the parameters of the linear mixed effects model can be estimated by maximum likelihood using standard software. When the missing data are non-ignorable, several methods have been proposed. We describe the method of Diggle and Kenward (1994) (the DK method), for which software is available. This model combines a linear mixed effects model for the outcome with a logistic model for the probability of response, which depends on the outcome. RESULTS: A simulation study shows the efficiency of this method and its limitations when the data are not normal. In that case, estimators obtained by the DK approach may be more biased than estimators obtained under the hypothesis of ignorable missing data, even when the missing data are in fact non-ignorable. Data from the Paquid cohort on the evolution of scores on a neuropsychological test among elderly subjects illustrate the bias of a naive analysis using all available data. Although the missing responses are not ignorable in this study, estimates of the linear mixed effects model differ little between the DK approach and the analysis under the hypothesis of ignorable missing data. CONCLUSION: Statistical methods for longitudinal data with non-ignorable missing responses are sensitive to hypotheses that are difficult to verify. In practice, it is therefore preferable to perform an analysis under the hypothesis of ignorable missing responses and to compare the results with those obtained from several approaches for non-ignorable missing data. Such a strategy, however, requires the development of new software.
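The ignorable-missingness analysis described above can be sketched in a few lines: fit a linear mixed effects model by maximum likelihood to all available observations. The Diggle-Kenward joint model itself is not available in standard Python libraries, so this sketch shows only the ignorable half of the comparison; the simulated cohort, column names, and drop-out model are illustrative assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Simulate a small cohort: random intercepts, linear decline in score.
n, waves = 200, 4
subj = np.repeat(np.arange(n), waves)
time = np.tile(np.arange(waves), n)
b0 = rng.normal(0, 2, n)                       # subject-level random intercepts
score = 25 + b0[subj] - 0.8 * time + rng.normal(0, 1, n * waves)
df = pd.DataFrame({"id": subj, "time": time, "score": score})

# Ignorable (MAR) missingness: the chance of skipping a visit depends on
# the previous *observed* score, not on the current unobserved one.
prev = df.groupby("id")["score"].shift(1)
p_miss = 1 / (1 + np.exp(-(20 - prev.fillna(25)) / 2))
df.loc[(rng.random(len(df)) < p_miss) & (df["time"] > 0), "score"] = np.nan

# Maximum likelihood fit of a random-intercept model on all available rows;
# valid under ignorable missingness. The slope should be close to -0.8.
fit = smf.mixedlm("score ~ time", data=df.dropna(), groups="id").fit(reml=False)
print(fit.params)
```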

2.
In surveys with multiple waves of follow-up, nonrespondents to the first wave are sometimes followed intensively, but this does not guarantee an increase in the response rate or an appreciable change in the estimate of interest. Most prior research has focused on stopping rules for Phase I clinical trials; to our knowledge there are no standard methods for stopping follow-up in observational studies. Previous research suggests optimal stopping strategies in which decisions are based on achieving a given precision for minimum cost, or on reducing cost for a given precision. In this paper, we propose three stopping rules based on assessing whether successive waves of sampling provide evidence that the parameter of interest is changing. Two of the rules rely on examining patterns of observed responses, while the third uses missing data methods to multiply impute the missing responses. We also present results from a simulation study evaluating the proposed methods. Our simulations suggest that rules that adjust for nonresponse are preferred for decisions to discontinue follow-up, since they reduce bias in the estimate of interest. The rules are not complicated and can be applied in a straightforward manner. Discontinuing follow-up would save time and possibly resources, and adjusting for the nonresponse in the analysis would reduce the impact of nonresponse bias.
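A minimal sketch of the general idea behind such stopping rules: after each wave, update the running estimate and stop once an additional wave no longer moves it appreciably. The wave sizes and tolerance below are illustrative assumptions, not the paper's rules (which include a multiple-imputation variant).

```python
import numpy as np

rng = np.random.default_rng(1)

# Responses collected at successive follow-up waves (wave 0 = initial sample;
# later waves are increasingly intensive follow-up of nonrespondents).
waves = [rng.normal(50, 10, size) for size in (400, 120, 60, 30, 15)]
tol = 0.5                     # stop when another wave moves the estimate < 0.5

pooled = np.array([])
prev_est = None
for w, resp in enumerate(waves):
    pooled = np.concatenate([pooled, resp])
    est = pooled.mean()
    if prev_est is not None and abs(est - prev_est) < tol:
        print(f"stop after wave {w}: estimate stable at {est:.2f}")
        break
    prev_est = est
else:
    print(f"all waves used; final estimate {pooled.mean():.2f}")
```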

3.
When missing data occur in one or more covariates in a regression model, multiple imputation (MI) is widely advocated as an improvement over complete-case analysis (CC). We use theoretical arguments and simulation studies to compare these methods, with MI implemented under a missing at random assumption. When data are missing completely at random, both methods have negligible bias, and MI is more efficient than CC across a wide range of scenarios. For other missing data mechanisms, bias arises in one or both methods. In our simulation setting, CC is biased towards the null when data are missing at random. However, when missingness is independent of the outcome given the covariates, CC has negligible bias and MI is biased away from the null. With more general missing data mechanisms, bias tends to be smaller for MI than for CC. Since MI is not always better than CC for missing covariate problems, the choice of method should take into account what is known about the missing data mechanism in a particular substantive application. Importantly, the choice of method should not be based on a comparison of standard errors. We propose new ways to understand empirical differences between MI and CC, which may provide insight into the appropriateness of the assumptions underlying each method, and we propose a new index for assessing the likely gain in precision from MI: the fraction of incomplete cases among the observed values of a covariate (FICO). Copyright © 2010 John Wiley & Sons, Ltd.
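The FICO index is simple to compute from the paper's definition: among the rows where a covariate is observed, the fraction that are incomplete cases. A minimal sketch with pandas; the toy data frame and column names are assumptions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age":    [34, 51, np.nan, 47, 29, np.nan, 62, 55],
    "bmi":    [22.1, np.nan, 27.3, np.nan, 24.8, 30.2, np.nan, 26.0],
    "smoker": [0, 1, 1, np.nan, 0, 1, 0, 0],
})

incomplete = df.isna().any(axis=1)      # a case is incomplete if anything is missing
for col in df.columns:
    observed = df[col].notna()          # rows where this covariate is observed
    fico = incomplete[observed].mean()  # fraction of those that are incomplete
    print(f"FICO({col}) = {fico:.2f}")
```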

4.
Background: Several studies of long-term adjustment in childhood cancer survivors (CCS) report very positive outcomes, while other studies find significant adjustment problems. These inconsistencies have prompted some investigators to suggest that survivors may be biased responders, prone to underreporting on self-report measures. This study tested the hypothesis that CCS are elevated on self-deception response bias (SDRB), and that SDRB is associated with higher ratings of quality of life (QOL). Methods: One hundred and seven adult (mean age = 31.85) survivors of childhood cancers completed a demographic questionnaire, the Short Form-12 (SF-12), the Functional Assessment of Cancer Therapy-General (FACT-G), and the Self-Deception Enhancement scale (SDE), an SDRB measure. Results: Survivors' QOL scores were similar to those of normative groups, but they evidenced much higher levels of response bias. SDE scores were significantly correlated with the FACT-G and SF-12 Mental Health (but not Physical Health) scores, even after accounting for demographic and treatment-related variables. Conclusions: CCS show a biased response style, indicating a systematic tendency to deny difficulties on QOL measures. This may complicate QOL studies by inflating survivors' reports of their socio-emotional functioning. Understanding how response bias develops may help us learn more about cancer survivors' adaptation to illness, and the effects of the illness experience on their perceptions of QOL.

5.
Fairclough D.L., Gagnon D.D., Zagari M.J., Marschner N., Dicato M. Quality of Life Research, 2003, 12(8): 1013-1027.
Quality of life (QOL) endpoints from a randomized, placebo-controlled trial in which anemic cancer patients treated with nonplatinum-containing chemotherapy received epoetin alfa or placebo were subjected to a sensitivity analysis. Three QOL instruments were used: the Functional Assessment of Cancer Therapy-Anemia (FACT-An), the Cancer Linear Analog Scale (CLAS), and the Medical Outcomes Study Short Form-36 (SF-36). The seven primary endpoints chosen a priori for analysis were: the Functional Assessment of Cancer Therapy-General (FACT-G) total score, the FACT-An fatigue subscale, CLAS energy, CLAS daily activities, CLAS overall QOL, and the SF-36 physical and mental component summary scales. Lower QOL scores were reported by patients who discontinued early, suggesting a nonrandom dropout process. Significant correlations (ranging from 0.37 to 0.77) between individual rates of change and the time to early termination of therapy or death supported this conclusion. Estimates of within-treatment-arm QOL change over time are more conservative under the missing not at random (MNAR) assumption than the more optimistic estimates obtained under the assumption that missing QOL data are missing at random (MAR). However, the between-treatment-arm comparisons were consistent across analyses, demonstrating statistically significant differences in favor of the epoetin alfa arm for four of the seven outcome measures.
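A minimal sketch of one common way to carry out such an MAR-versus-MNAR sensitivity analysis is a delta-based pattern-mixture adjustment: impute the post-dropout change under MAR, then shift the imputations by a range of deltas and watch how the estimated mean change moves. This illustrates the general device only, not the specific models of this trial; the simulated data and deltas are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
baseline = rng.normal(70, 10, n)
change = rng.normal(-5, 6, n)                 # true mean change: -5
followup = baseline + change

# MNAR drop-out: patients with the worst trajectories drop out more often.
p_drop = 1 / (1 + np.exp((change + 5) / 2))
completer = rng.random(n) > p_drop

mar_change = (followup - baseline)[completer].mean()
for delta in (0.0, -2.0, -5.0):
    # delta = 0 reproduces the MAR estimate; more negative deltas assume
    # dropouts did progressively worse than completers (MNAR).
    imputed = np.where(completer, followup - baseline, mar_change + delta)
    print(f"delta={delta:+.1f}: estimated mean change {imputed.mean():+.2f}")
```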

6.
BACKGROUND: Multiple imputation is becoming increasingly popular for handling missing data. However, it is often implemented without adequate consideration of whether it offers any advantage over complete case analysis for the research question of interest, or whether potential gains may be offset by bias from a poorly fitting imputation model, particularly as the amount of missing data increases. METHODS: Simulated datasets (n = 1000) drawn from a synthetic population were used to explore the information recovered by multiple imputation when estimating the coefficient of a binary exposure variable while various proportions of data (10-90%) were set missing at random, either in a highly skewed continuous covariate or in the binary exposure itself. Imputation was performed using multivariate normal imputation (MVNI), with a simple or zero-skewness log transformation to manage the non-normality. Bias, precision, mean squared error, and coverage for a set of regression parameter estimates were compared between multiple imputation and complete case analyses. RESULTS: For missingness in the continuous covariate, multiple imputation produced less bias and greater precision for the effect of the binary exposure variable than complete case analysis, with larger gains in precision as the amount of missing data increased. However, even with only moderate missingness, large bias and substantial under-coverage were apparent in estimating the continuous covariate's own effect when its skewness was not adequately addressed. For missingness in the binary covariate, all estimates had negligible bias, but gains in precision from multiple imputation were minimal, particularly for the coefficient of the binary exposure. CONCLUSIONS: Although multiple imputation can be useful if covariates required for confounding adjustment are missing, the benefits are likely to be minimal when data are missing in the exposure variable of interest. Furthermore, when there are large amounts of missingness, multiple imputation can become unreliable and introduce bias not present in a complete case analysis if the imputation model is not appropriate. Epidemiologists dealing with missing data should keep in mind the potential limitations as well as the potential benefits of multiple imputation. Further work is needed to provide clearer guidelines on the effective application of this method.
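A hedged sketch of the transform-then-impute step for a skewed covariate: log-transform it, impute on the transformed scale, and back-transform. scikit-learn's IterativeImputer with sample_posterior=True stands in here for a dedicated MVNI implementation; the simulated data and column names are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(3)
n = 1000
exposure = rng.binomial(1, 0.4, n)
skewed = np.exp(rng.normal(1.0 + 0.5 * exposure, 0.8, n))  # log-normal covariate
y = 1.0 + 0.5 * exposure + 0.3 * np.log(skewed) + rng.normal(0, 1, n)
df = pd.DataFrame({"exposure": exposure, "skewed": skewed, "y": y})
df.loc[rng.random(n) < 0.3, "skewed"] = np.nan             # 30% missing (MCAR)

df["log_skewed"] = np.log(df["skewed"])                    # impute on log scale
cols = ["exposure", "log_skewed", "y"]
imputations = []
for m in range(5):                                         # 5 imputed datasets
    imp = IterativeImputer(sample_posterior=True, random_state=m)
    comp = pd.DataFrame(imp.fit_transform(df[cols]), columns=cols)
    comp["skewed"] = np.exp(comp["log_skewed"])            # back-transform
    imputations.append(comp)
print(imputations[0].head())
```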

7.
Overcoming bias due to confounding and missing data is challenging when analyzing observational data. Propensity scores are commonly used to account for the former and multiple imputation for the latter. Unfortunately, it is not known how best to proceed when both techniques are required. We investigate whether two different approaches to combining propensity scores and multiple imputation (Across and Within) lead to differences in the accuracy or precision of exposure effect estimates. Both approaches start by imputing missing values multiple times. Propensity scores are then estimated for each resulting dataset. In the Across approach, each subject's propensity score is averaged across imputations and used in a single subsequent analysis. In the Within approach, the propensity scores are used individually to obtain an exposure effect estimate in each imputation, and these estimates are combined to produce an overall estimate. The approaches were compared in a series of Monte Carlo simulations and applied to data from the British Society for Rheumatology Biologics Register. Results indicated that the Within approach produced unbiased estimates with appropriate confidence intervals, whereas the Across approach produced biased results and unrealistic confidence intervals. Researchers are encouraged to implement the Within approach when conducting propensity score analyses with incomplete data.
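A sketch of the two combining strategies under assumptions about the data-generating model (true exposure effect 1.0, one confounder with 30% of values missing); using the propensity score as a regression covariate is one of several possible PS analyses and is chosen here purely for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(4)
n = 2000
x = rng.normal(0, 1, n)                               # confounder
treat = rng.binomial(1, 1 / (1 + np.exp(-x)))         # exposure depends on x
y = 1.0 * treat + 1.5 * x + rng.normal(0, 1, n)       # true effect = 1.0
x_obs = np.where(rng.random(n) < 0.3, np.nan, x)      # 30% missing confounder
df = pd.DataFrame({"x": x_obs, "treat": treat, "y": y})

M, effects, variances, ps_stack = 10, [], [], []
for m in range(M):
    imp = IterativeImputer(sample_posterior=True, random_state=m)
    comp = pd.DataFrame(imp.fit_transform(df), columns=df.columns)
    ps = sm.Logit(comp["treat"], sm.add_constant(comp["x"])).fit(disp=0).predict()
    ps_stack.append(ps)
    # Within: analyse this imputation on its own, adjusting for its PS.
    X = sm.add_constant(np.column_stack([comp["treat"], ps]))
    res = sm.OLS(comp["y"].to_numpy(), X).fit()
    effects.append(res.params[1])
    variances.append(res.bse[1] ** 2)

# Rubin's rules for the Within approach.
qbar = float(np.mean(effects))
t_var = np.mean(variances) + (1 + 1 / M) * np.var(effects, ddof=1)
print(f"Within: effect {qbar:.3f} (SE {np.sqrt(t_var):.3f})")

# Across: average each subject's PS over imputations, then one analysis.
ps_avg = np.mean(ps_stack, axis=0)
X = sm.add_constant(np.column_stack([df["treat"], ps_avg]))
res = sm.OLS(df["y"].to_numpy(), X).fit()
print(f"Across: effect {res.params[1]:.3f} (SE {res.bse[1]:.3f})")
```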

8.
Background: Longitudinal assessments of quality of life are needed to measure changes over the course of a disease and its treatment. Computer versions of quality of life instruments have increased the feasibility of obtaining longitudinal measurements. However, there remain occasions when patients are not able to complete these questionnaires. This study examined whether the changes measured using a computer version of the Functional Assessment of Cancer Therapy-General (FACT-G) on two occasions would also be obtained if patients completed a paper version on one of the two occasions.

9.
Objective: Meta-analysis yields a biased result if published studies represent a biased selection of the evidence. Copas proposed a selection model to assess the sensitivity of meta-analysis conclusions to possible selection bias. An alternative proposal is the trim-and-fill method. This article reports an empirical comparison of the two methods. Study Design and Setting: We took 157 meta-analyses with binary outcomes, analyzed each one using both methods, and then performed an automated comparison of the results. We compared the treatment estimates, standard errors, associated P-values, and numbers of missing studies estimated by both methods. Results: Both methods give similar point estimates, but standard errors and P-values are systematically larger for the trim-and-fill method. Furthermore, P-values from the trim-and-fill method are typically larger than those from the usual random effects model when no selection bias is detected. By contrast, P-values from the Copas selection model and the usual random effects model are similar in this setting. The trim-and-fill method reports more missing studies than the Copas selection model, unless selection bias is detected, in which case the position is reversed. Conclusions: The assumption that the most extreme studies are missing leads to excessively conservative inference in practice for the trim-and-fill method. The Copas selection model appears to be the preferable approach.
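A toy sketch of the "fill" step that drives this behaviour: mirror the k0 most extreme effects around the pooled estimate and re-run a DerSimonian-Laird random effects meta-analysis. Real trim-and-fill estimates k0 from funnel-plot asymmetry rather than fixing it, and the Copas model replaces filling with an explicit selection model; the simulated effects and the fixed k0 below are assumptions.

```python
import numpy as np

def dersimonian_laird(y, v):
    """Random effects pooled estimate and its standard error."""
    w = 1 / v
    fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - fixed) ** 2)
    tau2 = max(0.0, (q - (len(y) - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
    w_re = 1 / (v + tau2)
    return np.sum(w_re * y) / np.sum(w_re), np.sqrt(1 / np.sum(w_re))

rng = np.random.default_rng(5)
k = 15
v = rng.uniform(0.02, 0.2, k)                   # within-study variances
y = rng.normal(0.3, np.sqrt(v + 0.05))          # observed log odds ratios

est, se = dersimonian_laird(y, v)
print(f"observed studies:  {est:.3f} (SE {se:.3f})")

k0 = 3                                          # assumed number of missing studies
extreme = np.argsort(y)[-k0:]                   # most extreme on the high side
y_fill = np.concatenate([y, 2 * est - y[extreme]])  # mirror about pooled estimate
v_fill = np.concatenate([v, v[extreme]])
est_f, se_f = dersimonian_laird(y_fill, v_fill)
print(f"after filling {k0}: {est_f:.3f} (SE {se_f:.3f})")
```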

10.
We propose a propensity score-based multiple imputation (MI) method to handle missing data resulting from drop-outs and/or intermittently skipped visits in longitudinal clinical trials with binary responses. The estimation and inferential properties of the proposed method are contrasted via simulation with those of the commonly used complete-case (CC) and generalized estimating equations (GEE) methods. Three key results are noted. First, if data are missing completely at random, MI can be notably more efficient than the CC and GEE methods. Second, with small samples, GEE often fails due to convergence problems, but MI is free of that problem. Finally, if the data are missing at random, the CC and GEE methods yield results with moderate to large bias, while MI generally yields results with negligible bias. A numerical example with real data is provided for illustration.
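A sketch of a propensity-score-based imputation step of the general kind described: model the propensity of a response being missing, stratify on it, and fill missing binary responses by resampling observed responses within strata (an approximate-Bayesian-bootstrap flavour). This illustrates the device, not the paper's exact algorithm; the data and names are assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 1500
baseline = rng.normal(0, 1, n)
resp = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + baseline))))
miss = rng.random(n) < 1 / (1 + np.exp(-(baseline - 1)))   # MAR given baseline
df = pd.DataFrame({"baseline": baseline,
                   "resp": np.where(miss, np.nan, resp)})

# Propensity of the response being missing, from observed covariates.
ps_fit = sm.Logit(df["resp"].isna().astype(int),
                  sm.add_constant(df["baseline"])).fit(disp=0)
df["stratum"] = pd.qcut(ps_fit.predict(), 5, labels=False)  # propensity quintiles

imputations = []
for m in range(5):                                          # 5 imputed datasets
    comp = df["resp"].copy()
    for _, grp in df.groupby("stratum"):
        donors = grp["resp"].dropna().to_numpy()            # observed responses
        holes = grp.index[grp["resp"].isna()]
        comp.loc[holes] = rng.choice(donors, size=len(holes))
    imputations.append(comp)
print("imputed response rates:", [round(c.mean(), 3) for c in imputations])
```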

11.
When competing risks data arise, information on the actual cause of failure may be missing for some subjects. A cause-specific proportional hazards model together with multiple imputation (MI) methods has therefore been used to analyze such data. Modelling the cumulative incidence function is also of interest, and thus we investigate the proportional subdistribution hazards model (the Fine and Gray model) together with MI methods as a modelling approach for competing risks data with missing cause of failure. Possible strategies for analyzing such data include complete case analysis as well as an analysis in which the missing causes are classified as an additional failure type. These approaches, however, may produce misleading results in clinical settings. In the present work we investigate the bias of the parameter estimates when fitting the Fine and Gray model under the above approaches. We also apply the MI method and evaluate its comparative performance under various missing data scenarios. Results from simulation experiments showed substantial bias in the estimates when fitting the Fine and Gray model with naive techniques for missing data under a missing-at-random cause of failure. Compared with those techniques, the MI-based method gave estimates with much smaller bias and with coverage probabilities of 95 per cent confidence intervals closer to the nominal level. All three methods were also applied to real data, modelling time to AIDS or to death from non-AIDS causes in HIV-1-infected individuals.
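A sketch of the imputation step for a missing cause of failure: fit a model for "cause 2 versus cause 1" among failures with a known cause, then draw the missing causes from the predicted probabilities. Each completed dataset would then be analysed with a Fine and Gray fit (not shown; there is no standard Python implementation), and a full MI procedure would also propagate parameter uncertainty between imputations. All variables and rates below are assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 800
age = rng.normal(40, 10, n)
cause = 1 + rng.binomial(1, 1 / (1 + np.exp(-(age - 40) / 10)))  # cause 1 or 2
known = rng.random(n) > 0.25                                     # 25% missing
df = pd.DataFrame({"age": age, "cause": np.where(known, cause, np.nan)})

# Model P(cause = 2 | age) among failures with a known cause.
obs = df.dropna()
fit = sm.Logit((obs["cause"] == 2).astype(int),
               sm.add_constant(obs["age"])).fit(disp=0)

holes = df["cause"].isna()
p2 = fit.predict(sm.add_constant(df.loc[holes, "age"]))
completed = []
for m in range(10):                              # 10 imputed datasets
    comp = df.copy()
    comp.loc[holes, "cause"] = 1 + rng.binomial(1, p2)
    completed.append(comp)                       # -> Fine and Gray fit here
print(completed[0]["cause"].value_counts())
```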

12.
Medical research involving repeated measurements in patients is usually complicated by missing values. When values are missing, the choice is either to limit the analysis to the complete cases or to analyse all available data. Both methods may suffer from substantial bias and are valid only under the rather strong assumption that values are 'missing completely at random', i.e. missingness is related neither to the other measured data nor to unmeasured data. Two other statistical methods can be applied to deal with missing values: the likelihood approach and the multiple imputation method. These methods make efficient use of all available data and take into account the information implied by the available data. They are valid under the less stringent assumption of 'missing at random', i.e. missingness may be related to the other measured data, but not to unmeasured data. The best approach, however, is to ensure that no data are missing.

13.
During drug development, a key step is the identification of relevant covariates predicting between-subject variation in drug response. The full random effects model (FREM) is one of the full-covariate approaches used to identify relevant covariates in nonlinear mixed effects models. Here we explore the ability of FREM to handle missing covariate data (both missing completely at random (MCAR) and missing at random (MAR)) and compare it with the full fixed effects model (FFEM) approach, applied either with complete case analysis or with mean imputation. A global health dataset (20,421 children) was used to develop a FREM describing the change in height-for-age Z-score (HAZ) over time. Simulated datasets (n = 1000) were generated with variable rates of missing (MCAR) covariate data (0%-90%) and with different proportions of missing (MAR) data conditioned on either observed covariates or predicted HAZ. The three methods were used to re-estimate the model and were compared in terms of bias and precision. FREM showed only minor increases in bias and minor losses of precision at increasing percentages of missing (MCAR) covariate data and performed similarly in the MAR scenarios. Conversely, the FFEM approaches either collapsed at 70% missing (MCAR) covariate data (FFEM with complete case analysis) or showed large increases in bias and losses of precision (FFEM with mean imputation). Our results suggest that FREM is an appropriate approach to covariate modeling for datasets with missing (MCAR and MAR) covariate data, such as in global health studies.

14.
A case study is presented assessing the impact of missing data on the analysis of daily diary data from a study evaluating the effect of a drug for the treatment of insomnia. The primary analysis averaged daily diary values for each patient into a weekly variable. Following the commonly used approach, missing daily values within a week were ignored provided there was a minimum number of diary reports (i.e., at least 4). A longitudinal model was then fit with treatment, time, and patient-specific effects. A treatment effect at a pre-specified landmark time was obtained from the model. Weekly values following dropout were regarded as missing, but intermittent daily missing values were obscured. Graphical summaries and tables are presented to characterize the complex missing data patterns. We use multiple imputation for the daily diary data to create completed data sets so that exactly 7 daily diary values contribute to each weekly patient average. Standard analysis methods are then applied for landmark analysis of the completed data sets, and the resulting estimates are combined using the standard multiple imputation approach. The observed data are subject to digit heaping and patterned responses (e.g., identical values for several consecutive days), which makes accurate modeling of the response data difficult. Sensitivity analyses under different modeling assumptions for the data were performed, along with pattern mixture models assessing the sensitivity to the missing at random assumption. The emphasis is on graphical displays and computational methods that can be implemented with general-purpose software. Copyright © 2016 John Wiley & Sons, Ltd.
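A sketch of the completed-data idea: impute the missing daily values so that every weekly average is based on exactly 7 days rather than on whichever days happen to be reported. The wide layout, imputation model, and dimensions below are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(8)
n_pat, n_days = 120, 28                              # 4 weeks of daily diaries
base = rng.normal(6, 1.5, n_pat)[:, None]            # patient-level mean
daily = base + rng.normal(0, 1, (n_pat, n_days))
daily[rng.random(daily.shape) < 0.2] = np.nan        # 20% skipped days

cols = [f"day{d}" for d in range(n_days)]
wide = pd.DataFrame(daily, columns=cols)

weekly_by_imp = []
for m in range(5):                                   # 5 completed data sets
    imp = IterativeImputer(sample_posterior=True, random_state=m)
    comp = imp.fit_transform(wide)
    weekly = comp.reshape(n_pat, 4, 7).mean(axis=2)  # exactly 7 days per week
    weekly_by_imp.append(weekly)

# Combine the week-4 mean across imputations (point estimate only).
print("pooled week-4 mean:", np.mean([w[:, 3].mean() for w in weekly_by_imp]))
```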

15.
Incomplete and unbalanced multivariate data often arise in longitudinal studies due to missing or unequally timed repeated measurements and/or the presence of time-varying covariates. A general approach to analysing such data is maximum likelihood analysis using a linear model for the expected responses and structural models for the within-subject covariances. Two important advantages of this approach are: (1) the generality of the model allows the analyst to consider a wider range of models than was previously possible using classical methods developed for balanced and complete data, and (2) maximum likelihood estimates obtained from incomplete data are often preferable to other estimates, such as those obtained from complete cases, from the standpoint of bias and efficiency. A variety of applications of the model are discussed, including univariate and multivariate analysis of incomplete repeated measures data, analysis of growth curves with missing data using random effects and time-series models, and applications to unbalanced longitudinal data.

16.
Attrition threatens the internal validity of cohort studies. Epidemiologists use various imputation and weighting methods to limit bias due to attrition. However, the ability of these methods to correct for attrition bias has not been tested. We simulated a cohort of 300 subjects using 500 computer replications to determine whether regression imputation, individual weighting, or multiple imputation is useful for reducing attrition bias. We compared these results with those of a complete subject analysis. Our logistic regression model included a binary exposure and two confounders. We generated 10, 25, and 40% attrition through three missing data mechanisms: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR), and used four covariance matrices to vary attrition. We compared true and estimated mean odds ratios (ORs), standard deviations (SDs), and coverage. With data MCAR and MAR, for all attrition rates, the complete subject analysis produced results at least as valid as those from the imputation and weighting methods. With data MNAR, no method provided unbiased estimates of the OR at attrition rates of 25 or 40%. When observations are not MAR or MCAR, imputation and weighting methods may not effectively reduce attrition bias.
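A sketch of this kind of simulation design: a logistic model with a binary exposure and a confounder, attrition imposed under MCAR, MAR, and MNAR, and the complete-subject log odds ratio compared with the truth. The coefficients and attrition mechanisms below are illustrative assumptions, not the paper's exact design.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)

def one_cohort(mechanism, n=300):
    z = rng.normal(0, 1, n)                          # confounder
    x = rng.binomial(1, 1 / (1 + np.exp(-z)))        # exposure
    y = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + 0.7 * x + 0.5 * z))))
    if mechanism == "MCAR":
        keep = rng.random(n) > 0.25
    elif mechanism == "MAR":                         # depends on observed z only
        keep = rng.random(n) > 0.5 / (1 + np.exp(-z))
    else:                                            # MNAR: outcome and exposure
        keep = rng.random(n) > 0.1 + 0.4 * y * x
    X = sm.add_constant(np.column_stack([x, z])[keep])
    return sm.Logit(y[keep], X).fit(disp=0).params[1]  # log-OR for exposure

for mech in ("MCAR", "MAR", "MNAR"):
    est = np.mean([one_cohort(mech) for _ in range(500)])
    print(f"{mech}: mean complete-subject log-OR {est:.3f} (true 0.700)")
```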

17.
Missing data are common in longitudinal studies due to drop-out, loss to follow-up, and death. Likelihood-based mixed effects models for longitudinal data give valid estimates when the data are missing at random (MAR). This assumption, however, is not testable without further information. In some studies, additional information is available in the form of an auxiliary variable known to be correlated with the missing outcome of interest. Availability of such auxiliary information provides an opportunity to test the MAR assumption. If the MAR assumption is violated, this information can be utilized to reduce or eliminate bias when the missing data process depends on the unobserved outcome through the auxiliary information. We compare two methods of utilizing the auxiliary information: joint modeling of the outcome of interest and the auxiliary variable, and multiple imputation (MI). Simulation studies are performed to examine the two methods. The likelihood-based joint modeling approach is consistent and most efficient when correctly specified. However, mis-specification of the joint distribution can lead to biased results. MI is slightly less efficient than a correct joint modeling approach and can also be biased when the imputation model is mis-specified, though it is more robust to mis-specification of the imputation distribution when all the variables affecting the missing data mechanism and the missing outcome are included in the imputation model. An example is presented from a dementia screening study. Copyright © 2009 John Wiley & Sons, Ltd.
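A sketch of the MI route: include the auxiliary variable in the imputation model, so that missingness that depends on the outcome only through the auxiliary variable becomes MAR given that model. The simulated data and correlation strength are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(10)
n = 2000
outcome = rng.normal(0, 1, n)
aux = 0.8 * outcome + rng.normal(0, 0.6, n)          # auxiliary variable
miss = rng.random(n) < 1 / (1 + np.exp(-1.5 * aux))  # drop-out driven by aux
y_obs = np.where(miss, np.nan, outcome)

def mi_mean(df, M=10):
    """Multiple-imputation estimate of the mean of column 0."""
    ests = []
    for m in range(M):
        imp = IterativeImputer(sample_posterior=True, random_state=m)
        ests.append(imp.fit_transform(df)[:, 0].mean())
    return float(np.mean(ests))

with_aux = pd.DataFrame({"y": y_obs, "aux": aux})
without_aux = pd.DataFrame({"y": y_obs, "noise": rng.normal(0, 1, n)})
print(f"true mean 0.000 | complete cases {np.nanmean(y_obs):+.3f} | "
      f"MI with aux {mi_mean(with_aux):+.3f} | "
      f"MI without aux {mi_mean(without_aux):+.3f}")
```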

18.
Histologic and genetic markers can sometimes make it possible to refine a disease into subtypes. In a case-control study, subcategorizing a disease in this way can be important for elucidating its etiology if the subtypes tend to result from distinct causal pathways. Using subtyped case outcomes, one can carry out either a case-case analysis to investigate etiologic heterogeneity or a polytomous logistic regression to estimate subtype-specific odds ratios. Unfortunately, especially when such an analysis is undertaken after the study has been completed, it may be compromised by the unavailability of tissue specimens, resulting in missing subtype data for many enrolled cases. The authors propose that one can make fuller use of the available data, including that provided by cases with missing subtype, by using the expectation-maximization (EM) algorithm to estimate the risk parameters. For illustration, they apply the method to a study of non-Hodgkin's lymphoma in the midwestern United States. Simulations then demonstrate that, under assumptions likely to hold in many settings, the approach eliminates the bias that would arise if unclassified cases were ignored and also improves the precision of estimation. Under the same assumptions, empirical confidence interval coverage is consistent with the nominal 95%.
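A minimal sketch of the EM idea, assuming one exposure and two subtypes: the E-step gives each unclassified case fractional membership in the subtypes under the current model, and the M-step refits a polytomous logistic regression with those fractions as weights. The data-generating model and the use of scikit-learn's (regularized) multinomial logistic regression are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(11)
n = 3000
x = rng.normal(0, 1, n)
# Categories: 0 = control, 1 = subtype A (exposure-related), 2 = subtype B.
logits = np.column_stack([np.zeros(n), -1.5 + 1.0 * x, -1.5 + 0.0 * x])
p = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
y = np.array([rng.choice(3, p=row) for row in p])
unknown = (y > 0) & (rng.random(n) < 0.4)       # 40% of cases lack a subtype

Xk, yk = x[~unknown, None], y[~unknown]         # controls + classified cases
Xu = x[unknown, None]                           # unclassified cases

model = LogisticRegression().fit(Xk, yk)        # start from the classified data
for _ in range(20):                             # EM iterations
    pu = model.predict_proba(Xu)[:, 1:]         # P(subtype A), P(subtype B)
    w1 = pu[:, 0] / pu.sum(axis=1)              # E-step: P(A | case, x)
    X_all = np.vstack([Xk, Xu, Xu])             # each unclassified case twice
    y_all = np.concatenate([yk, np.ones(len(Xu)), np.full(len(Xu), 2)])
    w_all = np.concatenate([np.ones(len(Xk)), w1, 1 - w1])
    model = LogisticRegression().fit(X_all, y_all, sample_weight=w_all)  # M-step

# Exposure log-OR for subtype A vs control (true value 1.0).
print(round(model.coef_[1, 0] - model.coef_[0, 0], 3))
```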
