20 similar articles found
1.
We consider the estimation of the regression of an outcome Y on a covariate X, where X is unobserved, but a variable W that measures X with error is observed. A calibration sample that measures pairs of values of X and W is also available; we consider calibration samples where Y is measured (internal calibration) and not measured (external calibration). One common approach for measurement error correction is regression calibration (RC), which replaces the unknown values of X with predictions from the regression of X on W estimated from the calibration sample. An alternative approach is to multiply impute the missing values of X given Y and W based on an imputation model, and then use multiple imputation (MI) combining rules for inferences. Most current work assumes that the measurement error of W has constant variance, whereas in many situations the variance varies as a function of X. We consider extensions of the RC and MI methods that allow for heteroscedastic measurement error, and compare them by simulation. The MI method is shown to provide better inferences in this setting. We also illustrate the proposed methods using a data set from the BioCycle study.
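To make the RC substitution step concrete, here is a minimal Python sketch for an internal-calibration design. All variable names and the simulated data are illustrative rather than taken from the paper, and the naive standard errors from the final fit ignore the calibration-step uncertainty, which a bootstrap would propagate in practice.

```python
# Minimal regression calibration sketch (illustrative, simulated data).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Main study: X unobserved, W = X + error observed, Y depends on X.
n = 1000
x = rng.normal(0, 1, n)
w = x + rng.normal(0, 0.5, n)            # error-prone measurement
y = 1.0 + 2.0 * x + rng.normal(0, 1, n)

# Internal calibration subsample: both X and W observed.
cal = rng.choice(n, size=200, replace=False)

# Step 1: estimate E(X | W) from the calibration sample.
cal_fit = sm.OLS(x[cal], sm.add_constant(w[cal])).fit()

# Step 2: substitute predictions for the unknown X in the main study.
x_hat = cal_fit.predict(sm.add_constant(w))

# Step 3: regress Y on the predictions (bootstrap the whole procedure
# in practice to get valid standard errors).
rc_fit = sm.OLS(y, sm.add_constant(x_hat)).fit()
print(rc_fit.params)   # slope should land near the true value 2.0
```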
2.
Exposure measurement error is a problem in many epidemiological studies, including those using biomarkers and measures of dietary intake. Measurement error typically results in biased estimates of exposure-disease associations, the severity and nature of the bias depending on the form of the error. To correct for the effects of measurement error, information additional to the main study data is required. Ideally, this is a validation sample in which the true exposure is observed. However, in many situations it is not feasible to observe the true exposure, but one or more repeated exposure measurements may be available, for example blood pressure or dietary intake recorded at two time points. The aim of this paper is to provide a toolkit for measurement error correction using repeated measurements. We bring together methods covering classical measurement error and several departures from classical error: systematic, heteroscedastic and differential error. The correction methods considered are regression calibration, which is already widely used in the classical error setting, and moment reconstruction and multiple imputation, which are newer approaches with the ability to handle differential error. We emphasize practical application of the methods in nutritional epidemiology and other fields. We primarily consider continuous exposures in the exposure-outcome model, but we also outline methods for use when continuous exposures are categorized. The methods are illustrated using data from a study of the association between fibre intake and colorectal cancer, where fibre intake is measured using a diet diary and repeated measures are available for a subset. © 2014 The Authors. Statistics in Medicine Published by John Wiley & Sons, Ltd.
3.
In biomedical research, such as the development of vaccines for infectious diseases or cancer, study outcomes measured by an assay or device are often collected from multiple sources or laboratories. Measurement error that may vary between laboratories needs to be adjusted for when combining samples across data sources. We incorporate such adjustment in the main study by comparing and combining independent samples from different laboratories via integration of external data, collected on paired samples from the same two laboratories. We propose the following: (i) normalization of individual-level data from the two laboratories to the same scale via the expectation of the true measurements conditional on the observed values; (ii) comparison of mean assay values between two independent samples in the main study, accounting for inter-source measurement error; and (iii) sample size calculations for the paired-sample study so that hypothesis testing error rates are appropriately controlled in the main study comparison. Because the goal is not to estimate the true underlying measurements but to combine data on the same scale, our proposed methods do not require that the true values of the error-prone measurements be known in the external data. Simulation results under a variety of scenarios demonstrate satisfactory finite sample performance of our proposed methods when measurement errors vary. We illustrate our methods using real enzyme-linked immunosorbent spot assay data generated by two HIV vaccine laboratories. Copyright © 2012 John Wiley & Sons, Ltd.
4.
Regression calibration (RC) is a popular method for estimating regression coefficients when one or more continuous explanatory variables, X, are measured with error. In this method, the mismeasured covariate, W, is replaced by the expectation E(X|W), based on the assumption that the error in the measurement of X is non-differential. Using simulations, we compare three versions of RC with two other 'substitution' methods, moment reconstruction (MR) and imputation (IM), neither of which relies on the non-differential error assumption. We investigate studies that have an internal calibration sub-study. For RC, we consider (i) the usual version of RC, (ii) RC applied only to the 'marker' information in the calibration study, and (iii) an 'efficient' version (ERC) in which the estimators (i) and (ii) are combined. Our results show that ERC is preferable when there is non-differential measurement error. Under this condition, there are cases where ERC is less efficient than MR or IM, but these rarely occur in epidemiology. We show that the efficiency gain of usual RC and ERC over the other methods can sometimes be dramatic. The usual version of RC carries similar efficiency gains to ERC over MR and IM, but becomes unstable as measurement error becomes large, leading to bias and poor precision. When the measurement error is differential, MR and IM have considerably less bias than RC, but can have much larger variance. We demonstrate our findings with an analysis of dietary fat intake and mortality in a large cohort study.
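For readers unfamiliar with moment reconstruction, a rough sketch of the idea for a binary outcome follows: W is replaced by a value that shares the first two moments of X given Y, with the moments estimated from an internal calibration subsample. All names and the simulated data are illustrative, not from the paper.

```python
# Moment reconstruction (MR) sketch with an internal calibration subset.
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(0, 1, n)                       # true covariate
w = x + rng.normal(0, 0.6, n)                 # error-prone measurement
y = rng.binomial(1, 1 / (1 + np.exp(-x)))     # binary outcome

cal = rng.choice(n, size=400, replace=False)  # calibration subset: X observed

x_mr = np.empty(n)
for g in (0, 1):                              # within each outcome level
    main = y == g
    sub = y[cal] == g
    mean_x, var_x = x[cal][sub].mean(), x[cal][sub].var()
    mean_w, var_w = w[cal][sub].mean(), w[cal][sub].var()
    # give W the same first two conditional moments as X given Y = g
    x_mr[main] = mean_x + (w[main] - mean_w) * np.sqrt(var_x / var_w)

# x_mr is then substituted for X in the outcome regression.
```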
5.
Multiple imputation (MI) is one of the most popular methods for dealing with missing data, and its use has been increasing rapidly in medical studies. Although MI is appealing in practice because ordinary complete-data methods can be applied once the missing values are imputed, the choice of imputation model remains problematic. If the missing values are imputed from a parametric model, the validity of the imputation is not ensured, and the final estimate of a parameter of interest can be biased unless the parametric model is correctly specified. Nonparametric methods have also been proposed for MI, but it is not straightforward to generate imputation values from nonparametrically estimated distributions. In this paper, we propose a new MI method that yields a consistent (asymptotically unbiased) final estimate even if the imputation model is misspecified. The key idea is to use an imputation model from which imputation values are easily produced, and then to make a proper correction in the likelihood function after imputation by weighting with the density ratio between the imputation model and the true conditional density of the missing variable. Although the conditional density must be estimated nonparametrically, it is not used for the imputation itself. The performance of our method is evaluated both theoretically and by simulation studies. A real data analysis using the Duke Cardiac Catheterization Coronary Artery Disease Diagnostic Dataset is also presented to illustrate the method.
6.
John Buonaccorsi, Agnieszka Prochenka, Magne Thoresen, Rafal Ploski. Statistics in Medicine 2016, 35(22):3987-4007
Motivated by a genetic application, this paper addresses the problem of fitting regression models when the predictor is a proportion measured with error. While the problem of dealing with additive measurement error in fitting regression models has been extensively studied, the problem where the additive error is of a binomial nature has not been addressed. The measurement errors here are heteroscedastic for two reasons: dependence on the underlying true value, and sampling effort that changes across observations. While some of the previously developed methods for treating additive measurement error with heteroscedasticity can be used in this setting, other methods need modification. A new version of simulation extrapolation is developed, and we also explore a variation on the standard regression calibration method that uses a beta-binomial model, based on the fact that the true value is a proportion. Although most of the methods introduced here can be used for fitting non-linear models, this paper focuses primarily on their use in fitting a linear model. While previous work has focused mainly on estimation of the coefficients, we also, motivated by our example, examine estimation of the variance around the regression line. In addressing these problems, we discuss the appropriate manner in which to bootstrap for both inference and bias assessment. The various methods are compared via simulation, and the results are illustrated using our motivating data, for which the goal is to relate the methylation rate of a blood sample to the age of the individual providing the sample. Copyright © 2016 John Wiley & Sons, Ltd.
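The following is a hedged sketch of the generic SIMEX recipe adapted to this setting, with the per-observation error variance estimated from the binomial structure. It conveys the standard simulation-extrapolation idea only, not the modified version developed in the paper; all names and the simulated data are illustrative.

```python
# Generic SIMEX sketch for a proportion measured with binomial error.
import numpy as np

rng = np.random.default_rng(2)
m = 500
x = rng.uniform(0.1, 0.9, m)                 # true proportions (e.g. methylation)
n_reads = rng.integers(20, 60, m)            # sampling effort per observation
w = rng.binomial(n_reads, x) / n_reads       # observed proportions
age = 30 + 40 * x + rng.normal(0, 3, m)      # outcome

tau2 = w * (1 - w) / n_reads                 # estimated per-observation error variance
lambdas = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
B = 200

slopes = []
for lam in lambdas:
    # add extra heteroscedastic noise scaled by lambda, refit, average
    bs = [np.polyfit(w + rng.normal(0, np.sqrt(lam * tau2)), age, 1)[0]
          for _ in range(B)]
    slopes.append(np.mean(bs))

# extrapolate the slope back to lambda = -1 (no measurement error)
simex_slope = np.polyval(np.polyfit(lambdas, slopes, 2), -1.0)
print(simex_slope)                           # should move toward the true 40
```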
7.
It is well known that measurement error in the covariates of regression models generally causes bias in parameter estimates. Correction for such biases requires information concerning the measurement error, which is often in the form of internal validation or replication data. Regression calibration (RC) is a popular approach to correct for covariate measurement error, which involves predicting the true covariate using error-prone measurements. Likelihood methods have previously been proposed as an alternative approach to estimate the parameters in models affected by measurement error, but they have been employed relatively infrequently in medical statistics and epidemiology, partly because of computational complexity and concerns regarding robustness to distributional assumptions. We show how a standard random-intercepts model can be used to obtain maximum likelihood (ML) estimates when the outcome model is linear or logistic regression, certain normality assumptions hold, and internal error-prone replicate measurements are available. Through simulations we show that for linear regression, ML gives more efficient estimates than RC, although the gain is typically small. Furthermore, we show that RC and ML estimates remain consistent even when the normality assumptions are violated. For logistic regression, our implementation of ML is consistent if the true covariate is conditionally normal given the outcome, in contrast to RC. In simulations, this ML estimator showed less bias than RC in situations where RC gives non-negligible biases. Our proposal makes the ML approach to dealing with covariate measurement error more accessible to researchers, which we hope will improve its viability as a useful alternative to methods such as RC. Copyright © 2009 John Wiley & Sons, Ltd.
8.
Considerations for analysis of time-to-event outcomes measured with error: Bias and correction with SIMEX
Eric J. Oh, Bryan E. Shepherd, Thomas Lumley, Pamela A. Shaw. Statistics in Medicine 2018, 37(8):1276-1289
For time-to-event outcomes, a rich literature exists on the bias introduced by covariate measurement error in regression models, such as the Cox model, and on methods of analysis to address this bias. By comparison, less attention has been given to understanding and addressing errors in the failure time outcome. For many diseases, the timing of an event of interest (such as progression-free survival or time to AIDS progression) can be difficult to assess or reliant on self-report and is therefore prone to measurement error. For linear models, it is well known that random errors in the outcome variable do not bias regression estimates. With nonlinear models, however, even random error or misclassification can introduce bias into estimated parameters. We compare the performance of two common regression models, the Cox and Weibull models, in the setting of measurement error in the failure time outcome. We introduce an extension of the SIMEX method to correct for bias in hazard ratio estimates from the Cox model and discuss other analysis options to address measurement error in the response. A formula to estimate the bias induced in the hazard ratio by classical measurement error in the event time for a log-linear survival model is presented. Detailed numerical studies examine the performance of the proposed SIMEX method under varying levels and parametric forms of the error in the outcome. We further illustrate the method with observational data on HIV outcomes from the Vanderbilt Comprehensive Care Clinic.
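A hedged sketch of the SIMEX mechanics applied to classical multiplicative error in the event time follows (uncensored data, quadratic extrapolant, assumed known error variance). It illustrates the general idea only, not the paper's full treatment, which also handles censoring and other error forms; all names and data are illustrative.

```python
# SIMEX sketch for classical error in the log event time, Cox model.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(7)
n = 1000
z = rng.binomial(1, 0.5, n).astype(float)           # covariate of interest
t_true = rng.exponential(np.exp(-0.7 * z))          # true times, log HR = 0.7
sigma_u = 0.3                                       # assumed error SD (log scale)
t_obs = t_true * np.exp(rng.normal(0, sigma_u, n))  # error-prone observed times
event = np.ones(n)                                  # no censoring, for brevity

lambdas = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
B = 100
loghr = []
for lam in lambdas:
    est = []
    for _ in range(B):
        # add extra multiplicative noise scaled by lambda and refit
        t_lam = t_obs * np.exp(rng.normal(0, np.sqrt(lam) * sigma_u, n))
        df = pd.DataFrame({"T": t_lam, "E": event, "z": z})
        est.append(CoxPHFitter().fit(df, "T", "E").params_["z"])
    loghr.append(np.mean(est))

# extrapolate the log hazard ratio back to lambda = -1 (no error)
print(np.polyval(np.polyfit(lambdas, loghr, 2), -1.0))
```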
9.
A. Guolo. Statistics in Medicine 2014, 33(12):2062-2076
This paper investigates the use of SIMEX, a simulation-based measurement error correction technique, for meta-analysis of studies involving the baseline risk of subjects in the control group as an explanatory variable. The approach accounts for the measurement error affecting the information about either the outcome in the treatment group or the baseline risk available from each study, while requiring no assumption about the distribution of the true unobserved baseline risk. This robustness property, together with its computational feasibility, makes SIMEX very attractive. The approach is suggested as an alternative to the usual likelihood analysis, which can provide misleading inferential results when the commonly assumed normal distribution for the baseline risk is violated. The performance of SIMEX is compared with the likelihood method and the moment-based correction through an extensive simulation study and the analysis of two datasets from the medical literature. Copyright © 2013 John Wiley & Sons, Ltd.
10.
Evaluating model-based imputation methods for missing covariates in regression models with interactions
Imputation strategies are widely used in settings that involve inference with incomplete data. However, implementation of a particular approach always rests on assumptions, and subtle distinctions between methods can have an impact on subsequent analyses. In this research article, we are concerned with regression models in which the true underlying relationship includes interaction terms. We focus in particular on a linear model with one fully observed continuous predictor, a second partially observed continuous predictor, and their interaction. We derive the conditional distribution of the missing covariate and interaction term given the observed covariate and the outcome variable, and examine the performance of a multiple imputation procedure based on this distribution. We also investigate several alternative procedures that can be implemented by adapting multivariate normal multiple imputation software in ways that might be expected to perform well despite incompatibilities between model assumptions and true underlying relationships among the variables. The methods are compared in terms of bias, coverage, and confidence interval width. As expected, the procedure based on the correct conditional distribution performs well across all scenarios. Just as importantly for general practitioners, several of the approaches based on multivariate normality perform comparably with the correct conditional distribution in a number of circumstances, although interestingly, procedures that seek to preserve the multiplicative relationship between the interaction term and the main effects are found to be substantially less reliable. For illustration, the various procedures are applied to an analysis of post-traumatic stress disorder symptoms in a study of childhood trauma. Copyright © 2015 John Wiley & Sons, Ltd.
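One of the adapted strategies this abstract alludes to treats the interaction as "just another variable" to be imputed, rather than recomputing it from the imputed covariate. Below is a hedged sketch of that strategy, using scikit-learn's IterativeImputer as a stand-in for multivariate normal MI software; all names and the simulated data are illustrative, and Rubin's-rules pooling of the variances is omitted for brevity.

```python
# "Just another variable" imputation sketch for a model with interaction.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(3)
n = 1000
x1 = rng.normal(size=n)                   # fully observed predictor
x2 = 0.5 * x1 + rng.normal(size=n)        # partially observed predictor
y = 1 + x1 + x2 + 0.5 * x1 * x2 + rng.normal(size=n)

x2_obs = x2.copy()
x2_obs[rng.random(n) < 0.3] = np.nan      # 30% missing at random

inter = x1 * x2_obs                       # interaction, missing where x2 is
data = np.column_stack([y, x1, x2_obs, inter])

# M imputations; each completed data set is analysed, then the
# interaction-coefficient estimates are averaged (full Rubin's rules
# would also pool the variances).
M = 5
betas = []
for m in range(M):
    imp = IterativeImputer(sample_posterior=True, random_state=m)
    yc, x1c, x2c, intc = imp.fit_transform(data).T
    X = np.column_stack([np.ones(n), x1c, x2c, intc])
    betas.append(np.linalg.lstsq(X, yc, rcond=None)[0][3])
print(np.mean(betas))                     # interaction coefficient, true 0.5
```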
11.
Budtz-Jørgensen E, Keiding N, Grandjean P, Weihe P, White RF. Statistics in Medicine 2003, 22(19):3089-3100
Non-differential measurement error in the exposure variable is known to attenuate the dose-response relationship. The amount of attenuation introduced in a given situation is a function not only of the precision of the exposure measurement but also of the conditional variance of the true exposure given the other independent variables. In addition, confounder effects may also be affected by the exposure measurement error. These difficulties in statistical model development are illustrated by examples from an epidemiological study performed in the Faroe Islands to investigate the adverse health effects of prenatal mercury exposure.
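To make the attenuation mechanism explicit: in the classical setting with W = X + U, error U independent of the true exposure X, the other covariates Z, and the outcome, and a linear outcome model, the naive regression of the outcome on (W, Z) estimates lambda * beta_X rather than beta_X, where the standard regression-dilution factor is

    lambda = Var(X | Z) / (Var(X | Z) + Var(U)).

This textbook formula captures the abstract's point: attenuation worsens both as measurement precision falls (Var(U) grows) and as the other covariates explain more of the variation in the true exposure (Var(X | Z) shrinks).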
12.
Methods to adjust for misclassification in the quantiles for the generalized linear model with measurement error in continuous exposures
Ching-Yun Wang, Jean De Dieu Tapsoba, Catherine Duggan, Kristin L. Campbell, Anne McTiernan. Statistics in Medicine 2016, 35(10):1676-1688
In many biomedical studies, covariates of interest may be measured with error. Frequently, however, quantiles of the exposure variable are used as the covariates in a regression analysis. Measurement error in the continuous exposure can then cause misclassification in its quantiles, and such misclassification can lead to biased estimation of the association between the exposure and the outcome. Adjusting for this misclassification is challenging when gold standard measurements are not available. In this paper, we develop two regression calibration estimators to reduce the bias in effect estimation. The first estimator is based on a normal likelihood; the second is based on linearization and provides a simple, practical correction. Finite sample performance is examined via a simulation study. We apply the methods to a four-arm randomized clinical trial that tested exercise and weight loss interventions in women aged 50-75 years. Copyright © 2015 John Wiley & Sons, Ltd.
13.
Mediation analysis is a popular approach to examine the extent to which the effect of an exposure on an outcome operates through an intermediate variable (mediator) and the extent to which the effect is direct. When the mediator is mismeasured, the validity of mediation analysis can be severely undermined. In this paper, we first study the bias of classical, non-differential measurement error in a continuous mediator on the estimation of direct and indirect causal effects in generalized linear models, when the outcome is either continuous or discrete and exposure-mediator interaction may be present. Our theoretical results as well as a numerical study demonstrate that, in the presence of non-linearities, the bias of naive estimators for direct and indirect effects that ignore measurement error can take unintuitive directions. We then develop methods to correct for the measurement error. Three correction approaches, based on the method of moments, regression calibration, and SIMEX, are compared. We apply the proposed methods to the Massachusetts General Hospital lung cancer study to evaluate the effect of genetic variants mediated through smoking on lung cancer risk. Copyright © 2014 John Wiley & Sons, Ltd.
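As one concrete possibility in the simplest case, the sketch below applies a regression calibration correction to a mismeasured mediator (classical error with assumed known variance, continuous outcome, no exposure-mediator interaction) before the usual product-of-coefficients decomposition. It is illustrative only and is not the paper's estimator; all names and data are hypothetical.

```python
# RC-corrected mediation sketch (known error variance, linear models).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 2000
a = rng.binomial(1, 0.5, n).astype(float)      # binary exposure
m_true = 0.8 * a + rng.normal(0, 1, n)         # true mediator
w = m_true + rng.normal(0, 0.7, n)             # mediator with classical error
y = 0.3 * a + 0.6 * m_true + rng.normal(0, 1, n)

# RC step: within each exposure group, shrink w toward its group mean by
# the reliability ratio, using the assumed error variance 0.7**2.
var_u = 0.7 ** 2
m_cal = w.copy()
for g in (0.0, 1.0):
    idx = a == g
    mu, var_w = w[idx].mean(), w[idx].var()
    m_cal[idx] = mu + ((var_w - var_u) / var_w) * (w[idx] - mu)

med_fit = sm.OLS(m_cal, sm.add_constant(a)).fit()            # mediator model
out_fit = sm.OLS(y, np.column_stack([np.ones(n), a, m_cal])).fit()
nde = out_fit.params[1]                                      # direct effect
nie = med_fit.params[1] * out_fit.params[2]                  # indirect effect
print(nde, nie)        # should be near 0.3 and 0.8 * 0.6 = 0.48
```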
14.
Fibrinogen Studies Collaboration. Statistics in Medicine 2009, 28(7):1067-1092
Within-person variability in measured values of multiple risk factors can bias their associations with disease. The multivariate regression calibration (RC) approach can correct for such measurement error and has been applied to studies in which true values or independent repeat measurements of the risk factors are observed on a subsample. We extend the multivariate RC techniques to a meta-analysis framework where multiple studies provide independent repeat measurements and information on disease outcome. We consider the cases where some or all studies have repeat measurements, and compare study-specific, averaged and empirical Bayes estimates of RC parameters. Additionally, we allow for binary covariates (e.g. smoking status) and for uncertainty and time trends in the measurement error corrections. Our methods are illustrated using a subset of individual participant data from prospective long-term studies in the Fibrinogen Studies Collaboration to assess the relationship between usual levels of plasma fibrinogen and the risk of coronary heart disease, allowing for measurement error in plasma fibrinogen and several confounders. Copyright © 2009 John Wiley & Sons, Ltd.
15.
Among several semiparametric models, the Cox proportional hazards model is widely used to assess the association between covariates and the time-to-event when the observed time-to-event is interval-censored. Often, covariates are measured with error, and flexible approaches have been proposed to handle this covariate uncertainty in the Cox proportional hazards model with interval-censored data. To fill a gap and broaden the range of models available for such data, this paper proposes a general approach for fitting the semiparametric linear transformation model to interval-censored data when a covariate is measured with error. The semiparametric linear transformation model is a broad class of models that includes the proportional hazards model and the proportional odds model as special cases. The proposed method relies on a set of estimating equations to estimate the regression parameters and the infinite-dimensional parameter. A flexible imputation technique is used to handle interval censoring and covariate measurement error. Finite sample performance of the proposed method is assessed via simulation studies. Finally, the suggested method is applied to a real data set from an AIDS clinical trial.
16.
Outliers, measurement error, and missing data are commonly seen in longitudinal data because of the way such data are collected. However, no existing method addresses all three of these issues simultaneously. This paper focuses on robust estimation of partially linear models for longitudinal data with dropouts and measurement error. A new robust estimating equation, simultaneously tackling outliers, measurement error, and missingness, is proposed. The asymptotic properties of the proposed estimator are established under some regularity conditions. The method is easy to implement in practice using existing standard generalized estimating equations algorithms. Comprehensive simulation studies show the strength of the proposed method in dealing with longitudinal data exhibiting all three features. Finally, the method is applied to data from the Lifestyle Education for Activity and Nutrition study and confirms the effectiveness of the intervention in producing weight loss at month 9. Copyright © 2016 John Wiley & Sons, Ltd.
17.
A measurement error model proposed previously allows for correlations between subject-specific biases and between random within-subject errors in the surrogates obtained from two modes of measurement. However, most of these model parameters are not identifiable from the standard validation study design, including, importantly, the attenuation factor needed to correct for bias in relative risk estimates due to measurement error. We propose validation study designs that permit estimation and inference for the attenuation factor and other parameters of interest when these correlations are present. We use an estimating equations framework to develop semi-parametric estimators for these parameters, exploiting instrumental variables techniques. The methods are illustrated through application to data from the Nurses' Health Study and Health Professionals' Follow-up Study, and comparisons are made to more restrictive models.
18.
Sandra M. Mohammed, Lorien S. Dalrymple, Damla Şentürk, Danh V. Nguyen. Statistics in Medicine 2013, 32(5):772-786
The case series model allows for estimation of the relative incidence of events, such as cardiovascular events, within a pre-specified time window after an exposure, such as an infection. The method requires only cases (individuals with events) and implicitly controls for all fixed, time-invariant confounders. The measurement error case series model extends the original case series model to handle imperfect data, where the timing of an infection (exposure) is not known precisely. In this work, we propose a method for power/sample size determination for the measurement error case series model. Extensive simulation studies are used to assess the accuracy of the proposed sample size formulas. We also examine the magnitude of the relative loss of power due to exposure onset measurement error, compared with the ideal situation where the time of exposure is measured precisely. To facilitate the design of case series studies, we provide publicly available web-based tools for determining power/sample size for both the measurement error case series model and the standard case series model. Copyright © 2012 John Wiley & Sons, Ltd.
19.
Lipidomics is an emerging field of science that holds the potential to provide a readout of biomarkers for early detection of disease. Our objective was to identify an efficient statistical methodology for lipidomics, especially for finding interpretable and predictive biomarkers useful in clinical practice. In two case studies, we address the need for data preprocessing for regression modeling of a binary response. These steps are a normalization step, to remove experimental variability, and a multiple imputation step, to make full use of the incompletely observed data with potentially informative missingness. Finally, by cross-validation, we compare stepwise variable selection to penalized regression models on stacked multiply imputed data sets and propose the use of a permutation test as a global test of association. Our results show that, depending on the design of the study, these preprocessing methods modestly improve the precision of classification, and no clear winner among the variable selection methods is found. Lipidomics profiles are found to be highly important predictors in both case studies. Copyright © 2014 John Wiley & Sons, Ltd.
20.
Probabilistic record linkage techniques assign match weights to one or more potential matches for those individual records that cannot be assigned 'unequivocal matches' across data files. Existing methods select the single record having the maximum weight, provided that this weight is higher than an assigned threshold. We argue that this procedure, which ignores all information from matches with lower weights and assigns no match to some individuals, is inefficient and may also lead to biases in subsequent analysis of the linked data. We propose that a multiple imputation framework be utilised for records that cannot be matched unequivocally. In this way, the information from all potential matches is carried through to the analysis stage. This procedure allows matching uncertainty to be propagated through a full modelling process that preserves the data structure. For purposes of statistical modelling, results from a simulation example suggest that full probabilistic record linkage is unnecessary and that standard multiple imputation will provide unbiased and efficient parameter estimates. Copyright © 2012 John Wiley & Sons, Ltd.
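A minimal sketch of the proposal follows, with illustrative data structures: rather than keeping only the highest-weight candidate, each imputation draws a candidate match with probability proportional to its weight, and the completed data sets are analysed and pooled so that linkage uncertainty reaches the final standard errors.

```python
# MI over uncertain record-linkage matches (illustrative sketch).
import numpy as np

rng = np.random.default_rng(6)

# For each unresolved record: candidate covariate values and match weights.
candidates = [
    (np.array([12.1, 13.4]),     np.array([0.7, 0.3])),
    (np.array([8.9, 9.5, 10.2]), np.array([0.5, 0.3, 0.2])),
]

M = 20                           # number of imputed data sets
imputations = []
for m in range(M):
    # draw one candidate per record, proportional to its match weight
    completed = np.array([rng.choice(vals, p=wts / wts.sum())
                          for vals, wts in candidates])
    imputations.append(completed)

# Each completed data set would be analysed with the substantive model
# and the M estimates combined with Rubin's rules.
```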