Similar Articles
20 similar articles found.
1.
Regression calibration (RC) is a popular method for estimating regression coefficients when one or more continuous explanatory variables, X, are measured with error. In this method, the mismeasured covariate, W, is substituted by the expectation E(X|W), based on the assumption that the error in the measurement of X is non-differential. Using simulations, we compare three versions of RC with two other 'substitution' methods, moment reconstruction (MR) and imputation (IM), neither of which relies on the non-differential error assumption. We investigate studies that have an internal calibration sub-study. For RC, we consider (i) the usual version of RC, (ii) RC applied only to the 'marker' information in the calibration study, and (iii) an 'efficient' version (ERC) in which estimators (i) and (ii) are combined. Our results show that ERC is preferable when there is non-differential measurement error. Under this condition there are cases where ERC is less efficient than MR or IM, but they rarely occur in epidemiology. We show that the efficiency gain of usual RC and ERC over the other methods can sometimes be dramatic. The usual version of RC carries similar efficiency gains to ERC over MR and IM, but becomes unstable as measurement error grows, leading to bias and poor precision. When differential measurement error does pertain, MR and IM have considerably less bias than RC, but can have much larger variance. We demonstrate our findings with an analysis of dietary fat intake and mortality in a large cohort study.
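To make the substitution concrete, here is a minimal Python sketch of the usual RC estimator under an internal calibration sub-study, with simulated data and classical non-differential error; all names and parameter values are illustrative, not taken from the paper.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Simulated data: X is the true covariate, W = X + U is its error-prone
# measurement, and Y depends on X only (non-differential error).
n, n_cal = 2000, 300                       # main study and calibration sub-study sizes
X = rng.normal(0.0, 1.0, n)
W = X + rng.normal(0.0, 0.8, n)            # classical measurement error
Y = 1.0 + 0.5 * X + rng.normal(0.0, 1.0, n)
cal = rng.choice(n, n_cal, replace=False)  # subjects with X observed (internal calibration)

# Step 1: estimate the calibration model E(X | W) in the sub-study.
cal_fit = sm.OLS(X[cal], sm.add_constant(W[cal])).fit()

# Step 2: substitute E(X | W) for W everywhere and fit the outcome model.
X_hat = cal_fit.predict(sm.add_constant(W))
rc_fit = sm.OLS(Y, sm.add_constant(X_hat)).fit()

naive_fit = sm.OLS(Y, sm.add_constant(W)).fit()
print("naive slope:", naive_fit.params[1])  # attenuated toward 0
print("RC slope:   ", rc_fit.params[1])     # approximately 0.5
```

Note that the second-stage standard errors ignore the uncertainty in the estimated calibration model; in practice they would be corrected, for example by the bootstrap or a sandwich estimator.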

2.
Problems common to many longitudinal HIV/AIDS, cancer, vaccine, and environmental exposure studies are the presence of a lower limit of quantification of a skewed outcome and time-varying covariates measured with error. Relatively little published work deals with these features of longitudinal data simultaneously. In particular, left-censored data falling below a limit of detection may sometimes have a larger proportion than expected under the usually assumed log-normal distribution. In such cases, alternative models that can account for a high proportion of censored data should be considered. In this article, we present an extension of the Tobit model that incorporates a mixture of true undetectable observations and values from a skew-normal distribution for an outcome subject to left censoring and skewness, together with covariates measured with substantial error. To quantify the covariate process, we offer a flexible nonparametric mixed-effects model within the Tobit framework. A Bayesian modeling approach is used to assess the simultaneous impact of left censoring, skewness, and measurement error in covariates on inference. The proposed methods are illustrated using real data from an AIDS clinical study. Copyright © 2013 John Wiley & Sons, Ltd.
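For background, the following sketch fits a classical left-censored Tobit model by maximum likelihood. It is a simplified stand-in for the paper's skew-normal mixture extension, with simulated data and hypothetical names.

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(1)

# Simulated left-censored outcome: values below the limit of detection (LOD)
# are recorded at the LOD, with a censoring indicator.
n, lod = 500, -0.5
x = rng.normal(size=n)
y_star = 0.3 + 0.8 * x + rng.normal(scale=0.7, size=n)  # latent outcome
censored = y_star < lod
y = np.where(censored, lod, y_star)

def neg_loglik(theta):
    """Classical Tobit: Gaussian density above the LOD, Gaussian CDF mass below it."""
    b0, b1, log_sigma = theta
    sigma = np.exp(log_sigma)
    mu = b0 + b1 * x
    ll_obs = stats.norm.logpdf(y[~censored], mu[~censored], sigma)
    ll_cens = stats.norm.logcdf((lod - mu[censored]) / sigma)
    return -(ll_obs.sum() + ll_cens.sum())

res = optimize.minimize(neg_loglik, x0=np.zeros(3), method="BFGS")
print("estimates (b0, b1, sigma):", res.x[0], res.x[1], np.exp(res.x[2]))
```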

3.
Outliers, measurement error, and missing data are commonly seen in longitudinal data because of the data collection process; however, no existing method addresses all three of these issues simultaneously. This paper focuses on the robust estimation of partially linear models for longitudinal data with dropouts and measurement error. A new robust estimating equation, simultaneously tackling outliers, measurement error, and missingness, is proposed. The asymptotic properties of the proposed estimator are established under some regularity conditions. The proposed method is easy to implement in practice by utilizing existing standard generalized estimating equations algorithms. Comprehensive simulation studies show the strength of the proposed method in dealing with longitudinal data exhibiting all three features. Finally, the proposed method is applied to data from the Lifestyle Education for Activity and Nutrition study and confirms the effectiveness of the intervention in producing weight loss at month 9. Copyright © 2016 John Wiley & Sons, Ltd.

4.
It is well known that measurement error in the covariates of regression models generally causes bias in parameter estimates. Correction for such biases requires information concerning the measurement error, which is often in the form of internal validation or replication data. Regression calibration (RC) is a popular approach to correcting for covariate measurement error, which involves predicting the true covariate using error-prone measurements. Likelihood methods have previously been proposed as an alternative approach to estimating the parameters in models affected by measurement error, but have been employed relatively infrequently in medical statistics and epidemiology, partly because of computational complexity and concerns regarding robustness to distributional assumptions. We show how a standard random-intercepts model can be used to obtain maximum likelihood (ML) estimates, under certain normality assumptions, when the outcome model is linear or logistic regression and internal error-prone replicate measurements are available. Through simulations we show that for linear regression, ML gives more efficient estimates than RC, although the gain is typically small. Furthermore, we show that RC and ML estimates remain consistent even when the normality assumptions are violated. For logistic regression, our implementation of ML is consistent if the true covariate is conditionally normal given the outcome, in contrast to RC. In simulations, this ML estimator showed less bias in situations where RC gives non-negligible biases. Our proposal makes the ML approach to dealing with covariate measurement error more accessible to researchers, which we hope will improve its viability as a useful alternative to methods such as RC. Copyright © 2009 John Wiley & Sons, Ltd.
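The following sketch, with simulated replicates and illustrative names, shows how a random-intercepts model separates the between-subject variance (the variance of the true covariate) from the within-subject variance (the measurement-error variance) and yields the shrinkage prediction E(X|W̄) that RC would substitute into the outcome model. The paper's full ML approach additionally maximizes jointly over the outcome model, which is not reproduced here.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)

# Each subject has k error-prone replicates W_ij = X_i + e_ij.
n, k = 400, 2
X = rng.normal(1.0, 1.0, n)
W = X[:, None] + rng.normal(0.0, 0.6, (n, k))

long = pd.DataFrame({
    "subject": np.repeat(np.arange(n), k),
    "W": W.ravel(),
})

# Random-intercepts model: the between-subject variance estimates var(X),
# the residual variance estimates the measurement-error variance.
mm = smf.mixedlm("W ~ 1", long, groups=long["subject"]).fit()
var_x = float(mm.cov_re.iloc[0, 0])   # between-subject variance ~ var(X)
var_e = mm.scale                      # within-subject variance ~ var(e)

# Shrinkage prediction of X_i from the replicate mean (the BLUP), which
# regression calibration would substitute into the outcome model.
lam = var_x / (var_x + var_e / k)
x_blup = mm.params["Intercept"] + lam * (W.mean(axis=1) - mm.params["Intercept"])
print("var(X):", var_x, "var(e):", var_e, "shrinkage:", lam)
```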

5.
Guo Y, Little RJ. Statistics in Medicine 2011;30(18):2278–2294.
We consider the estimation of the regression of an outcome Y on a covariate X, where X is unobserved but a variable W that measures X with error is observed. A calibration sample that measures pairs of values of X and W is also available; we consider calibration samples where Y is measured (internal calibration) and not measured (external calibration). One common approach to measurement error correction is regression calibration (RC), which substitutes the unknown values of X with predictions from the regression of X on W estimated from the calibration sample. An alternative approach is to multiply impute the missing values of X given Y and W based on an imputation model, and then use multiple imputation (MI) combining rules for inferences. Most current work assumes that the measurement error of W has a constant variance, whereas in many situations the variance varies as a function of X. We consider extensions of the RC and MI methods that allow for heteroscedastic measurement error, and compare them by simulation. The MI method is shown to provide better inferences in this setting. We also illustrate the proposed methods using a data set from the BioCycle study.
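A minimal sketch of the MI approach under internal calibration follows, assuming for simplicity a homoscedastic imputation model (the paper's point is precisely to relax this). The bootstrap step is one simple way to make the imputations approximately proper; all names and values are illustrative.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)

# Internal calibration design: X observed only in the calibration subset.
n, n_cal, M = 1500, 250, 20
X = rng.normal(0, 1, n)
W = X + rng.normal(0, 0.7, n)
Y = 1.0 + 0.5 * X + rng.normal(0, 1, n)
cal = np.zeros(n, bool)
cal[rng.choice(n, n_cal, replace=False)] = True

betas, variances = [], []
for _ in range(M):
    # Bootstrap the calibration subset so each imputation reflects
    # uncertainty in the imputation-model parameters.
    idx = rng.choice(np.where(cal)[0], n_cal, replace=True)
    Z = sm.add_constant(np.column_stack([W[idx], Y[idx]]))
    imp_fit = sm.OLS(X[idx], Z).fit()

    # Impute X given (W, Y) where X is unobserved.
    X_imp = X.copy()
    Z_mis = sm.add_constant(np.column_stack([W[~cal], Y[~cal]]))
    X_imp[~cal] = imp_fit.predict(Z_mis) + rng.normal(0, np.sqrt(imp_fit.scale), (~cal).sum())

    fit = sm.OLS(Y, sm.add_constant(X_imp)).fit()
    betas.append(fit.params[1])
    variances.append(fit.bse[1] ** 2)

# Rubin's rules: pooled estimate and total variance.
b = np.mean(betas)
T = np.mean(variances) + (1 + 1 / M) * np.var(betas, ddof=1)
print("MI slope:", b, "SE:", np.sqrt(T))
```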

6.
Data collected in many epidemiological or clinical research studies are often contaminated with measurement errors that may be of classical or Berkson error type. The measurement error may also be a combination of both classical and Berkson errors, and failure to account for both errors could lead to unreliable inference in many situations. We consider regression analysis in generalized linear models when some covariates are prone to a mixture of Berkson and classical errors, and calibration data are available only for some subjects in a subsample. We propose an expected estimating equation approach to accommodate both errors in generalized linear regression analyses. The proposed method can consistently estimate the classical and Berkson error variances based on the available data, without knowing the mixture percentage. We investigated its finite-sample performance numerically. Our method is illustrated by an application to real data from an HIV vaccine study. Copyright © 2013 John Wiley & Sons, Ltd.
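The distinction between the two error types, and the mixture structure the method is designed for, can be illustrated with a short simulation; the expected estimating equation method itself is not reproduced here, and all names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10_000

# Classical error: the measurement scatters around the truth, W = X + U,
# so var(W) > var(X) and regressing on W attenuates slopes.
X_c = rng.normal(0, 1, n)
W_c = X_c + rng.normal(0, 0.5, n)

# Berkson error: the truth scatters around the assigned value, X = W + U
# (e.g., everyone at a site is assigned the site-average exposure),
# so var(X) > var(W) and a linear slope is not attenuated.
W_b = rng.normal(0, 1, n)
X_b = W_b + rng.normal(0, 0.5, n)

# Mixture: each subject's error is Berkson with probability p, else classical.
p = 0.4
is_berkson = rng.random(n) < p
W = np.where(is_berkson, W_b, W_c)
X = np.where(is_berkson, X_b, X_c)
print("var(W) - var(X):", W.var() - X.var())  # sign reflects the dominant error type
```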

7.
It is known that measurement error leads to bias in assessing exposure effects, which can, however, be corrected if independent replicates are available. For expensive replicates, two-stage (2S) studies that produce data 'missing by design' may be preferred over a single-stage (1S) study, because in the second stage measurement of replicates is restricted to a sample of first-stage subjects. Motivated by an occupational study on the acute effect of carbon black exposure on respiratory morbidity, we compare the performance of several bias-correction methods for both designs in a simulation study: an instrumental variable method (EVROS IV) based on grouping strategies, which has been recommended especially when measurement error is large, the regression calibration method, and the simulation extrapolation method. For the 2S design, either the problem of 'missing' data was ignored or the 'missing' data were imputed using multiple imputations. In both 1S and 2S designs with small or moderate measurement error, regression calibration was the preferred approach in terms of root mean square error. For 2S designs, regression calibration as implemented by Stata software is not recommended, in contrast to our implementation of this method; the 'problematic' implementation did, however, improve substantially when multiple imputations were used. The EVROS IV method, under a good/fairly good grouping, outperforms the regression calibration approach in both design scenarios when exposure mismeasurement is severe. In both 1S and 2S designs with moderate or large measurement error, simulation extrapolation severely failed to correct for bias. Copyright © 2012 John Wiley & Sons, Ltd.
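Simulation extrapolation (SIMEX) is the most algorithmic of the compared methods, so a minimal sketch may help. It assumes the measurement-error variance is known, and all names and values are illustrative.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)

n, sigma_u, beta = 1000, 0.7, 0.5
X = rng.normal(0, 1, n)
W = X + rng.normal(0, sigma_u, n)
Y = 1.0 + beta * X + rng.normal(0, 1, n)

# SIMEX: add extra error of known variance lambda * sigma_u^2, record how the
# naive slope degrades, then extrapolate back to lambda = -1 (no error).
lambdas = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
B = 50
mean_slopes = []
for lam in lambdas:
    slopes = [
        sm.OLS(Y, sm.add_constant(W + rng.normal(0, np.sqrt(lam) * sigma_u, n))).fit().params[1]
        for _ in range(B)
    ]
    mean_slopes.append(np.mean(slopes))

# Quadratic extrapolant in lambda, evaluated at lambda = -1.
coef = np.polyfit(lambdas, mean_slopes, 2)
print("naive slope:", mean_slopes[0])
print("SIMEX slope:", np.polyval(coef, -1.0))  # closer to beta = 0.5
```

The quadratic extrapolant is a common default; the extrapolation step is exactly where SIMEX can fail when measurement error is large, consistent with the findings above.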

8.
In the development of risk prediction models, predictors are often measured with error. In this paper, we investigate the impact of covariate measurement error on risk prediction. We compare the prediction performance using a costly variable measured without error, along with error-free covariates, to that of a model based on an inexpensive surrogate along with the error-free covariates. We consider continuous error-prone covariates with homoscedastic and heteroscedastic errors, and also a discrete misclassified covariate. Prediction performance is evaluated by the area under the receiver operating characteristic curve (AUC), the Brier score (BS), and the ratio of the observed to the expected number of events (calibration). In an extensive numerical study, we show that (i) the prediction model with the error-prone covariate is very well calibrated, even when it is mis-specified; (ii) using the error-prone covariate instead of the true covariate can reduce the AUC and increase the BS dramatically; (iii) adding an auxiliary variable, which is correlated with the error-prone covariate but conditionally independent of the outcome given all covariates in the true model, can improve the AUC and BS substantially. We conclude that reducing measurement error in covariates will improve the ensuing risk prediction, unless the association between the error-free and error-prone covariates is very high. Finally, we demonstrate how a validation study can be used to assess the effect of mismeasured covariates on risk prediction. These concepts are illustrated in a breast cancer risk prediction model developed in the Nurses' Health Study. Copyright © 2015 John Wiley & Sons, Ltd.
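A small simulation along these lines, using illustrative names and standard scikit-learn metrics for the AUC, Brier score, and observed-to-expected ratio:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, brier_score_loss

rng = np.random.default_rng(6)

n = 20_000
X = rng.normal(0, 1, n)                 # costly covariate, error-free
Z = rng.normal(0, 1, n)                 # inexpensive error-free covariate
W = X + rng.normal(0, 1.0, n)           # cheap error-prone surrogate for X
p = 1 / (1 + np.exp(-(-1.0 + 1.0 * X + 0.5 * Z)))
y = rng.binomial(1, p)

train, test = np.arange(n) < n // 2, np.arange(n) >= n // 2

for label, cov in [("true X     ", X), ("surrogate W", W)]:
    D = np.column_stack([cov, Z])
    m = LogisticRegression().fit(D[train], y[train])
    phat = m.predict_proba(D[test])[:, 1]
    print(label,
          "AUC:", round(roc_auc_score(y[test], phat), 3),
          "Brier:", round(brier_score_loss(y[test], phat), 3),
          "O/E:", round(y[test].mean() / phat.mean(), 3))
```

The surrogate model typically shows a markedly lower AUC and higher Brier score while its observed-to-expected ratio stays near 1, mirroring finding (i) above.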

9.
Estimating and testing interactions in a linear regression model when normally distributed explanatory variables are subject to classical measurement error is complex, since the interaction term is a product of two variables and involves errors of more complex structure. Our aim is to develop simple methods, based on the method of moments (MM) and regression calibration (RC), that yield consistent estimators of the regression coefficients and their standard errors when the model includes one or more interactions. In contrast to previous work using the structural equations modeling framework, our methods allow errors that are correlated with each other and can deal with measurements of relatively low reliability. Using simulations, we show that, under the normality assumptions, the RC method yields estimators with negligible bias and is superior to MM in both bias and variance. We also show that the RC method yields the correct type I error rate for the test of the interaction. However, when the true covariates are not normally distributed, we recommend using MM. We provide an example relating homocysteine to serum folate and B12 levels.

10.
Exposure measurement error is a problem in many epidemiological studies, including those using biomarkers and measures of dietary intake. Measurement error typically results in biased estimates of exposure-disease associations, the severity and nature of the bias depending on the form of the error. To correct for the effects of measurement error, information additional to the main study data is required. Ideally, this is a validation sample in which the true exposure is observed. However, in many situations it is not feasible to observe the true exposure, but one or more repeated exposure measurements may be available, for example, blood pressure or dietary intake recorded at two time points. The aim of this paper is to provide a toolkit for measurement error correction using repeated measurements. We bring together methods covering classical measurement error and several departures from classical error: systematic, heteroscedastic, and differential error. The correction methods considered are regression calibration, which is already widely used in the classical error setting, and moment reconstruction and multiple imputation, which are newer approaches with the ability to handle differential error. We emphasize practical application of the methods in nutritional epidemiology and other fields. We primarily consider continuous exposures in the exposure-outcome model, but we also outline methods for use when continuous exposures are categorized. The methods are illustrated using data from a study of the association between fibre intake and colorectal cancer, where fibre intake is measured using a diet diary and repeated measures are available for a subset. © 2014 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd.
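Of the three methods in this toolkit, moment reconstruction has the most compact closed form: X_MR = E(W|Y) + (W - E(W|Y)) G(Y), with G(Y) = sqrt{Var(X|Y) / Var(W|Y)}. A sketch with a binary outcome and two replicates follows; the names are illustrative and the paper covers more general settings.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5000
X = rng.normal(0, 1, n)
Y = rng.binomial(1, 1 / (1 + np.exp(-X)))        # binary outcome
W1 = X + rng.normal(0, 0.6, n)                   # two replicate measurements
W2 = X + rng.normal(0, 0.6, n)

# Error variance from replicate differences: var(W1 - W2) = 2 * var(U).
var_u = 0.5 * np.var(W1 - W2, ddof=1)

# Moment reconstruction: within each outcome group, give W the same first
# two moments that X has: X_MR = E(W|Y) + (W - E(W|Y)) * G(Y).
X_mr = np.empty(n)
for y in (0, 1):
    g = Y == y
    w = W1[g]
    var_w = np.var(w, ddof=1)
    G = np.sqrt(max(var_w - var_u, 0.0) / var_w)  # sqrt(var(X|Y) / var(W|Y))
    X_mr[g] = w.mean() + (w - w.mean()) * G

print(np.var(X_mr), np.var(X))  # reconstructed values match the moments of X
```

Because X_MR preserves the joint moments of (X, Y), it can be substituted for W even under differential error, unlike the RC substitution.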

11.
Nutritional epidemiology relies largely on self-reported measures of dietary intake, errors in which bias estimated diet–disease associations. Self-reported measurements come from questionnaires and food records. Unbiased biomarkers are scarce; however, surrogate biomarkers, which are correlated with intake but not unbiased, can also be useful. It is important to quantify and correct for the effects of measurement error on diet–disease associations. Challenges arise because there is no gold standard, and errors in self-reported measurements are correlated with true intake and with each other. We describe an extended model for error in questionnaire, food record, and surrogate biomarker measurements. The focus is on estimating the degree of bias in estimated diet–disease associations due to measurement error. In particular, we propose using sensitivity analyses to assess the impact of changes in values of model parameters which are usually assumed fixed. The methods are motivated by and applied to measures of fruit and vegetable intake from questionnaires, 7-day diet diaries, and a surrogate biomarker (plasma vitamin C) from over 25,000 participants in the Norfolk cohort of the European Prospective Investigation into Cancer and Nutrition. Our results show that the estimated effects of error in self-reported measurements are highly sensitive to model assumptions, resulting in anything from a large attenuation to a small amplification in the diet–disease association. Commonly made assumptions could result in a large overcorrection for the effects of measurement error. Increased understanding of relationships between potential surrogate biomarkers and true dietary intake is essential for obtaining good estimates of the effects of measurement error in self-reported measurements on observed diet–disease associations. Copyright © 2013 John Wiley & Sons, Ltd.

12.
Wang CY, Huang Y. Statistics in Medicine 2003;22(16):2577–2590.
We consider regression analysis of a disease outcome in relation to longitudinal data arising from a random effects model. The covariates of interest are the values of the underlying trajectory at certain time points, which may be fixed or subject-specific. Because the underlying random coefficients are unknown, the covariates in the primary model are generally unobserved. In addition, measurements are often not taken at the time points of interest. A motivating example for our model is the effect of age at adiposity rebound, and the associated body mass index, on the risk of adult obesity. The adiposity rebound is the time point at which the trajectory of a child's body fatness declines to a minimum. This general error-in-timing problem applies to analyses in which time-dependent marker variables follow a polynomial model and the effect of a local maximum or minimum point is of interest. Directly applying estimated covariates, possibly obtained from estimated time points, may lead to bias. Estimation procedures based on expected estimating equations, regression calibration, and simulation extrapolation are applied to this problem.
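The source of the error-in-timing problem can be seen in a few lines: the rebound age is the vertex of a fitted quadratic and is therefore itself an estimate, so plugging it into the primary model carries sampling error, which the correction methods above address. Values here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(8)

# Noisy BMI trajectory for one child: a quadratic with a minimum
# (the adiposity rebound) at age -b1 / (2 * b2).
ages = np.linspace(2, 10, 12)
true_rebound = 5.5
bmi = 16 + 0.15 * (ages - true_rebound) ** 2 + rng.normal(0, 0.2, ages.size)

# Fit the quadratic and locate its vertex.
b2, b1, b0 = np.polyfit(ages, bmi, 2)
rebound_hat = -b1 / (2 * b2)
bmi_at_rebound = np.polyval([b2, b1, b0], rebound_hat)
print("estimated rebound age:", rebound_hat, "BMI at rebound:", bmi_at_rebound)
```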

13.
A measurement error model proposed previously allows for correlations between subject-specific biases and between random within-subject errors in the surrogates obtained from two modes of measurement. However, most of these model parameters are not identifiable from the standard validation study design, including, importantly, the attenuation factor needed to correct for bias in relative risk estimates due to measurement error. We propose validation study designs that permit estimation and inference for the attenuation factor and other parameters of interest when these correlations are present. We use an estimating equations framework to develop semi-parametric estimators for these parameters, exploiting instrumental variables techniques. The methods are illustrated through application to data from the Nurses' Health Study and Health Professionals' Follow-up Study, and comparisons are made to more restrictive models.

14.
Armstrong BG, Whittemore AS, Howe GR. Statistics in Medicine 1989;8(9):1151–1163; discussion 1165–1166.
We propose a method for estimating odds ratios from case-control data in which covariates are subject to measurement error. The measurement error may contain both a random component and a systematic difference between cases and controls (recall bias). A multivariate normal discriminant analysis model is assumed. If the distribution of measurement error is known, then a simple correction to naive (biased) estimates of odds ratios from logistic regression of disease on fallible measurements of covariates removes bias. The same correction yields confidence intervals and significance tests. We apply the proposed methods to data from a case-control study of colon cancer and diet.
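Under the multivariate normal discriminant model with classical error of known covariance, the correction for the random error component has a simple matrix form, sketched below under the standard attenuation relation beta_naive = Sigma_W^{-1} Sigma_X beta_true. The systematic case-control shift (recall bias) that the paper also handles is omitted here, and all names and values are illustrative.

```python
import numpy as np

def correct_log_or(beta_naive, sigma_w, sigma_e):
    """Correct naive log odds ratios for known classical error covariance:
    beta_true = (Sigma_W - Sigma_e)^{-1} Sigma_W beta_naive,
    where Sigma_X = Sigma_W - Sigma_e is the true-covariate covariance."""
    sigma_x = sigma_w - sigma_e
    return np.linalg.solve(sigma_x, sigma_w @ beta_naive)

beta_naive = np.array([0.30, 0.10])           # attenuated estimates from a logistic fit
sigma_w = np.array([[1.0, 0.3], [0.3, 1.0]])  # observed covariance of W
sigma_e = np.diag([0.4, 0.2])                 # known measurement-error covariance
print(correct_log_or(beta_naive, sigma_w, sigma_e))
```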

15.
The wide availability of multi-dimensional genomic data has spurred increasing interest in integrating multi-platform genomic data. Integrative analysis of the cancer genome landscape can potentially lead to a deeper understanding of the biological process of cancer. We integrate epigenetic (DNA methylation and microRNA expression) and gene expression data from the tumor genome to delineate the association between different aspects of the biological processes and brain tumor survival. To model the association, we employ a flexible semiparametric linear transformation model that incorporates both the main effects of these genomic measures and the possible interactions among them. We develop variance component tests to examine different coordinated effects by testing various subsets of model coefficients for the genomic markers. A Monte Carlo perturbation procedure is constructed to approximate the null distribution of the proposed test statistics. We further propose omnibus testing procedures to synthesize information from fitting various parsimonious sub-models to improve power. Simulation results suggest that our proposed testing procedures maintain proper size under the null and outperform standard score tests. We further illustrate the utility of our procedure in two genomic analyses of survival of glioblastoma multiforme patients. Copyright © 2016 John Wiley & Sons, Ltd.

16.
It is widely acknowledged that the predictive performance of clinical prediction models should be studied in patients who were not part of the data in which the model was derived. Out-of-sample performance can be hampered when predictors are measured differently at derivation and external validation, for instance, when predictors are measured using different measurement protocols or when tests are produced by different manufacturers. Although such heterogeneity in predictor measurement between derivation and validation data is common, its impact on out-of-sample performance is not well studied. Using analytical and simulation approaches, we examined the out-of-sample performance of prediction models under various scenarios of heterogeneous predictor measurement. These scenarios were defined and clarified using an established taxonomy of measurement error models. The results of our simulations indicate that predictor measurement heterogeneity can induce miscalibration of predictions and affects discrimination and overall predictive accuracy, to the extent that the prediction model may no longer be considered clinically useful. The measurement error taxonomy was found to be helpful in identifying and predicting the effects of heterogeneous predictor measurements between settings of prediction model derivation and validation. Our work indicates that homogeneity of measurement strategies across settings is of paramount importance in prediction research.

17.
Identification of the latency period for the effect of a time-varying exposure is key when assessing many environmental, nutritional, and behavioral risk factors. A pre-specified exposure metric involving an unknown latency parameter is often used in the statistical model for the exposure-disease relationship. Likelihood-based methods have been developed to estimate this latency parameter for generalized linear models but do not exist for scenarios where the exposure is measured with error, as is usually the case. Here, we explore the performance of naive estimators for both the latency parameter and the regression coefficients, which ignore exposure measurement error, assuming a linear measurement error model. We prove that, in many scenarios under this general measurement error setting, the least squares estimator for the latency parameter remains consistent, while the regression coefficient estimates are inconsistent as has previously been found in standard measurement error models where the primary disease model does not involve a latency parameter. Conditions under which this result holds are generalized to a wide class of covariance structures and mean functions. The findings are illustrated in a study of body mass index in relation to physical activity in the Health Professionals Follow-Up Study.

18.
In genetic association studies, mixed effects models have been widely used to detect pleiotropy, which occurs when one gene affects multiple phenotype traits. In particular, bivariate mixed effects models are useful for describing the association of a gene with a continuous trait and a binary trait. However, such models are inadequate for data with response mismeasurement, a feature that is often overlooked. It is well known that in univariate settings, ignoring mismeasurement in variables usually results in biased estimation. In this paper, we consider a setting with a bivariate outcome vector that contains a continuous component and a binary component, both subject to mismeasurement. We propose an induced likelihood approach and an EM algorithm method to handle measurement error in the continuous response and misclassification in the binary response simultaneously. Simulation studies confirm that the proposed methods successfully remove the bias induced by the response mismeasurement.

19.
Song X, Ma S. Statistics in Medicine 2008;27(16):3178–3190.
There has been substantial effort devoted to the analysis of censored failure times with covariates that are subject to measurement error. Previous studies have focused on right-censored survival data, but interval-censored survival data with covariate measurement error have yet to be investigated. Our study is partly motivated by analysis of data from the HIV clinical trial AIDS Clinical Trials Group (ACTG) 175, where the occurrence time of AIDS is interval censored and the covariate CD4 count is subject to measurement error. We assume that the data are realized from a proportional hazards model. A multiple augmentation approach is proposed to convert interval-censored data to right-censored data, and the conditional score approach is then employed to account for measurement error. The proposed approach is easy to implement and can be readily extended to other semiparametric models. Extensive simulations show that the proposed approach has satisfactory finite-sample performance. The ACTG 175 data are then analyzed.
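The augmentation step alone can be sketched as follows, drawing exact event times within the censoring intervals and fitting a standard Cox model to each augmented dataset. The uniform draw and all names are illustrative simplifications of the paper's scheme, the conditional-score correction for covariate measurement error is omitted, and the sketch assumes the lifelines package is available.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(9)

n, M, beta = 800, 10, 0.5
x = rng.normal(0, 1, n)
T = rng.exponential(np.exp(-beta * x))          # true event times, hazard exp(beta * x)

# Interval censoring from periodic visits: the event is only known to lie
# between the last visit before it (L) and the first visit after it (R).
visits = np.arange(0.25, 3.01, 0.25)
L = np.array([visits[visits < t].max(initial=0.0) for t in T])
R = np.array([visits[visits >= t].min(initial=np.inf) for t in T])
event = np.isfinite(R)                          # inf => right-censored at the last visit

betas = []
for _ in range(M):
    # One augmentation: draw an exact time uniformly within each interval,
    # turning the data into ordinary right-censored survival data.
    high = np.where(event, R, L + 1.0)          # dummy bound for censored rows
    t_aug = np.where(event, rng.uniform(L, high), L)
    df = pd.DataFrame({"time": t_aug, "event": event.astype(int), "x": x})
    betas.append(CoxPHFitter().fit(df, "time", "event").params_["x"])

print("augmented Cox coefficient:", np.mean(betas))  # near the true 0.5
```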

20.
Measurement error in covariates can affect the accuracy of count data modeling and analysis. In overdispersion identification, the true mean–variance relationship can be obscured by measurement error in covariates. In this paper, we propose three tests for detecting overdispersion when covariates are measured with error: a modified score test and two score tests based on the proposed approximate likelihood and quasi-likelihood, respectively. The proposed approximate likelihood is derived under the classical measurement error model, and the resulting approximate maximum likelihood estimator is shown to have superior efficiency. Simulation results also show that the score test based on the approximate likelihood outperforms the test based on quasi-likelihood and other alternatives in terms of empirical power. By analyzing a real dataset containing the health-related quality-of-life measurements of a particular group of patients, we demonstrate the importance of the proposed methods by showing that analyses with and without measurement error correction yield significantly different results. Copyright © 2015 John Wiley & Sons, Ltd.
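For orientation, one classical (error-free) score statistic for overdispersion against the Poisson null is T = sum{(y_i - mu_i)^2 - y_i} / sqrt(2 sum mu_i^2), asymptotically standard normal; the paper's contribution is to modify such tests so they remain valid when covariates are mismeasured. A sketch with illustrative values:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)

# Counts with overdispersion: negative-binomial variance mu + mu^2 / 5.
n = 1000
x = rng.normal(0, 1, n)
mu = np.exp(0.5 + 0.4 * x)
y = rng.negative_binomial(n=5, p=5 / (5 + mu))   # mean mu, var mu + mu^2 / 5

# Classical score test for overdispersion against the Poisson null:
# large positive values of T indicate overdispersion.
fit = sm.GLM(y, sm.add_constant(x), family=sm.families.Poisson()).fit()
mu_hat = fit.fittedvalues
T = np.sum((y - mu_hat) ** 2 - y) / np.sqrt(2 * np.sum(mu_hat ** 2))
print("score statistic:", T)
```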
