Similar Articles
20 similar articles found.
1.
It is well known that measurement error in the covariates of regression models generally causes bias in parameter estimates. Correction for such biases requires information concerning the measurement error, which is often in the form of internal validation or replication data. Regression calibration (RC) is a popular approach to correct for covariate measurement error, which involves predicting the true covariate using error-prone measurements. Likelihood methods have previously been proposed as an alternative approach to estimate the parameters in models affected by measurement error, but have been relatively infrequently employed in medical statistics and epidemiology, partly because of computational complexity and concerns regarding robustness to distributional assumptions. We show how a standard random-intercepts model can be used to obtain maximum likelihood (ML) estimates when the outcome model is linear or logistic regression under certain normality assumptions, when internal error-prone replicate measurements are available. Through simulations we show that for linear regression, ML gives more efficient estimates than RC, although the gain is typically small. Furthermore, we show that RC and ML estimates remain consistent even when the normality assumptions are violated. For logistic regression, our implementation of ML is consistent if the true covariate is conditionally normal given the outcome, in contrast to RC. In simulations, this ML estimator showed less bias in situations where RC gives non-negligible biases. Our proposal makes the ML approach to dealing with covariate measurement error more accessible to researchers, which we hope will improve its viability as a useful alternative to methods such as RC. Copyright © 2009 John Wiley & Sons, Ltd.
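To make the RC step described above concrete, here is a minimal Python sketch for a linear outcome model with internal replicate measurements. The simulated data, variable names, and error variance are illustrative assumptions, not taken from the paper; the ML alternative discussed in the abstract would instead maximize a joint normal likelihood (e.g., via a random-intercepts model) rather than plug in a calibrated covariate.

```python
# Regression calibration (RC) with replicate error-prone measurements: a sketch.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n, k = 500, 2                                    # subjects, replicates per subject
x = rng.normal(0.0, 1.0, n)                      # true (unobserved) covariate
w = x[:, None] + rng.normal(0.0, 0.7, (n, k))    # error-prone replicates
y = 1.0 + 0.5 * x + rng.normal(0.0, 1.0, n)      # linear outcome model

wbar = w.mean(axis=1)
sigma2_u = np.mean(w.var(axis=1, ddof=1))        # within-subject error variance
sigma2_x = wbar.var(ddof=1) - sigma2_u / k       # between-subject (true) variance
lam = sigma2_x / (sigma2_x + sigma2_u / k)       # calibration (reliability) factor

x_hat = wbar.mean() + lam * (wbar - wbar.mean()) # predicted true covariate

naive = sm.OLS(y, sm.add_constant(wbar)).fit()   # attenuated slope
rc = sm.OLS(y, sm.add_constant(x_hat)).fit()     # RC-corrected slope
print(f"naive slope: {naive.params[1]:.3f}  RC slope: {rc.params[1]:.3f}  (true 0.5)")
```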

2.
Measurement error arises through a variety of mechanisms. A rich literature exists on the bias introduced by covariate measurement error and on methods of analysis to address this bias. By comparison, less attention has been given to errors in outcome assessment and nonclassical covariate measurement error. We consider an extension of the regression calibration method to settings with errors in a continuous outcome, where the errors may be correlated with prognostic covariates or with covariate measurement error. This method adjusts for the measurement error in the data and can be applied with either a validation subset, on which the true data are also observed (eg, a study audit), or a reliability subset, where a second observation of the error-prone measurements is available. For each case, we provide conditions under which the proposed method is identifiable and leads to consistent estimates of the regression parameter. When the second measurement on the reliability subset has no error or classical unbiased measurement error, the proposed method is consistent even when the primary outcome and exposures of interest are subject to both systematic and random error. We examine the performance of the method with simulations for a variety of measurement error scenarios and sizes of the reliability subset. We illustrate the method's application using data from the Women's Health Initiative Dietary Modification Trial.

3.
Multistate Markov regression models used for quantifying the effect size of state-specific covariates pertaining to the dynamics of multistate outcomes have gained popularity. However, measurements of a multistate outcome are prone to classification errors, particularly when a population-based survey relies on proxy measurements of the outcome for cost reasons. Such misclassification may distort the effect size of relevant covariates, such as the odds ratio used in epidemiology. We proposed a Bayesian measurement-error-driven hidden Markov regression model for calibrating these biased estimates with and without a 2-stage validation design. A simulation algorithm was developed to assess various scenarios of underestimation and overestimation given nondifferential misclassification (independent of covariates) and differential misclassification (dependent on covariates). We applied our proposed method to a community-based survey of androgenetic alopecia and found that the effect sizes of most covariates were inflated after calibration, regardless of the type of misclassification. Our proposed Bayesian measurement-error-driven hidden Markov regression model is practicable and effective in calibrating the effects of covariates on a multistate outcome, but a prior distribution on the measurement errors obtained from a 2-stage validation design is strongly recommended.

4.
Measurement error occurs when we observe error-prone surrogates rather than true values. It is common in observational studies and especially so in epidemiology, in nutritional epidemiology in particular. Correcting for measurement error has become common, and regression calibration is the most popular way to account for measurement error in continuous covariates. We consider its use in the context where there are validation data, which are used to calibrate the true values given the observed covariates. We allow for the case that the true value itself may not be observed in the validation data, but instead, a so-called reference measure is observed. The regression calibration method relies on certain assumptions. This paper examines possible biases in regression calibration estimators when some of these assumptions are violated. More specifically, we allow for the fact that (i) the reference measure may not necessarily be an ‘alloyed gold standard’ (i.e., unbiased) for the true value; (ii) there may be correlated random subject effects contributing to the surrogate and reference measures in the validation data; and (iii) the calibration model itself may not be the same in the validation study as in the main study; that is, it is not transportable. We expand on previous work to provide a general result, which characterizes the potential bias in regression calibration estimators resulting from any combination of the aforementioned violations. We then illustrate some of the general results with data from the Norwegian Women and Cancer Study. Copyright © 2015 John Wiley & Sons, Ltd.

5.
Statistical prediction methods typically require some form of fine-tuning of tuning parameter(s), with K-fold cross-validation as the canonical procedure. For ridge regression, there exist numerous procedures, but common to all, including cross-validation, is that one single parameter is chosen for all future predictions. We propose instead to calculate a unique tuning parameter for each individual for which we wish to predict an outcome. This generates an individualized prediction by focusing on the vector of covariates of a specific individual. The focused ridge (fridge) procedure is introduced with a 2-part contribution: first, we define an oracle tuning parameter minimizing the mean squared prediction error of a specific covariate vector, and then we propose to estimate this tuning parameter by using plug-in estimates of the regression coefficients and error variance parameter. The procedure is extended to logistic ridge regression by using the parametric bootstrap. For high-dimensional data, we propose to use ridge regression with cross-validation as the plug-in estimate, and simulations show that fridge gives smaller average prediction error than ridge with cross-validation for both simulated and real data. We illustrate the new concept for both linear and logistic regression models in 2 applications of personalized medicine: predicting individual risk and treatment response based on gene expression data. The method is implemented in the R package fridge.
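The core idea above can be sketched in a few lines: for one covariate vector x0, minimize an estimated mean squared prediction error over the ridge parameter, using plug-in estimates of the coefficients and error variance. This is an illustrative low-dimensional sketch assuming OLS plug-ins (the paper uses cross-validated ridge as the plug-in for high-dimensional data); the function name and data are made up.

```python
# Focused choice of the ridge tuning parameter for a single covariate vector x0.
import numpy as np
from scipy.optimize import minimize_scalar

def focused_lambda(X, y, x0, lam_max=1e4):
    n, p = X.shape
    XtX = X.T @ X
    beta_plug = np.linalg.solve(XtX, X.T @ y)             # plug-in coefficients (OLS here)
    sigma2 = np.sum((y - X @ beta_plug) ** 2) / (n - p)   # plug-in error variance

    def mspe(lam):
        A = np.linalg.solve(XtX + lam * np.eye(p), np.eye(p))  # (X'X + lam I)^{-1}
        bias = -lam * x0 @ A @ beta_plug                       # ridge bias at x0
        var = sigma2 * x0 @ A @ XtX @ A @ x0                   # prediction variance at x0
        return bias ** 2 + var

    lam = minimize_scalar(mspe, bounds=(0.0, lam_max), method="bounded").x
    beta_ridge = np.linalg.solve(XtX + lam * np.eye(p), X.T @ y)
    return lam, x0 @ beta_ridge                                # individualized prediction

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, 0.5, 0.0, 0.0, -0.5]) + rng.normal(size=100)
lam, pred = focused_lambda(X, y, X[0])
print(f"focused lambda for this individual: {lam:.2f}, prediction: {pred:.2f}")
```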

6.
When modeling longitudinal data, the true values of time-varying covariates may be unknown because of detection-limit censoring or measurement error. A common approach in the literature is to empirically model the covariate process based on observed data and then predict the censored values or mismeasured values based on this empirical model. Such an empirical model can be misleading, especially for censored values since the (unobserved) censored values may behave very differently than observed values due to the underlying data-generation mechanisms or disease status. In this paper, we propose a mechanistic nonlinear covariate model based on the underlying data-generation mechanisms to address censored values and mismeasured values. Such a mechanistic model is based on solid scientific or biological arguments, so the predicted censored or mismeasured values are more reasonable. We use a Monte Carlo EM algorithm for likelihood inference and apply the methods to an AIDS dataset, where viral load is censored by a lower detection limit. Simulation results confirm that the proposed models and methods offer substantial advantages over existing empirical covariate models for censored and mismeasured covariates.

7.
Li L, Palta M, Shao J. Statistics in Medicine 2004;23(16):2527-2536.
We study a linear model in which one of the covariates is measured with error. The surrogate for this covariate is the event count in unit time. We model the event count by a Poisson distribution, the rate of which is the unobserved true covariate. We show that ignoring the measurement error leads to inconsistent estimators of the regression coefficients and propose a set of unbiased estimating equations to correct the bias. The method is computationally simple and does not require using supplemental data as is often the case in other measurement error analyses. No distributional assumption is made for the unobserved covariate. The proposed method is illustrated with an example from the Wisconsin Sleep Cohort Study.
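The bias and its correction can be illustrated with a toy simulation. This is not the paper's estimating-equation estimator; it is a simpler method-of-moments correction in the same spirit, relying only on the Poisson property Var(W) = Var(X) + E[X], with all data simulated for illustration.

```python
# Attenuation from a Poisson event-count surrogate, and a simple moment correction.
import numpy as np

rng = np.random.default_rng(7)
n = 20000
x = rng.gamma(shape=4.0, scale=1.0, size=n)      # true, unobserved rate
w = rng.poisson(x)                               # surrogate: event count in unit time
y = 2.0 + 0.8 * x + rng.normal(0.0, 1.0, n)      # linear outcome in the true covariate

cov_yw = np.cov(y, w)[0, 1]
naive = cov_yw / w.var(ddof=1)                   # ignores measurement error -> attenuated
corrected = cov_yw / (w.var(ddof=1) - w.mean())  # subtracts the Poisson noise variance E[X]
print(f"naive slope: {naive:.3f}  corrected slope: {corrected:.3f}  (true 0.8)")
```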

8.
Causal inference practitioners are routinely presented with the challenge of model selection and, in particular, reducing the size of the covariate set with the goal of improving estimation efficiency. Collaborative targeted minimum loss-based estimation (CTMLE) is a general framework for constructing doubly robust semiparametric causal estimators that data-adaptively limit model complexity in the propensity score to optimize a preferred loss function. This stepwise complexity reduction is based on a loss function placed on a strategically updated model for the outcome variable through which the error is assessed using cross-validation. We demonstrate how the existing stepwise variable selection CTMLE can be generalized using regression shrinkage of the propensity score. We present 2 new algorithms that involve stepwise selection of the penalization parameter(s) in the regression shrinkage. Simulation studies demonstrate that, under a misspecified outcome model, mean squared error and bias can be reduced by a CTMLE procedure that separately penalizes individual covariates in the propensity score. We demonstrate these approaches in an example using electronic medical data with sparse indicator covariates to evaluate the relative safety of 2 similarly indicated asthma therapies for pregnant women with moderate asthma.

9.
The potential for bias due to misclassification error in regression analysis is well understood by statisticians and epidemiologists. Assuming little or no available data for estimating misclassification probabilities, investigators sometimes seek to gauge the sensitivity of an estimated effect to variations in the assumed values of those probabilities. We present an intuitive and flexible approach to such a sensitivity analysis, assuming an underlying logistic regression model. For outcome misclassification, we argue that a likelihood-based analysis is the cleanest and most preferable approach. In the case of covariate misclassification, we combine observed data on the outcome, error-prone binary covariate of interest, and other covariates measured without error, together with investigator-supplied values for sensitivity and specificity parameters, to produce corresponding positive and negative predictive values. These values serve as estimated weights to be used in fitting the model of interest to an appropriately defined expanded data set using standard statistical software. Jackknifing provides a convenient tool for incorporating uncertainty in the estimated weights into valid standard errors to accompany log odds ratio estimates obtained from the sensitivity analysis. Examples illustrate the flexibility of this unified strategy, and simulations suggest that it performs well relative to a maximum likelihood approach carried out via numerical optimization. Copyright © 2010 John Wiley & Sons, Ltd.
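The predictive-value weighting step can be sketched as follows. This minimal version omits the additional error-free covariates and the jackknife step described above, and the sensitivity/specificity values, variable names, and simulated data are assumptions for illustration only.

```python
# Sensitivity analysis for a misclassified binary covariate via predictive-value weights.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 4000
x = rng.binomial(1, 0.3, n)                           # true binary covariate
y = rng.binomial(1, 1 / (1 + np.exp(-(-1.0 + 1.0 * x))))  # binary outcome, log-OR = 1
x_star = np.where(x == 1, rng.binomial(1, 0.85, n), rng.binomial(1, 0.10, n))  # misclassified

se, sp = 0.85, 0.90                                   # investigator-supplied values
rows, wts = [], []
for yv in (0, 1):
    idx = y == yv
    p_star = x_star[idx].mean()                       # P(X*=1 | Y=yv)
    pi = np.clip((p_star - (1 - sp)) / (se - (1 - sp)), 1e-6, 1 - 1e-6)  # P(X=1 | Y=yv)
    ppv = se * pi / p_star
    npv = sp * (1 - pi) / (1 - p_star)
    for xs, w1 in ((1, ppv), (0, 1 - npv)):           # w1 = weight of the X=1 record
        m = int(np.sum(idx & (x_star == xs)))
        rows += [(yv, 1)] * m + [(yv, 0)] * m         # expanded data: X=1 and X=0 copies
        wts += [w1] * m + [1 - w1] * m

yy, xx = np.array(rows).T
fit = sm.GLM(yy, sm.add_constant(xx.astype(float)),
             family=sm.families.Binomial(), freq_weights=np.array(wts)).fit()
print(f"corrected log odds ratio: {fit.params[1]:.3f} (true 1.0)")
```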

10.
An adjustment for an uncorrelated covariate in a logistic regression changes the true value of an odds ratio for a unit increase in a risk factor. Even when there is no variation due to covariates, the odds ratio for a unit increase in a risk factor also depends on the distribution of the risk factor. We can use an instrumental variable to consistently estimate a causal effect in the presence of arbitrary confounding. With a logistic outcome model, we show that the simple ratio or two-stage instrumental variable estimate is consistent for the odds ratio for an increase in the population distribution of the risk factor equal to the change caused by a unit increase in the instrument, divided by the average change in the risk factor caused by that increase in the instrument. This odds ratio is conditional within the strata of the instrumental variable, but marginal across all other covariates, and is averaged across the population distribution of the risk factor. Where the proportion of variance in the risk factor explained by the instrument is small, this is similar to the odds ratio from an RCT without adjustment for any covariates, where the intervention corresponds to the effect of a change in the population distribution of the risk factor. This implies that the ratio or two-stage instrumental variable method is not biased, as has been suggested, but estimates a different quantity from the conditional odds ratio obtained from an adjusted multiple regression, a quantity that arguably has more relevance to an epidemiologist or a policy maker, especially in the context of Mendelian randomization. Copyright © 2013 John Wiley & Sons, Ltd.
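The two estimators discussed above are easy to compute; the sketch below shows both in a simulated Mendelian-randomization-style setting (all parameters and names are illustrative assumptions). Note that, as the abstract explains, these target a marginal odds ratio averaged over the confounder, so they need not equal the conditional log odds ratio (0.4) used to generate the data.

```python
# Ratio (Wald) and two-stage instrumental variable estimates with a logistic outcome.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
n = 50000
g = rng.binomial(2, 0.3, n).astype(float)        # instrument (e.g., allele count)
u = rng.normal(size=n)                           # unmeasured confounder
x = 0.3 * g + u + rng.normal(size=n)             # risk factor
y = rng.binomial(1, 1 / (1 + np.exp(-(-2.0 + 0.4 * x + u))))  # confounded outcome

# Ratio estimate: (log-OR of outcome per unit instrument) / (change in risk factor per unit instrument)
b_gy = sm.GLM(y, sm.add_constant(g), family=sm.families.Binomial()).fit().params[1]
b_gx = sm.OLS(x, sm.add_constant(g)).fit().params[1]
ratio_logOR = b_gy / b_gx

# Two-stage estimate: logistic regression of the outcome on the instrument-predicted risk factor
x_hat = sm.OLS(x, sm.add_constant(g)).fit().fittedvalues
tsls_logOR = sm.GLM(y, sm.add_constant(x_hat), family=sm.families.Binomial()).fit().params[1]

print(f"ratio log-OR: {ratio_logOR:.3f}  two-stage log-OR: {tsls_logOR:.3f}")
```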

11.
Covariate measurement error is often a feature of scientific data used for regression modelling. The consequences of such errors include a loss of power of tests of significance for the regression parameters corresponding to the true covariates. Power and sample size calculations that ignore covariate measurement error tend to overestimate power and underestimate the actual sample size required to achieve a desired power. In this paper we derive a novel measurement error corrected power function for generalized linear models using a generalized score test based on quasi-likelihood methods. Our power function is flexible in that it is adaptable to designs with a discrete or continuous scalar covariate (exposure) that can be measured with or without error, allows for additional confounding variables and applies to a broad class of generalized regression and measurement error models. A program is described that provides sample size or power for a continuous exposure with a normal measurement error model and a single normal confounder variable in logistic regression. We demonstrate the improved properties of our power calculations with simulations and numerical studies. An example is given from an ongoing study of cancer and exposure to arsenic as measured by toenail concentrations and tap water samples.
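A quick Monte Carlo check of the point made above, that power computed as if the exposure were measured exactly overstates the power achieved with an error-prone exposure. This is purely illustrative and uses simulation rather than the analytic corrected power function derived in the paper; all settings are assumptions.

```python
# Monte Carlo power for a logistic Wald test, with and without covariate measurement error.
import numpy as np
import statsmodels.api as sm

def power(n, beta, sigma_u, nsim=500, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(nsim):
        x = rng.normal(size=n)                        # true exposure
        w = x + rng.normal(0.0, sigma_u, n)           # error-prone measurement
        y = rng.binomial(1, 1 / (1 + np.exp(-(-1.0 + beta * x))))
        fit = sm.Logit(y, sm.add_constant(w)).fit(disp=0)
        hits += fit.pvalues[1] < alpha
    return hits / nsim

print("power, no measurement error   :", power(400, 0.4, sigma_u=0.0))
print("power, error variance = var(X):", power(400, 0.4, sigma_u=1.0))
```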

12.
Pooling biospecimens prior to performing laboratory assays is a useful tool to reduce costs, achieve minimum volume requirements and mitigate assay measurement error. When estimating the risk of a continuous, pooled exposure on a binary outcome, specialized statistical techniques are required. Current methods include a regression calibration approach, where the expectation of the individual-level exposure is calculated by adjusting the observed pooled measurement with additional covariate data. While this method employs a linear regression calibration model, we propose an alternative model that can accommodate log-linear relationships between the exposure and predictive covariates. The proposed model permits direct estimation of the relative risk associated with a log-transformation of an exposure measured in pools. Published 2016. This article is a U.S. Government work and is in the public domain in the USA.

13.
There are many settings in which the distribution of error in a mismeasured covariate varies with the value of another covariate. Take, for example, the case of HIV phylogenetic cluster size, large values of which are an indication of rapid HIV transmission. Researchers wish to find behavioral correlates of HIV phylogenetic cluster size; however, the distribution of its measurement error depends on the correctly measured variable, HIV status, and does not have a mean of zero. Further, it is not feasible to obtain validation data or repeated measurements. We propose an extension of simulation–extrapolation, an estimation technique for bias reduction in the presence of measurement error that does not require validation data and can accommodate errors whose distribution depends on other, error-free covariates. The proposed extension performs well in simulation, typically exhibiting less bias and variability than either regression calibration or multiple imputation for measurement error. We apply the proposed method to data from the province of Quebec in Canada to examine the association between HIV phylogenetic cluster size and the number of reported sex partners. Copyright © 2017 John Wiley & Sons, Ltd.
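For readers unfamiliar with the base technique being extended, here is a minimal sketch of standard simulation–extrapolation (SIMEX) with a known classical error variance and quadratic extrapolation. The paper's extension, which allows the error distribution to depend on another error-free covariate and to have a nonzero mean, is not reproduced here; the simulated data and settings are assumptions.

```python
# Standard SIMEX for a linear model with classical additive measurement error.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n, beta, sigma_u = 1000, 0.7, 0.8
x = rng.normal(size=n)
w = x + rng.normal(0.0, sigma_u, n)              # error-prone covariate
y = 1.0 + beta * x + rng.normal(size=n)

zetas, B = np.array([0.5, 1.0, 1.5, 2.0]), 100
slopes = [sm.OLS(y, sm.add_constant(w)).fit().params[1]]   # zeta = 0 (naive fit)
for z in zetas:
    est = []
    for _ in range(B):                            # simulation step: add extra error
        w_b = w + rng.normal(0.0, np.sqrt(z) * sigma_u, n)
        est.append(sm.OLS(y, sm.add_constant(w_b)).fit().params[1])
    slopes.append(np.mean(est))

grid = np.concatenate(([0.0], zetas))
coef = np.polyfit(grid, slopes, 2)               # extrapolation step: quadratic in zeta
simex_slope = np.polyval(coef, -1.0)             # evaluate at zeta = -1
print(f"naive: {slopes[0]:.3f}  SIMEX: {simex_slope:.3f}  (true {beta})")
```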

14.
Interactions between treatments and covariates in RCTs are a key topic. Standard methods for modelling treatment–covariate interactions with continuous covariates are categorisation or linear functions. Both approaches are easily criticised, but for different reasons. The multivariable fractional polynomial interaction approach, based on fractional polynomials with the linear interaction model as the simplest special case, has been proposed. Four variants of multivariable fractional polynomial interaction (FLEX1–FLEX4), allowing varying flexibility in functional form, were suggested. However, their properties are unknown, and comparisons with other procedures are unavailable. Additionally, we consider various methods based on categorisation and on cubic regression splines. We present the results of a simulation study to determine the significance level (probability of a type 1 error) of various tests for interaction between a binary covariate (‘treatment effect’) and a continuous covariate in univariate analysis. We consider a simplified setting in which the response variable is conditionally normally distributed, given the continuous covariate. We consider two main cases, with the covariate distribution either well behaved (approximately symmetric) or badly behaved (positively skewed). We construct nine scenarios with different functional forms for the main effect. In the well-behaved case, significance levels are in general acceptably close to nominal and are slightly better for the larger sample size (n = 250 and 500 were investigated). In the badly behaved case, departures from nominal are more pronounced for several approaches. For a final assessment of these results and recommendations for practice, a study of power is needed. Copyright © 2013 John Wiley & Sons, Ltd.

15.
Multiple imputation is commonly used to impute missing covariates in the Cox semiparametric regression setting. It fills in each missing value with plausible values via a Gibbs sampling procedure, specifying an imputation model for each missing variable. This imputation method is implemented in several software packages that offer imputation models steered by the type of the variable to be imputed, but all of these imputation models assume that covariates act linearly. However, this assumption is often not verified in practice, as covariates can have nonlinear effects. Such a linearity assumption can lead to misleading conclusions, because the imputation model should be constructed to reflect the true distributional relationship between the missing values and the observed values. To estimate nonlinear effects of continuous time-invariant covariates in the imputation model, we propose a method based on B-spline functions. To assess the performance of this method, we conducted a simulation study in which we compared multiple imputation using a Bayesian spline imputation model with multiple imputation using a Bayesian linear imputation model in a survival analysis setting. We evaluated the proposed method on the motivating data set, collected in HIV-infected patients enrolled in an observational cohort study in Senegal, which contains several incomplete variables. We found that our method performs well for estimating hazard ratios compared with the linear imputation methods when data are missing completely at random or missing at random. Copyright © 2013 John Wiley & Sons, Ltd.
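The key ingredient, letting a continuous predictor act nonlinearly inside the imputation model through a B-spline basis, can be sketched as below. This is a single "improper" imputation draw for illustration only; a full Bayesian multiple-imputation procedure, as in the paper, would also draw the imputation-model parameters and repeat the process several times, and would include the survival outcome information in the imputation model. All names and data are assumptions.

```python
# B-spline basis inside a normal imputation model for a partially missing covariate.
import numpy as np
import statsmodels.api as sm
from patsy import dmatrix

rng = np.random.default_rng(4)
n = 600
x = rng.uniform(-2, 2, n)                        # fully observed covariate
z = np.sin(1.5 * x) + rng.normal(0.0, 0.3, n)    # covariate to be imputed (nonlinear in x)
miss = rng.random(n) < 0.3                       # missing completely at random
z_obs = np.where(miss, np.nan, z)

# Spline basis built on the fully observed x, used for both fitting and prediction
basis = np.asarray(dmatrix("bs(x, df=5, degree=3) - 1", {"x": x}))
design = sm.add_constant(basis)

fit = sm.OLS(z_obs[~miss], design[~miss]).fit()
sigma = np.sqrt(fit.scale)
z_imp = z_obs.copy()
z_imp[miss] = fit.predict(design[miss]) + rng.normal(0.0, sigma, miss.sum())  # imputation draw

lin = sm.OLS(z_obs[~miss], sm.add_constant(x[~miss])).fit()   # linear imputation model
print(f"spline model R^2: {fit.rsquared:.2f}  linear model R^2: {lin.rsquared:.2f}")
```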

16.
We present a novel method for variable selection in regression models when covariates are measured with error. The iterative algorithm we propose, Measurement Error Boosting (MEBoost), follows a path defined by estimating equations that correct for covariate measurement error. We illustrate the use of MEBoost in practice by analyzing data from the Box Lunch Study, a clinical trial in nutrition in which several variables are based on self-report and, hence, measured with error; here we perform model selection over a large data set to identify variables related to the number of times a subject binge ate in the last 28 days. Furthermore, in a simulation study we evaluated our method and compared its performance to the recently proposed Convex Conditioned Lasso and to the “naive” Lasso, which does not correct for measurement error. Increasing the degree of measurement error increased prediction error and decreased the probability of accurate covariate selection, but this loss of accuracy occurred to a lesser degree when using MEBoost. Through simulations, we also make a case for the consistency of the model selected.
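A stylized sketch of the idea of following a path defined by measurement-error-corrected estimating equations is given below. This is not the authors' MEBoost algorithm; it assumes a linear model with classical additive error of known covariance, uses the corrected score W'(y - Wb)/n + Sigma_u b, and takes small coordinate-wise boosting steps along its largest component. All names and data are illustrative.

```python
# Coordinate-wise boosting along a measurement-error-corrected estimating equation (sketch).
import numpy as np

rng = np.random.default_rng(2)
n, p = 400, 10
beta = np.zeros(p); beta[:3] = [1.0, -0.8, 0.5]       # sparse true coefficients
X = rng.normal(size=(n, p))                            # true covariates (unobserved)
Sigma_u = 0.25 * np.eye(p)                             # assumed known error covariance
W = X + rng.multivariate_normal(np.zeros(p), Sigma_u, n)
y = X @ beta + rng.normal(size=n)

def corrected_boost(W, y, Sigma_u, steps=300, eps=0.01):
    b = np.zeros(W.shape[1])
    for _ in range(steps):
        score = W.T @ (y - W @ b) / len(y) + Sigma_u @ b   # corrected estimating equation
        j = np.argmax(np.abs(score))
        b[j] += eps * np.sign(score[j])                    # small step on one coordinate
    return b

b_hat = corrected_boost(W, y, Sigma_u)
print("largest estimated coefficients:", np.round(np.sort(np.abs(b_hat))[::-1][:4], 2))
```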

17.
In most epidemiological investigations, the study units are people, the outcome variable (or the response) is a health-related event, and the explanatory variables are usually environmental and/or socio-demographic factors. The fundamental task in such investigations is to quantify the association between the explanatory variables (covariates/exposures) and the outcome variable through a suitable regression model. The accuracy of such quantification depends on how precisely the relevant covariates are measured. In many instances, we cannot measure some of the covariates accurately. Rather, we can measure noisy (mismeasured) versions of them. In statistical terminology, mismeasurement in continuous covariates is known as measurement error or errors-in-variables. Regression analyses based on mismeasured covariates lead to biased inference about the true underlying response–covariate associations. In this paper, we suggest a flexible parametric approach for avoiding this bias when estimating the response–covariate relationship through a logistic regression model. More specifically, we consider the flexible generalized skew-normal and the flexible generalized skew-t distributions for modeling the unobserved true exposure. For inference and computational purposes, we use Bayesian Markov chain Monte Carlo techniques. We investigate the performance of the proposed flexible parametric approach in comparison with a common flexible parametric approach through extensive simulation studies. We also compare the proposed method with the competing flexible parametric method on a real-life data set. Though emphasis is put on the logistic regression model, the proposed method is unified and is applicable to other generalized linear models, and to other types of non-linear regression models as well. Copyright © 2009 John Wiley & Sons, Ltd.

18.
For time-to-event outcomes, a rich literature exists on the bias introduced by covariate measurement error in regression models, such as the Cox model, and methods of analysis to address this bias. By comparison, less attention has been given to understanding the impact or addressing errors in the failure time outcome. For many diseases, the timing of an event of interest (such as progression-free survival or time to AIDS progression) can be difficult to assess or reliant on self-report and therefore prone to measurement error. For linear models, it is well known that random errors in the outcome variable do not bias regression estimates. With nonlinear models, however, even random error or misclassification can introduce bias into estimated parameters. We compare the performance of 2 common regression models, the Cox and Weibull models, in the setting of measurement error in the failure time outcome. We introduce an extension of the SIMEX method to correct for bias in hazard ratio estimates from the Cox model and discuss other analysis options to address measurement error in the response. A formula to estimate the bias induced into the hazard ratio by classical measurement error in the event time for a log-linear survival model is presented. Detailed numerical studies are presented to examine the performance of the proposed SIMEX method under varying levels and parametric forms of the error in the outcome. We further illustrate the method with observational data on HIV outcomes from the Vanderbilt Comprehensive Care Clinic.

19.
If change over time is compared in several groups, it is important to take into account baseline values so that the comparison is carried out under the same preconditions. As the observed baseline measurements are distorted by measurement error, it may not be sufficient to include them as a covariate. By fitting a longitudinal mixed-effects model to all data including the baseline observations and subsequently calculating the expected change conditional on the underlying baseline value, a solution to this problem has been provided recently so that groups with the same baseline characteristics can be compared. In this article, we present an extended approach where a broader set of models can be used. Specifically, it is possible to include any desired set of interactions between the time variable and the other covariates, and also, time-dependent covariates can be included. Additionally, we extend the method to adjust for baseline measurement error of other time-varying covariates. We apply the methodology to data from the Swiss HIV Cohort Study to address the question of whether a joint infection with HIV-1 and hepatitis C virus leads to a slower increase of CD4 lymphocyte counts over time after the start of antiretroviral therapy. Copyright © 2013 John Wiley & Sons, Ltd.

20.
Problems common to many longitudinal HIV/AIDS, cancer, vaccine, and environmental exposure studies are the presence of a lower limit of quantification for a skewed outcome and of time-varying covariates measured with error. There has been relatively little work published that deals simultaneously with these features of longitudinal data. In particular, left-censored data falling below a limit of detection may sometimes have a proportion larger than expected under a usually assumed log-normal distribution. In such cases, alternative models, which can account for a high proportion of censored data, should be considered. In this article, we present an extension of the Tobit model that incorporates a mixture of true undetectable observations and values from a skew-normal distribution for an outcome with possible left censoring and skewness, and covariates with substantial measurement error. To quantify the covariate process, we offer a flexible nonparametric mixed-effects model within the Tobit framework. A Bayesian modeling approach is used to assess the simultaneous impact of left censoring, skewness, and measurement error in covariates on inference. The proposed methods are illustrated using real data from an AIDS clinical study. Copyright © 2013 John Wiley & Sons, Ltd.
