期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

An alternative empirical likelihood method in missing response problems and causal inference

Kaili Ren Christopher A. Drummond Pamela S. Brewster Steven T. Haller Jiang Tian Christopher J. Cooper Biao Zhang 《Statistics in medicine》2016,35(27):5009-5028

Missing responses are common problems in medical, social, and economic studies. When responses are missing at random, a complete case data analysis may result in biases. A popular debias method is inverse probability weighting proposed by Horvitz and Thompson. To improve efficiency, Robins et al. proposed an augmented inverse probability weighting method. The augmented inverse probability weighting estimator has a double‐robustness property and achieves the semiparametric efficiency lower bound when the regression model and propensity score model are both correctly specified. In this paper, we introduce an empirical likelihood‐based estimator as an alternative to Qin and Zhang (2007). Our proposed estimator is also doubly robust and locally efficient. Simulation results show that the proposed estimator has better performance when the propensity score is correctly modeled. Moreover, the proposed method can be applied in the estimation of average treatment effect in observational causal inferences. Finally, we apply our method to an observational study of smoking, using data from the Cardiovascular Outcomes in Renal Atherosclerotic Lesions clinical trial. Copyright © 2016 John Wiley & Sons, Ltd. 相似文献

2.

Generalized estimating equations to estimate the ordered stereotype logit model for panel data

Martin Spiess Daniel Fernández Thuong Nguyen Ivy Liu 《Statistics in medicine》2020,39(14):1919-1940

By modeling the effects of predictor variables as a multiplicative function of regression parameters being invariant over categories, and category-specific scalar effects, the ordered stereotype logit model is a flexible regression model for ordinal response variables. In this article, we propose a generalized estimating equations (GEE) approach to estimate the ordered stereotype logit model for panel data based on working covariance matrices, which are not required to be correctly specified. A simulation study compares the performance of GEE estimators based on various working correlation matrices and working covariance matrices using local odds ratios. Estimation of the model is illustrated using a real-world dataset. The results from the simulation study suggest that GEE estimation of this model is feasible in medium-sized and large samples and that estimators based on local odds ratios as realized in this study tend to be less efficient compared with estimators based on a working correlation matrix. For low true correlations, the efficiency gains seem to be rather small and if the working covariance structure is too flexible, the corresponding estimator may even be less efficient compared with the GEE estimator assuming independence. Like for GEE estimators more generally, if the true correlations over time are high, then a working covariance structure which is close to the true structure can lead to considerable efficiency gains compared with assuming independence. 相似文献

3.

Doubly robust estimation of causal effects

Funk MJ Westreich D Wiesen C Stürmer T Brookhart MA Davidian M 《American journal of epidemiology》2011,173(7):761-767

相似文献

4.

Double robust estimator of average causal treatment effect for censored medical cost data

下载免费PDF全文

Xuan Wang Lauren A. Beste Marissa M. Maier Xiao‐Hua Zhou 《Statistics in medicine》2016,35(18):3101-3116

In observational studies, estimation of average causal treatment effect on a patient's response should adjust for confounders that are associated with both treatment exposure and response. In addition, the response, such as medical cost, may have incomplete follow‐up. In this article, a double robust estimator is proposed for average causal treatment effect for right censored medical cost data. The estimator is double robust in the sense that it remains consistent when either the model for the treatment assignment or the regression model for the response is correctly specified. Double robust estimators increase the likelihood the results will represent a valid inference. Asymptotic normality is obtained for the proposed estimator, and an estimator for the asymptotic variance is also derived. Simulation studies show good finite sample performance of the proposed estimator and a real data analysis using the proposed method is provided as illustration. Copyright © 2016 John Wiley & Sons, Ltd. 相似文献

5.

Doubly robust generalized estimating equations for longitudinal data

Shaun Seaman Andrew Copas 《Statistics in medicine》2009,28(6):937-955

A popular method for analysing repeated‐measures data is generalized estimating equations (GEE). When response data are missing at random (MAR), two modifications of GEE use inverse‐probability weighting and imputation. The weighted GEE (WGEE) method involves weighting observations by their inverse probability of being observed, according to some assumed missingness model. Imputation methods involve filling in missing observations with values predicted by an assumed imputation model. WGEE are consistent when the data are MAR and the dropout model is correctly specified. Imputation methods are consistent when the data are MAR and the imputation model is correctly specified. Recently, doubly robust (DR) methods have been developed. These involve both a model for probability of missingness and an imputation model for the expectation of each missing observation, and are consistent when either is correct. We describe DR GEE, and illustrate their use on simulated data. We also analyse the INITIO randomized clinical trial of HIV therapy allowing for MAR dropout. Copyright © 2009 John Wiley & Sons, Ltd. 相似文献

6.

Best linear inverse probability weighted estimation for two-phase designs and missing covariate regression

Ching-Yun Wang James Dai 《Statistics in medicine》2019,38(15):2783-2796

The inverse probability weighted estimator is often applied to two-phase designs and regression with missing covariates. Inverse probability weighted estimators typically are less efficient than likelihood-based estimators but, in general, are more robust against model misspecification. In this paper, we propose a best linear inverse probability weighted estimator for two-phase designs and missing covariate regression. Our proposed estimator is the projection of the SIPW onto the orthogonal complement of the score space based on a working regression model of the observed covariate data. The efficiency gain is from the use of the association between the outcome variable and the available covariates, which is the working regression model. One advantage of the proposed estimator is that there is no need to calculate the augmented term of the augmented weighted estimator. The estimator can be applied to general missing data problems or two-phase design studies in which the second phase data are obtained in a subcohort. The method can also be applied to secondary trait case-control genetic association studies. The asymptotic distribution is derived, and the finite sample performance of the proposed estimator is examined via extensive simulation studies. The methods are applied to a bladder cancer case-control study. 相似文献

7.

Covariate adjustment in randomized trials with binary outcomes: targeted maximum likelihood estimation

Moore KL van der Laan MJ 《Statistics in medicine》2009,28(1):39-64

Covariate adjustment using linear models for continuous outcomes in randomized trials has been shown to increase efficiency and power over the unadjusted method in estimating the marginal effect of treatment. However, for binary outcomes, investigators generally rely on the unadjusted estimate as the literature indicates that covariate-adjusted estimates based on the logistic regression models are less efficient. The crucial step that has been missing when adjusting for covariates is that one must integrate/average the adjusted estimate over those covariates in order to obtain the marginal effect. We apply the method of targeted maximum likelihood estimation (tMLE) to obtain estimators for the marginal effect using covariate adjustment for binary outcomes. We show that the covariate adjustment in randomized trials using the logistic regression models can be mapped, by averaging over the covariate(s), to obtain a fully robust and efficient estimator of the marginal effect, which equals a targeted maximum likelihood estimator. This tMLE is obtained by simply adding a clever covariate to a fixed initial regression. We present simulation studies that demonstrate that this tMLE increases efficiency and power over the unadjusted method, particularly for smaller sample sizes, even when the regression model is mis-specified. 相似文献

8.

Propensity score and doubly robust methods for estimating the effect of treatment on censored cost

下载免费PDF全文

Jiaqi Li Elizabeth Handorf Justin Bekelman Nandita Mitra 《Statistics in medicine》2016,35(12):1985-1999

The estimation of treatment effects on medical costs is complicated by the need to account for informative censoring, skewness, and the effects of confounders. Because medical costs are often collected from observational claims data, we investigate propensity score (PS) methods such as covariate adjustment, stratification, and inverse probability weighting taking into account informative censoring of the cost outcome. We compare these more commonly used methods with doubly robust (DR) estimation. We then use a machine learning approach called super learner (SL) to choose among conventional cost models to estimate regression parameters in the DR approach and to choose among various model specifications for PS estimation. Our simulation studies show that when the PS model is correctly specified, weighting and DR perform well. When the PS model is misspecified, the combined approach of DR with SL can still provide unbiased estimates. SL is especially useful when the underlying cost distribution comes from a mixture of different distributions or when the true PS model is unknown. We apply these approaches to a cost analysis of two bladder cancer treatments, cystectomy versus bladder preservation therapy, using SEER‐Medicare data. Copyright © 2015 John Wiley & Sons, Ltd. 相似文献

9.

The analysis of binary longitudinal data with time-dependent covariates

Guerra MW Shults J Amsterdam J Ten-Have T 《Statistics in medicine》2012,31(10):931-948

We consider longitudinal studies with binary outcomes that are measured repeatedly on subjects over time. The goal of our analysis was to fit a logistic model that relates the expected value of the outcomes with explanatory variables that are measured on each subject. However, additional care must be taken to adjust for the association between the repeated measurements on each subject. We propose a new maximum likelihood method for covariates that may be fixed or time varying. We also implement and make comparisons with two other approaches: generalized estimating equations, which may be more robust to misspecification of the true correlation structure, and alternating logistic regression, which models association via odds ratios that are subject to less restrictive constraints than are correlations. The proposed estimation procedure will yield consistent and asymptotically normal estimates of the regression and correlation parameters if the correlation on consecutive measurements on a subject is correctly specified. Simulations demonstrate that our approach can yield improved efficiency in estimation of the regression parameter; for equally spaced and complete data, the gains in efficiency were greatest for the parameter associated with a time-by-group interaction term and for stronger values of the correlation. For unequally spaced data and with dropout according to a missing-at-random mechanism, MARK1ML with correctly specified consecutive correlations yielded substantial improvements in terms of both bias and efficiency. We present an analysis to demonstrate application of the methods we consider. We also offer an R function for easy implementation of our approach. 相似文献

10.

Quantile regression and empirical likelihood for the analysis of longitudinal data with monotone missing responses due to dropout,with applications to quality of life measurements from clinical trials

Yang Lv Guoyou Qin Zhongyi Zhu Dongsheng Tu 《Statistics in medicine》2019,38(16):2972-2991

The analysis of quality of life (QoL) data can be challenging due to the skewness of responses and the presence of missing data. In this paper, we propose a new weighted quantile regression method for estimating the conditional quantiles of QoL data with responses missing at random. The proposed method makes use of the correlation information within the same subject from an auxiliary mean regression model to enhance the estimation efficiency and takes into account of missing data mechanism. The asymptotic properties of the proposed estimator have been studied and simulations are also conducted to evaluate the performance of the proposed estimator. The proposed method has also been applied to the analysis of the QoL data from a clinical trial on early breast cancer, which motivated this study. 相似文献

11.

Doubly robust inference for targeted minimum loss–based estimation in randomized trials with missing outcome data

下载免费PDF全文

Iván Díaz Mark J. van der Laan 《Statistics in medicine》2017,36(24):3807-3819

Missing outcome data is a crucial threat to the validity of treatment effect estimates from randomized trials. The outcome distributions of participants with missing and observed data are often different, which increases bias. Causal inference methods may aid in reducing the bias and improving efficiency by incorporating baseline variables into the analysis. In particular, doubly robust estimators incorporate 2 nuisance parameters: the outcome regression and the missingness mechanism (ie, the probability of missingness conditional on treatment assignment and baseline variables), to adjust for differences in the observed and unobserved groups that can be explained by observed covariates. To consistently estimate the treatment effect, one of these nuisance parameters must be consistently estimated. Traditionally, nuisance parameters are estimated using parametric models, which often precludes consistency, particularly in moderate to high dimensions. Recent research on missing data has focused on data‐adaptive estimation to help achieve consistency, but the large sample properties of such methods are poorly understood. In this article, we discuss a doubly robust estimator that is consistent and asymptotically normal under data‐adaptive estimation of the nuisance parameters. We provide a formula for an asymptotically exact confidence interval under minimal assumptions. We show that our proposed estimator has smaller finite‐sample bias compared to standard doubly robust estimators. We present a simulation study demonstrating the enhanced performance of our estimators in terms of bias, efficiency, and coverage of the confidence intervals. We present the results of an illustrative example: a randomized, double‐blind phase 2/3 trial of antiretroviral therapy in HIV‐infected persons. 相似文献

12.

Use of empirical likelihood to calibrate auxiliary information in partly linear monotone regression models

Baojiang Chen Jing Qin 《Statistics in medicine》2014,33(10):1713-1722

In statistical analysis, a regression model is needed if one is interested in finding the relationship between a response variable and covariates. When the response depends on the covariate, then it may also depend on the function of this covariate. If one has no knowledge of this functional form but expect for monotonic increasing or decreasing, then the isotonic regression model is preferable. Estimation of parameters for isotonic regression models is based on the pool‐adjacent‐violators algorithm (PAVA), where the monotonicity constraints are built in. With missing data, people often employ the augmented estimating method to improve estimation efficiency by incorporating auxiliary information through a working regression model. However, under the framework of the isotonic regression model, the PAVA does not work as the monotonicity constraints are violated. In this paper, we develop an empirical likelihood‐based method for isotonic regression model to incorporate the auxiliary information. Because the monotonicity constraints still hold, the PAVA can be used for parameter estimation. Simulation studies demonstrate that the proposed method can yield more efficient estimates, and in some situations, the efficiency improvement is substantial. We apply this method to a dementia study. Copyright © 2013 John Wiley & Sons, Ltd. 相似文献

13.

Statistical evidence for GLM regression parameters: a robust likelihood approach

Blume JD Su L Olveda RM McGarvey ST 《Statistics in medicine》2007,26(15):2919-2936

When a likelihood ratio is used to measure the strength of evidence for one hypothesis over another, its reliability (i.e. how often it produces misleading evidence) depends on the specification of the working model. When the working model happens to be the 'true' or 'correct' model, the probability of observing strong misleading evidence is low and controllable. But this is not necessarily the case when the working model is misspecified. Royall and Tsou (J. R. Stat. Soc., Ser. B 2003; 65:391-404) show how to adjust working models to make them robust to misspecification. Likelihood ratios derived from their 'robust adjusted likelihood' are just as reliable (asymptotically) as if the working model were correctly specified in the first place. In this paper, we apply and extend these ideas to the generalized linear model (GLM) regression setting. We provide several illustrations (both from simulated data and real data concerning rates of parasitic infection in Philippine adolescents), show how the required adjustment factor can be obtained from standard statistical software, and draw some connections between this approach and the 'sandwich estimator' for robust standard errors of regression parameters. This substantially broadens the availability and the viability of likelihood methods for measuring statistical evidence in regression settings. 相似文献

14.

Statistical inference for missing data mechanisms

Yang Zhao 《Statistics in medicine》2020,39(28):4325-4333

In the literature of statistical analysis with missing data there is a significant gap in statistical inference for missing data mechanisms especially for nonmonotone missing data, which has essentially restricted the use of the estimation methods which require estimating the missing data mechanisms. For example, the inverse probability weighting methods (Horvitz & Thompson, 1952; Little & Rubin, 2002), including the popular augmented inverse probability weighting (Robins et al, 1994), depend on sufficient models for the missing data mechanisms to reduce estimation bias while improving estimation efficiency. This research proposes a semiparametric likelihood method for estimating missing data mechanisms where an EM algorithm with closed form expressions for both E-step and M-step is used in evaluating the estimate (Zhao et al, 2009; Zhao, 2020). The asymptotic variance of the proposed estimator is estimated from the profile score function. The methods are general and robust. Simulation studies in various missing data settings are performed to examine the finite sample performance of the proposed method. Finally, we analysis the missing data mechanism of Duke cardiac catheterization coronary artery disease diagnostic data to illustrate the method. 相似文献

15.

Correction of bias from non‐random missing longitudinal data using auxiliary information

Cuiling Wang Charles B. Hall 《Statistics in medicine》2010,29(6):671-679

Missing data are common in longitudinal studies due to drop‐out, loss to follow‐up, and death. Likelihood‐based mixed effects models for longitudinal data give valid estimates when the data are missing at random (MAR). These assumptions, however, are not testable without further information. In some studies, there is additional information available in the form of an auxiliary variable known to be correlated with the missing outcome of interest. Availability of such auxiliary information provides us with an opportunity to test the MAR assumption. If the MAR assumption is violated, such information can be utilized to reduce or eliminate bias when the missing data process depends on the unobserved outcome through the auxiliary information. We compare two methods of utilizing the auxiliary information: joint modeling of the outcome of interest and the auxiliary variable, and multiple imputation (MI). Simulation studies are performed to examine the two methods. The likelihood‐based joint modeling approach is consistent and most efficient when correctly specified. However, mis‐specification of the joint distribution can lead to biased results. MI is slightly less efficient than a correct joint modeling approach and can also be biased when the imputation model is mis‐specified, though it is more robust to mis‐specification of the imputation distribution when all the variables affecting the missing data mechanism and the missing outcome are included in the imputation model. An example is presented from a dementia screening study. Copyright © 2009 John Wiley & Sons, Ltd. 相似文献

16.

Statistical analysis with missing exposure data measured by proxy respondents: a misclassification problem within a missing‐data problem

Michelle Shardell Gregory E. Hicks 《Statistics in medicine》2014,33(25):4437-4452

In studies of older adults, researchers often recruit proxy respondents, such as relatives or caregivers, when study participants cannot provide self‐reports (e.g., because of illness). Proxies are usually only sought to report on behalf of participants with missing self‐reports; thus, either a participant self‐report or proxy report, but not both, is available for each participant. Furthermore, the missing‐data mechanism for participant self‐reports is not identifiable and may be nonignorable. When exposures are binary and participant self‐reports are conceptualized as the gold standard, substituting error‐prone proxy reports for missing participant self‐reports may produce biased estimates of outcome means. Researchers can handle this data structure by treating the problem as one of misclassification within the stratum of participants with missing self‐reports. Most methods for addressing exposure misclassification require validation data, replicate data, or an assumption of nondifferential misclassification; other methods may result in an exposure misclassification model that is incompatible with the analysis model. We propose a model that makes none of the aforementioned requirements and still preserves model compatibility. Two user‐specified tuning parameters encode the exposure misclassification model. Two proposed approaches estimate outcome means standardized for (potentially) high‐dimensional covariates using multiple imputation followed by propensity score methods. The first method is parametric and uses maximum likelihood to estimate the exposure misclassification model (i.e., the imputation model) and the propensity score model (i.e., the analysis model); the second method is nonparametric and uses boosted classification and regression trees to estimate both models. We apply both methods to a study of elderly hip fracture patients. Copyright © 2014 John Wiley & Sons, Ltd. 相似文献

17.

Analyzing semi‐competing risks data with missing cause of informative terminal event

下载免费PDF全文

Renke Zhou Hong Zhu Melissa Bondy Jing Ning 《Statistics in medicine》2017,36(5):738-753

Cancer studies frequently yield multiple event times that correspond to landmarks in disease progression, including non‐terminal events (i.e., cancer recurrence) and an informative terminal event (i.e., cancer‐related death). Hence, we often observe semi‐competing risks data. Work on such data has focused on scenarios in which the cause of the terminal event is known. However, in some circumstances, the information on cause for patients who experience the terminal event is missing; consequently, we are not able to differentiate an informative terminal event from a non‐informative terminal event. In this article, we propose a method to handle missing data regarding the cause of an informative terminal event when analyzing the semi‐competing risks data. We first consider the nonparametric estimation of the survival function for the terminal event time given missing cause‐of‐failure data via the expectation–maximization algorithm. We then develop an estimation method for semi‐competing risks data with missing cause of the terminal event, under a pre‐specified semiparametric copula model. We conduct simulation studies to investigate the performance of the proposed method. We illustrate our methodology using data from a study of early‐stage breast cancer. Copyright © 2016 John Wiley & Sons, Ltd. 相似文献

18.

Pseudo-likelihood methods for longitudinal binary data with non-ignorable missing responses and covariates

Parzen M Lipsitz SR Fitzmaurice GM Ibrahim JG Troxel A 《Statistics in medicine》2006,25(16):2784-2796

In this paper we consider longitudinal studies in which the outcome to be measured over time is binary, and the covariates of interest are categorical. In longitudinal studies it is common for the outcomes and any time-varying covariates to be missing due to missed study visits, resulting in non-monotone patterns of missingness. Moreover, the reasons for missed visits may be related to the specific values of the response and/or covariates that should have been obtained, i.e. missingness is non-ignorable. With non-monotone non-ignorable missing response and covariate data, a full likelihood approach is quite complicated, and maximum likelihood estimation can be computationally prohibitive when there are many occasions of follow-up. Furthermore, the full likelihood must be correctly specified to obtain consistent parameter estimates. We propose a pseudo-likelihood method for jointly estimating the covariate effects on the marginal probabilities of the outcomes and the parameters of the missing data mechanism. The pseudo-likelihood requires specification of the marginal distributions of the missingness indicator, outcome, and possibly missing covariates at each occasions, but avoids making assumptions about the joint distribution of the data at two or more occasions. Thus, the proposed method can be considered semi-parametric. The proposed method is an extension of the pseudo-likelihood approach in Troxel et al. to handle binary responses and possibly missing time-varying covariates. The method is illustrated using data from the Six Cities study, a longitudinal study of the health effects of air pollution. 相似文献

19.

Using a monotone single-index model to stabilize the propensity score in missing data problems and causal inference

Jing Qin Tao Yu Pengfei Li Hao Liu Baojiang Chen 《Statistics in medicine》2019,38(8):1442-1458

The augmented inverse weighting method is one of the most popular methods for estimating the mean of the response in causal inference and missing data problems. An important component of this method is the propensity score. Popular parametric models for the propensity score include the logistic, probit, and complementary log-log models. A common feature of these models is that the propensity score is a monotonic function of a linear combination of the explanatory variables. To avoid the need to choose a model, we model the propensity score via a semiparametric single-index model, in which the score is an unknown monotonic nondecreasing function of the given single index. Under this new model, the augmented inverse weighting estimator (AIWE) of the mean of the response is asymptotically linear, semiparametrically efficient, and more robust than existing estimators. Moreover, we have made a surprising observation. The inverse probability weighting and AIWEs based on a correctly specified parametric model may have worse performance than their counterparts based on a nonparametric model. A heuristic explanation of this phenomenon is provided. A real-data example is used to illustrate the proposed methods. 相似文献

20.

Performance of weighted estimating equations for longitudinal binary data with drop-outs missing at random

Preisser JS Lohman KK Rathouz PJ 《Statistics in medicine》2002,21(20):3035-3054

The generalized estimating equations (GEE) approach is commonly used to model incomplete longitudinal binary data. When drop-outs are missing at random through dependence on observed responses (MAR), GEE may give biased parameter estimates in the model for the marginal means. A weighted estimating equations approach gives consistent estimation under MAR when the drop-out mechanism is correctly specified. In this approach, observations or person-visits are weighted inversely proportional to their probability of being observed. Using a simulation study, we compare the performance of unweighted and weighted GEE in models for time-specific means of a repeated binary response with MAR drop-outs. Weighted GEE resulted in smaller finite sample bias than GEE. However, when the drop-out model was misspecified, weighted GEE sometimes performed worse than GEE. Weighted GEE with observation-level weights gave more efficient estimates than a weighted GEE procedure with cluster-level weights. 相似文献