Estimating treatment effects under untestable assumptions with nonignorable missing data |
| |
Authors: | Manuel Gomes, Michael G. Kenward, Richard Grieve, James Carpenter |
| |
Affiliation: | 1. Department of Applied Health Research, University College London, London, UK; 2. Department of Medical Statistics, LSHTM, London, UK; 3. Department of Health Services Research and Policy, LSHTM, London, UK; 4. Department of Medical Statistics, LSHTM, London, UK; MRC Clinical Trials Unit, University College London, London, UK |
| |
Abstract: | Nonignorable missing data pose key challenges for estimating treatment effects because the substantive model may not be identifiable without imposing further assumptions. For example, the Heckman selection model has been widely used for handling nonignorable missing data but requires the study to make correct assumptions about both the joint distribution of the missingness and the outcome and the validity of the exclusion restriction. Recent studies have revisited how alternative selection model approaches, for example those estimated by multiple imputation (MI) and maximum likelihood, relate to Heckman-type approaches in addressing the first hurdle. However, the extent to which these different selection models rely on the exclusion restriction assumption with nonignorable missing data is unclear. Motivated by an interventional study (REFLUX) with nonignorable missing outcome data in half of the sample, this article critically examines the role of the exclusion restriction in Heckman, MI, and full-likelihood selection models when addressing nonignorability. We explore the implications of the different methodological choices concerning the exclusion restriction for relative bias and root-mean-squared error in estimating treatment effects. We find that the relative performance of the methods differs in practically important ways according to the relevance and strength of the exclusion restriction. The full-likelihood approach is less sensitive to alternative assumptions about the exclusion restriction than Heckman-type models and appears to be an appropriate method for handling nonignorable missing data. We illustrate the implications of method choice for inference in the REFLUX study, which evaluates the effect of laparoscopic surgery on long-term quality of life for patients with gastro-oesophageal reflux disease. |
| |
Keywords: | average treatment effects; full-information maximum likelihood; Heckman model; missing not at random; multiple imputation; selection models |
|
|