Factor analysis for survival time prediction with informative censoring and diverse covariates
Authors:Shannon McCurdy, Annette Molinaro, Lior Pachter
Affiliation:
1. California Institute for Quantitative Biosciences, University of California, Berkeley, California
2. Department of Neurological Surgery, University of California, San Francisco, California; Division of Epidemiology and Biostatistics, University of California, San Francisco, California
3. Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California; Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California

Abstract:Fulfilling the promise of precision medicine requires accurately and precisely classifying disease states. For cancer, this includes prediction of survival time from a surfeit of covariates. Such data present an opportunity for improved prediction, but also a challenge due to high dimensionality. Furthermore, disease populations can be heterogeneous. Integrative modeling is sensible, as the underlying hypothesis is that joint analysis of multiple covariates provides greater explanatory power than separate analyses. We propose an integrative latent variable model that combines factor analysis for various data types and an exponential proportional hazards (EPH) model for continuous survival time with informative censoring. The factor and EPH models are connected through low-dimensional latent variables that can be interpreted and visualized to identify subpopulations. We use this model to predict survival time. We demonstrate this model's utility in simulation and on four Cancer Genome Atlas datasets: diffuse lower-grade glioma, glioblastoma multiforme, lung adenocarcinoma, and lung squamous cell carcinoma. These datasets have small sample sizes, high-dimensional diverse covariates, and high censorship rates. We compare the predictions from our model to those of three alternative models. Our model outperforms the alternatives in simulation and is competitive on real datasets. Furthermore, the low-dimensional visualization for diffuse lower-grade glioma displays known subpopulations.
Keywords:diffuse lower-grade glioma; exponential proportional hazards; factor analysis; glioblastoma multiforme; informative censoring; integrative models; latent variables; lung adenocarcinoma; lung squamous cell carcinoma