

Evaluation of marker selection methods and statistical models for chronological age prediction based on DNA methylation
Institution:1. Key Laboratory of Shaanxi Province for Craniofacial Precision Medicine Research, College of Stomatology, Xi’an Jiaotong University, Xi’an 710004, China;2. Clinical Research Center of Shaanxi Province for Dental and Maxillofacial Diseases, College of Stomatology, Xi’an Jiaotong University, Xi’an 710004, China;3. College of Medicine and Forensics, Xi’an Jiaotong University Health Science Center, Xi’an 710061, China;4. Multi-Omics Innovative Research Center of Forensic Identification, Department of Forensic Genetics; School of Forensic Medicine, Southern Medical University, Guangzhou 510515, China;1. Institute of Legal Medicine, University of Münster, Röntgenstraße 23, 48149 Münster, Germany;2. Institute of Legal Medicine, University of Munich, Nußbaumstraße 26, 80336 Munich, Germany;3. Qiagen GmbH, Qiagen Str. 1, 40724 Hilden, Germany;4. Institute of Biostatistics and Clinical Research, University of Münster, Schmeddingstraße 56, 48149 Münster, Germany;1. Research Centre for Anthropology and Health (CIAS), Department of Life Sciences, University of Coimbra, Portugal;2. Centre for Functional Ecology (CEF), Laboratory of Forensic Anthropology, Department of Life Sciences, University of Coimbra, Portugal;3. National Institute of Legal Medicine and Forensic Sciences, Portugal;4. Faculty of Medicine, University of Coimbra, Portugal;1. National Engineering Laboratory for Forensic Science, Key Laboratory of Forensic Genetics of Ministry of Public Security, Beijing Engineering Research Center of Crime Scene Evidence Examination, Institute of Forensic Science, Ministry of Public Security, Beijing, China;2. CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China;3. University of Chinese Academy of Sciences, Beijing, China;4. Shanxi Medical University, Taiyuan, China
Abstract: In forensic investigation, retrieving biological information from DNA evidence is a promising field of interest. One application is estimating the age of the donor from DNA methylation. A large number of studies have focused on age prediction using the 450 K Human Methylation Beadchip, and various marker selection methods and prediction models have been considered. However, there is a lack of research evaluating different high-dimensional variable selection methods for CpG sites in combination with various models for age prediction. The aim of this study is to evaluate four variable selection methods (forward selection, LASSO, elastic net and SCAD) combined with a classical statistical model and sophisticated machine learning models, using the mean absolute deviation (MAD) and the root-mean-square error (RMSE) as performance measures. We used a publicly available 450 K data set containing 991 whole blood samples (age 19–101 years). We found that the multiple linear regression model with 16 markers selected by forward selection performed very well in age prediction (MAD = 3.76 years and RMSE = 5.01 years). By contrast, the highly advanced ultrahigh-dimensional variable selection methods and sophisticated machine learning algorithms appeared unnecessary for age prediction based on DNA methylation.
Keywords:DNA methylation  Age prediction  Forward selection  LASSO  Multiple linear regression  Machine learning
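The combination the abstract highlights (forward selection feeding a multiple linear regression, scored by MAD and RMSE) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic rather than the 450 K data set, the stopping rule is a fixed marker count, the selection criterion is the training residual sum of squares, and all function and variable names (`forward_select`, `informative`, and so on) are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Fit multiple linear regression with an intercept via least squares."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_ols(coef, X):
    """Predict from coefficients returned by fit_ols (intercept first)."""
    return coef[0] + X @ coef[1:]

def forward_select(X, y, n_markers):
    """Greedy forward selection: at each step add the column that gives the
    lowest training residual sum of squares. Stopping at a fixed number of
    markers is an assumption of this sketch."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < n_markers and remaining:
        best_j, best_rss = None, np.inf
        for j in remaining:
            cols = selected + [j]
            coef = fit_ols(X[:, cols], y)
            rss = np.sum((y - predict_ols(coef, X[:, cols])) ** 2)
            if rss < best_rss:
                best_j, best_rss = j, rss
        selected.append(best_j)
        remaining.remove(best_j)
    return selected

def mad(y_true, y_pred):
    """Mean absolute deviation between true and predicted ages."""
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root-mean-square error between true and predicted ages."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic stand-in for methylation data (NOT the study's data):
# 50 hypothetical CpG sites, of which three vary linearly with age.
rng = np.random.default_rng(0)
n_samples, n_sites = 300, 50
age = rng.uniform(19, 101, n_samples)
beta = rng.uniform(0, 1, (n_samples, n_sites))   # uninformative sites
informative = [3, 17, 42]                        # hypothetical site indices
for j, slope in zip(informative, [0.004, -0.003, 0.005]):
    beta[:, j] = 0.5 + slope * (age - 60) + rng.normal(0, 0.02, n_samples)

# Simple hold-out split: first 200 samples train, last 100 test.
X_train, X_test = beta[:200], beta[200:]
y_train, y_test = age[:200], age[200:]

selected = forward_select(X_train, y_train, n_markers=3)
coef = fit_ols(X_train[:, selected], y_train)
pred = predict_ols(coef, X_test[:, selected])
test_mad, test_rmse = mad(y_test, pred), rmse(y_test, pred)
print("selected sites:", sorted(selected))
print("MAD:", round(test_mad, 2), "RMSE:", round(test_rmse, 2))
```

On real array data the candidate pool is far larger (hundreds of thousands of CpG sites), so a pre-screening step or a penalized method such as LASSO is often used to keep the greedy search tractable; the sketch omits that step.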
This article is indexed in ScienceDirect and other databases.