Non-linear missing data imputation for healthcare data via index-aware autoencoders |
| |
Authors: | Kabir, Sadaf; Farrokhvar, Leily |
| |
Institution: | 1. Department of Industrial and Management Systems Engineering, West Virginia University, 401 Evansdale Dr, Morgantown, WV, 26505, USA; 2. Department of Systems and Operations Management, California State University Northridge, 18111 Nordhoff St, Northridge, CA, 91330, USA |
| |
Abstract: | The availability of data in the healthcare domain provides great opportunities for the discovery of new or hidden patterns in medical data, which can eventually lead to improved clinical decision making. Predictive models play a crucial role in extracting this unknown information from data. However, medical data often contain missing values that can degrade the performance of predictive models. Autoencoder models have been widely used as non-linear functions for the imputation of missing data in fields such as computer vision, transportation, and finance. In this study, we assess the shortcomings of autoencoder models for data imputation and propose modified models to improve imputation performance. To evaluate the approach, we compare the proposed model with five well-known imputation techniques on six medical datasets, using five classification methods. Through extensive experiments, we demonstrate that the proposed non-linear imputation model outperforms the other models at all missing ratios and leads to the highest disease classification accuracy on all datasets. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|