A Stacked Generalization of 3D Orthogonal Deep Learning Convolutional Neural Networks for Improved Detection of White Matter Hyperintensities in 3D FLAIR Images
Authors: L Umapathy, GG Perez-Carrillo, MB Keerthivasan, JA Rosado-Toro, MI Altbach, B Winegar, C Weinkauf, A Bilgin, for the Alzheimer’s Disease Neuroimaging Initiative
Institution: From the Departments of (a) Electrical and Computer Engineering (L.U., A.B.); (b) Medical Imaging (L.U., G.G.P.-C., M.B.K., J.A.R.-T., M.I.A., B.W., A.B.); (c) Surgery (C.W.); (d) Biomedical Engineering (A.B.), University of Arizona, Tucson, Arizona
Abstract:
BACKGROUND AND PURPOSE: Accurate and reliable detection of white matter hyperintensities and their volume quantification can provide valuable clinical information to assess neurologic disease progression. In this work, a stacked generalization ensemble of orthogonal 3D convolutional neural networks, StackGen-Net, is explored for improving automated detection of white matter hyperintensities in 3D T2-FLAIR images.
MATERIALS AND METHODS: Individual convolutional neural networks in StackGen-Net were trained on 2.5D patches from orthogonal reformatting of 3D-FLAIR (n = 21) to yield white matter hyperintensity posteriors. A meta convolutional neural network was trained to learn the functional mapping from orthogonal white matter hyperintensity posteriors to the final white matter hyperintensity prediction. The impact of training data and architecture choices on white matter hyperintensity segmentation performance was systematically evaluated on a test cohort (n = 9). The segmentation performance of StackGen-Net was compared with state-of-the-art convolutional neural network techniques on an independent test cohort from the Alzheimer’s Disease Neuroimaging Initiative-3 (n = 20).
RESULTS: StackGen-Net outperformed individual convolutional neural networks in the ensemble and their combination using averaging or majority voting. In a comparison with state-of-the-art white matter hyperintensity segmentation techniques, StackGen-Net achieved a significantly higher Dice score (0.76 [SD, 0.08]), F1-lesion (0.74 [SD, 0.13]), and area under precision-recall curve (0.84 [SD, 0.09]), and the lowest absolute volume difference (13.3% [SD, 9.1%]). StackGen-Net performance in Dice scores (median = 0.74) did not significantly differ (P = .22) from interobserver (median = 0.73) variability between 2 experienced neuroradiologists. We found no significant difference (P = .15) in white matter hyperintensity lesion volumes from StackGen-Net predictions and ground truth annotations.
CONCLUSIONS: A stacked generalization of convolutional neural networks, utilizing multiplanar lesion information using 2.5D spatial context, greatly improved the segmentation performance of StackGen-Net compared with traditional ensemble techniques and some state-of-the-art deep learning models for 3D-FLAIR.
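For readers unfamiliar with two of the metrics reported in the Results, the following is a minimal sketch of how a Dice score and an absolute volume difference are commonly computed for binary segmentation masks. The function names, the flat-list mask representation, and the toy masks are assumptions made for illustration; the exact definitions used by the authors are not given in this excerpt.

```python
def dice_score(pred, truth):
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention assumed here: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def absolute_volume_difference(pred, truth):
    """Absolute lesion-volume difference as a percentage of the reference volume."""
    true_volume = sum(1 for t in truth if t)
    pred_volume = sum(1 for p in pred if p)
    if true_volume == 0:
        raise ValueError("undefined when the reference mask is empty")
    return 100.0 * abs(pred_volume - true_volume) / true_volume


# Hypothetical toy masks: 8 voxels, 4 lesion voxels in each, 3 of them shared.
truth = [0, 1, 1, 1, 0, 0, 1, 0]
pred = [0, 1, 1, 0, 0, 1, 1, 0]
print(dice_score(pred, truth))                  # 2 * 3 / (4 + 4) = 0.75
print(absolute_volume_difference(pred, truth))  # equal volumes, so 0.0
```

The toy case also shows why both metrics are reported together: the volumes match exactly (0% difference) even though one in four lesion voxels is misplaced, which only the Dice score reveals.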

White matter hyperintensities (WMHs) correspond to pathologic features of axonal degeneration, demyelination, and gliosis observed within cerebral white matter [1]. Clinically, the extent of WMHs in the brain has been associated with cognitive impairment, Alzheimer’s disease and vascular dementia, and increased risk of stroke [2,3]. The detection and quantification of WMH volumes to monitor lesion burden evolution and its correlation with clinical outcomes have been of interest in clinical research [4,5]. Although the extent of WMHs can be visually scored [6], the categoric nature of such scoring systems makes quantitative evaluation of disease progression difficult. Manually segmenting WMHs is tedious, prone to inter- and intraobserver variability, and is, in most cases, impractical. Thus, there is an increased interest in developing fast, accurate, and reliable computer-aided automated techniques for WMH segmentation.

Convolutional neural network (CNN)-based approaches have been successful in several semantic segmentation tasks in medical imaging [7]. Recent works have proposed using deep learning–based methods for segmenting WMHs using 2D-FLAIR images [8-11]. More recently, a WMH segmentation challenge [12] was also organized (http://wmh.isi.uu.nl/) to facilitate comparison of automated segmentation of WMHs of presumed vascular origin in 2D multislice T2-FLAIR images. Architectures that used an ensemble of separately trained CNNs showed promising results in this challenge, with 3 of the top 5 winners using ensemble-based techniques [12].

Conventional 2D-FLAIR images are typically acquired with thick slices (3–4 mm) and possible slice gaps. Partial volume effects from a thick slice are likely to affect the detection of smaller lesions, both in-plane and out-of-plane. 3D-FLAIR images, with isotropic resolution, have been shown to achieve higher resolution and contrast-to-noise ratio [13] and have shown promising results in MS lesion detection using 3D CNNs [14]. Additionally, the isotropic resolution enables viewing and evaluation of the images in multiple planes. This multiplanar reformatting of 3D-FLAIR without the use of interpolating kernels is only possible due to the isotropic nature of the acquisition. Network architectures that use information from the 3 orthogonal views have been explored in recent works for CNN-based segmentation of 3D MR imaging data [15]. The use of data from multiple planes allows more spatial context during training without the computational burden associated with full 3D training [16]. The use of 3 orthogonal views simultaneously mirrors how humans approach this segmentation task.

Ensembles of CNNs have been shown to average away the variances in the solution and the choice of model- and configuration-specific behaviors of CNNs [17]. Traditionally, the solutions from these separately trained CNNs are combined by averaging or using a majority consensus. In this work, we propose the use of a stacked generalization framework (StackGen-Net) for combining multiplanar lesion information from 3D CNN ensembles to improve the detection of WMH lesions in 3D-FLAIR. A stacked generalization [18] framework learns to combine solutions from individual CNNs in the ensemble. We systematically evaluated the performance of this framework and compared it with traditional ensemble techniques, such as averaging or majority voting, and state-of-the-art deep learning techniques.
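The difference between averaging, majority voting, and stacked generalization can be illustrated with a small sketch. It assumes each voxel comes with three lesion posteriors, one per orthogonal view. The paper's meta-learner is a CNN; the logistic regression used here is a deliberately simple stand-in, and all function names, hyperparameters, and toy values are hypothetical.

```python
import math


def average_vote(posteriors, threshold=0.5):
    """Label a voxel as lesion if the mean posterior reaches the threshold."""
    return int(sum(posteriors) / len(posteriors) >= threshold)


def majority_vote(posteriors, threshold=0.5):
    """Threshold each posterior first, then take the majority label."""
    votes = sum(1 for p in posteriors if p >= threshold)
    return int(votes * 2 > len(posteriors))


def train_meta_learner(samples, labels, epochs=5000, lr=0.5):
    """Fit logistic-regression weights on the posteriors by batch gradient descent.

    Stand-in for the meta CNN: it learns how much to trust each view.
    """
    n_inputs = len(samples[0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_inputs
        grad_b = 0.0
        for x, y in zip(samples, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            error = 1.0 / (1.0 + math.exp(-z)) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias


def meta_predict(posteriors, weights, bias, threshold=0.5):
    """Combine the posteriors with the learned weights and threshold the result."""
    z = bias + sum(w * p for w, p in zip(weights, posteriors))
    return int(1.0 / (1.0 + math.exp(-z)) >= threshold)


# Hypothetical posteriors (view 1, view 2, view 3) for six voxels.
# In this toy data only the first view is consistently reliable.
samples = [
    (0.90, 0.20, 0.30),
    (0.80, 0.70, 0.60),
    (0.10, 0.80, 0.70),
    (0.20, 0.30, 0.10),
    (0.85, 0.40, 0.90),
    (0.15, 0.60, 0.20),
]
labels = [1, 1, 0, 0, 1, 0]

weights, bias = train_meta_learner(samples, labels)
for x, y in zip(samples, labels):
    print(y, average_vote(x), majority_vote(x), meta_predict(x, weights, bias))
```

Averaging and majority voting treat every view as equally trustworthy, so they mislabel the first and third voxels, where two weak views outvote the reliable one. The learned combination can weight the views unequally. The sketch is evaluated on its own training voxels purely to show the mechanism and says nothing about the performance reported in the paper.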