Similar literature
Found 20 similar articles (search time: 15 ms)
1.
Image registration, the process of aligning two or more images, is the core technique of many (semi-)automatic medical image analysis tasks. Recent studies have shown that deep learning methods, notably convolutional neural networks (ConvNets), can be used for image registration. Thus far, training of ConvNets for registration has been supervised using predefined example registrations. However, obtaining example registrations is not trivial. To circumvent the need for predefined examples, and thereby to increase the convenience of training ConvNets for image registration, we propose the Deep Learning Image Registration (DLIR) framework for unsupervised affine and deformable image registration. In the DLIR framework, ConvNets are trained for image registration by exploiting image similarity, analogous to conventional intensity-based image registration. After a ConvNet has been trained with the DLIR framework, it can be used to register pairs of unseen images in one shot. We propose flexible ConvNet designs for affine image registration and for deformable image registration. By stacking several of these ConvNets into a larger architecture, we are able to perform coarse-to-fine image registration. We show for registration of cardiac cine MRI and registration of chest CT that the performance of the DLIR framework is comparable to conventional image registration while being several orders of magnitude faster.

2.
The Transformer, one of the latest technological advances in deep learning, has gained prevalence in natural language processing and computer vision. Since medical imaging bears some resemblance to computer vision, it is natural to inquire about the status quo of Transformers in medical imaging and ask the question: can Transformer models transform medical imaging? In this paper, we attempt to respond to this inquiry. After a brief introduction to the fundamentals of Transformers, especially in comparison with convolutional neural networks (CNNs), and a highlight of the key defining properties that characterize Transformers, we offer a comprehensive review of the state-of-the-art Transformer-based approaches for medical imaging and present current research progress in the areas of medical image segmentation, recognition, detection, registration, reconstruction, enhancement, etc. In particular, what distinguishes our review is its organization based on the Transformer’s key defining properties, which are mostly derived from comparing the Transformer and CNN, and its type of architecture, which specifies the manner in which the Transformer and CNN are combined, all helping readers to best understand the rationale behind the reviewed approaches. We conclude with discussions of future perspectives.

3.
Vision Transformers have recently emerged as a competitive architecture in image classification. The tremendous popularity of this model and its variants comes from its high performance and its ability to produce interpretable predictions. However, both of these characteristics remain to be assessed in depth on retinal images. This study proposes a thorough performance evaluation of several Transformers compared to traditional Convolutional Neural Network (CNN) models for retinal disease classification. Special attention is given to multi-modality imaging (fundus and OCT) and generalization to external data. In addition, we propose a novel mechanism to generate interpretable predictions via attribution maps. Existing attribution methods from Transformer models have the disadvantage of producing low-resolution heatmaps. Our contribution, called Focused Attention, uses iterative conditional patch resampling to tackle this issue. By means of a survey involving four retinal specialists, we validated both the superior interpretability of Vision Transformers compared to the attribution maps produced from CNNs and the relevance of Focused Attention as a lesion detector.

4.
Following unprecedented success on natural language tasks, Transformers have been successfully applied to several computer vision problems, achieving state-of-the-art results and prompting researchers to reconsider the supremacy of convolutional neural networks (CNNs) as de facto operators. Capitalizing on these advances in computer vision, the medical imaging field has also witnessed growing interest in Transformers, which can capture global context, in contrast to CNNs with local receptive fields. Inspired by this transition, in this survey, we attempt to provide a comprehensive review of the applications of Transformers in medical imaging covering various aspects, ranging from recently proposed architectural designs to unsolved issues. Specifically, we survey the use of Transformers in medical image segmentation, detection, classification, restoration, synthesis, registration, clinical report generation, and other tasks. In particular, for each of these applications, we develop a taxonomy, identify application-specific challenges, provide insights to solve them, and highlight recent trends. Further, we provide a critical discussion of the field’s current state as a whole, identifying key challenges and open problems and outlining promising future directions. We hope this survey will ignite further interest in the community and provide researchers with an up-to-date reference regarding applications of Transformer models in medical imaging. Finally, to cope with the rapid development in this field, we intend to regularly update the relevant latest papers and their open-source implementations at https://github.com/fahadshamshad/awesome-transformers-in-medical-imaging.

5.
In recent years, deep learning technology has shown superior performance in different fields of medical image analysis. Several deep learning architectures have been proposed and used for computational pathology classification, segmentation, and detection tasks. Due to their simple, modular structure, most downstream applications still use ResNet and its variants as the backbone network. This paper proposes a modular group attention block that can capture feature dependencies in medical images in two independent dimensions: channel and space. By stacking these group attention blocks in ResNet style, we obtain a new ResNet variant called ResGANet. The stacked ResGANet architecture has 1.51–3.47 times fewer parameters than the original ResNet and can be directly used for downstream medical image segmentation tasks. Extensive experiments show that the proposed ResGANet is superior to state-of-the-art backbone models in medical image classification tasks. Applying it to different segmentation networks can improve the baseline model in medical image segmentation tasks without changing the network architecture. We hope that this work provides a promising method for enhancing the feature representation of convolutional neural networks (CNNs) in the future.

6.
In the past few years, convolutional neural networks (CNNs) have been proven powerful in extracting image features crucial for medical image registration. However, challenging applications and recent advances in computer vision suggest that CNNs are limited in their ability to understand the spatial correspondence between features, which is at the core of image registration. The issue is further exacerbated when it comes to multi-modal image registration, where the appearances of input images can differ significantly. This paper presents a novel cross-modal attention mechanism for correlating features extracted from the multi-modal input images and mapping such correlation to the image registration transformation. To efficiently train the developed network, a contrastive learning-based pre-training method is also proposed to aid the network in extracting high-level features across the input modalities for the subsequent cross-modal attention learning. We validated the proposed method on transrectal ultrasound (TRUS) to magnetic resonance (MR) registration, a clinically important procedure that benefits prostate cancer biopsy. Our experimental results demonstrate that for MR-TRUS registration, a deep neural network embedded with the cross-modal attention block outperforms other advanced CNN-based networks ten times its size. We also incorporated visualization techniques to improve the interpretability of our network, which helps bring insights into deep learning-based image registration methods. The source code of our work is available at https://github.com/DIAL-RPI/Attention-Reg.

7.
Image registration is a fundamental task in medical image analysis. Recently, many deep learning-based image registration methods have been extensively investigated due to their comparable performance with state-of-the-art classical approaches despite their ultra-fast computational time. However, the existing deep learning methods still have limitations in preserving the original topology during deformation with registration vector fields. To address this issue, here we present a cycle-consistent deformable image registration method, dubbed CycleMorph. The cycle consistency enhances image registration performance by providing an implicit regularization to preserve topology during the deformation. The proposed method is so flexible that it can be applied to both 2D and 3D registration problems for various applications, and can be easily extended to a multi-scale implementation to deal with the memory issues in large volume registration. Experimental results on various datasets from medical and non-medical applications demonstrate that the proposed method provides effective and accurate registration on diverse image pairs within a few seconds. Qualitative and quantitative evaluations on deformation fields also verify the effectiveness of the cycle consistency of the proposed method.

8.
Deep learning methods provide state-of-the-art performance for supervised learning-based medical image analysis. However, it is essential that trained models extract clinically relevant features for downstream tasks, as otherwise shortcut learning and generalization issues can occur. Furthermore, in the medical field, trustworthiness and transparency of current deep learning systems are much-desired properties. In this paper, we propose an interpretability-guided inductive bias approach enforcing that learned features yield more distinctive and spatially consistent saliency maps for different class labels of trained models, leading to improved model performance. We achieve our objectives by incorporating a class-distinctiveness loss and a spatial-consistency regularization loss term. Experimental results for medical image classification and segmentation tasks show that our proposed approach outperforms conventional methods, while yielding saliency maps in higher agreement with clinical experts. Additionally, we show how information from unlabeled images can be used to further boost performance. In summary, the proposed approach is modular, applicable to existing network architectures used for medical imaging applications, and yields improved learning rates, model robustness, and model interpretability.

9.
Incorporating shape information is essential for the delineation of many organs and anatomical structures in medical images. While previous work has mainly focused on parametric spatial transformations applied to reference template shapes, in this paper, we address the Bayesian inference of parametric shape models for segmenting medical images with the objective of providing interpretable results. The proposed framework defines a likelihood appearance probability and a prior label probability based on a generic shape function through a logistic function. A reference length parameter defined in the sigmoid controls the trade-off between shape and appearance information. The inference of shape parameters is performed within an Expectation-Maximisation approach in which a Gauss-Newton optimization stage provides an approximation of the posterior probability of the shape parameters. This framework is applied to the segmentation of cochlear structures from clinical CT images constrained by a 10-parameter shape model. It is evaluated on three different datasets, one of which includes more than 200 patient images. The results show performances comparable to supervised methods and better than previously proposed unsupervised ones. It also enables an analysis of parameter distributions and the quantification of segmentation uncertainty, including the effect of the shape model.

10.
In this paper, we present two registration algorithms for the spatio-temporal alignment of cardiac MR image sequences. Both algorithms have the ability to correct spatial misalignment between the image sequences caused by global and local shape differences. In addition, they have the ability to correct temporal misalignment caused by differences in the length of the cardiac cycles and by differences in the dynamic properties of the hearts. The algorithms use a 4D deformable transformation model which is separated into spatial and temporal components. The first registration algorithm optimizes the spatial and temporal transformation models simultaneously, while the second registration algorithm optimizes the temporal transformation component before optimizing the spatial component. For the evaluation of the spatio-temporal registration methods, we have acquired 15 MR image sequences from healthy volunteers. The registration methods were quantitatively evaluated by measuring the overlap and surface distance of anatomical regions and qualitatively by visual inspection. The results demonstrate that a significant improvement in the alignment of the image sequences is achieved by the use of the deformable spatio-temporal transformation model. We demonstrate the use of the method for the construction of a probabilistic MR cardiac atlas representing the anatomy and function of a healthy heart.

11.
Monitoring the quality of image segmentation is key to many clinical applications. This quality assessment can be carried out by a human expert when the number of cases is limited. However, it becomes onerous when dealing with large image databases, so partial automation of this process is preferable. Previous works have proposed both supervised and unsupervised methods for the automated control of image segmentations. The former assume the availability of a subset of trusted segmented images on which supervised learning is performed, while the latter do not. In this paper, we introduce a novel unsupervised approach for quality assessment of segmented images based on a generic probabilistic model. Quality estimates are produced by comparing each segmentation with the output of a probabilistic segmentation model that relies on intensity and smoothness assumptions. Ranking cases with respect to these two assumptions allows the most challenging cases in a dataset to be detected. Furthermore, unlike prior work, our approach enables possible segmentation errors to be localized within an image. The proposed generic probabilistic segmentation method combines intensity mixture distributions with spatial regularization prior models whose parameters are estimated with variational Bayesian techniques. We introduce a novel smoothness prior based on the penalization of the derivatives of label maps, which allows an automatic estimation of its hyperparameter in a fully data-driven way. Extensive evaluation of quality control on medical and COCO datasets is conducted, showing the ability to isolate atypical segmentations automatically and to predict, in some cases, the performance of segmentation algorithms.

12.
13.
The first step in the spatial normalization of brain images is usually to determine the affine transformation that best maps the image to a template image in a standard space. We have developed a rapid and automatic method for performing this registration, which uses a Bayesian scheme to incorporate prior knowledge of the variability in the shape and size of heads. We compared affine registrations with and without incorporating the prior knowledge. We found that the affine transformations derived using the Bayesian scheme are much more robust and that the rate of convergence is greater.

14.
Fundus camera imaging of the retina is widely used to diagnose and manage ophthalmologic disorders including diabetic retinopathy, glaucoma, and age-related macular degeneration. Retinal images typically have a limited field of view, and multiple images can be joined together using an image registration technique to form a montage with a larger field of view. A variety of methods for retinal image registration have been proposed, but evaluating such methods objectively is difficult due to the lack of a reference standard for the true alignment of the individual images that make up the montage. A method of generating simulated retinal images by modeling the geometric distortions due to the eye geometry and the image acquisition process is described in this paper. We also present a validation process that can be used for any retinal image registration method by tracing through the distortion path and assessing the geometric misalignment in the coordinate system of the reference standard. The proposed method can be used to perform an accuracy evaluation over the whole image, so that distortion in the non-overlapping regions of the montage components can be easily assessed. We demonstrate the technique by generating test image sets with a variety of overlap conditions and compare the accuracy of several retinal image registration models.

15.
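An automatic medical image registration method that combines image segmentation with mutual information is proposed. The images are first preprocessed using thresholding and mathematical morphology and then segmented with the k-means method, after which they are registered using mutual information-based Powell optimization. Applied to the registration of clinical magnetic resonance imaging (MRI) and positron emission tomography (PET) images, the method gives fairly satisfactory results.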
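
The mutual information similarity measure that drives the registration described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `mutual_information`, the flattened-list image representation, and the `bins` and `max_value` parameters are all assumptions made for the example. An optimizer such as Powell's method would repeatedly evaluate a measure like this while adjusting the transformation parameters.

```python
import math
from collections import Counter


def mutual_information(a, b, bins=8, max_value=256):
    """Mutual information (in nats) of two equally sized, flattened images.

    `a` and `b` are sequences of intensities in [0, max_value).
    The binning scheme is an illustrative assumption.
    """
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and of equal size")
    n = len(a)
    # Quantize intensities into `bins` equally wide bins.
    qa = [min(int(v * bins / max_value), bins - 1) for v in a]
    qb = [min(int(v * bins / max_value), bins - 1) for v in b]
    joint = Counter(zip(qa, qb))   # joint histogram
    count_a = Counter(qa)          # marginal histogram of image a
    count_b = Counter(qb)          # marginal histogram of image b
    mi = 0.0
    for (i, j), c in joint.items():
        # p(i,j) * log( p(i,j) / (p(i) * p(j)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_a[i] * count_b[j]))
    return mi


# Identical images share all their information; unrelated ones share none.
same = mutual_information([0, 0, 255, 255], [0, 0, 255, 255])
unrelated = mutual_information([0, 0, 255, 255], [0, 255, 0, 255])
```

Mutual information depends only on the joint intensity statistics, not on the intensities being equal, which is why it suits multi-modal pairs such as MRI and PET.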

16.
Cochlear implants (CIs) are used to treat subjects with hearing loss. In a CI surgery, an electrode array is inserted into the cochlea to stimulate auditory nerves. After surgery, CIs need to be programmed. Studies have shown that the cochlea-electrode spatial relationship derived from medical images can guide CI programming and lead to significant improvement in hearing outcomes. We have developed a series of algorithms to segment the inner ear anatomy and localize the electrodes. However, because clinical head CT images are acquired with different protocols, the field of view and orientation of the image volumes vary greatly. As a consequence, visual inspection and manual image registration to an atlas image are needed to document their content and to initialize the intensity-based registration algorithms used in our processing pipeline. For large-scale evaluation and deployment of our methods, these steps need to be automated. In this article, we propose to achieve this with a deep convolutional neural network (CNN) that can be trained end-to-end to classify a head CT image in terms of its content and to localize landmarks. The detected landmarks can then be used to estimate a point-based registration with the atlas image, in which the positions of the same landmark set are known. We achieve 99.5% classification accuracy and an average localization error of 3.45 mm for 7 landmarks located around each inner ear. This is better than what was achieved with earlier methods we have proposed for the same tasks.
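
The point-based registration step mentioned above, estimating a transformation from corresponding landmarks, can be illustrated with a small least-squares sketch. It is a simplification under stated assumptions: it works in 2D rather than on 3D CT volumes, it fits a rigid transformation (rotation plus translation) only, and the function names `fit_rigid_2d` and `apply_rigid_2d` are hypothetical. The abstract does not say which transformation model the authors use.

```python
import math


def fit_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping src onto dst.

    `src` and `dst` are equally long lists of corresponding (x, y) landmarks.
    """
    n = len(src)
    if n != len(dst) or n < 2:
        raise ValueError("need at least two corresponding landmark pairs")
    # Centroids of both landmark sets.
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n
    cdy = sum(p[1] for p in dst) / n
    # Accumulate dot and cross products of the centered landmarks.
    s_dot = 0.0
    s_cross = 0.0
    for (xs, ys), (xd, yd) in zip(src, dst):
        ax, ay = xs - csx, ys - csy
        bx, by = xd - cdx, yd - cdy
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)  # optimal rotation angle
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated source centroid onto the target centroid.
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    return theta, (tx, ty)


def apply_rigid_2d(points, theta, t):
    """Rotate each point by `theta` about the origin, then translate by `t`."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in points]


# Landmarks rotated by 90 degrees and shifted by (2, 3).
atlas = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
detected = [(2.0, 3.0), (2.0, 4.0), (1.0, 3.0)]
theta, t = fit_rigid_2d(atlas, detected)
mapped = apply_rigid_2d(atlas, theta, t)
```

In 3D the same idea is usually solved with a singular value decomposition of the landmark cross-covariance matrix; the 2D case has the closed form used here.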

17.
Discrete optimisation strategies have a number of advantages over their continuous counterparts for deformable registration of medical images. For example: it is not necessary to compute derivatives of the similarity term; dense sampling of the search space reduces the risk of becoming trapped in local optima; and (in principle) an optimum can be found without resorting to iterative coarse-to-fine warping strategies. However, the large complexity of high-dimensional medical data renders a direct voxel-wise estimation of deformation vectors impractical. For this reason, previous work on medical image registration using graphical models has largely relied on using a parameterised deformation model and on the use of iterative coarse-to-fine optimisation schemes. In this paper, we propose an approach that enables accurate voxel-wise deformable registration of high-resolution 3D images without the need for intermediate image warping or a multi-resolution scheme. This is achieved by representing the image domain as multiple comprehensive supervoxel layers and making use of the full marginal distribution of all probable displacement vectors after inferring regularity of the deformations using belief propagation. The optimisation acts on the coarse scale representation of supervoxels, which provides sufficient spatial context and is robust to noise in low contrast areas. Minimum spanning trees, which connect neighbouring supervoxels, are employed to model pair-wise deformation dependencies. The optimal displacement for each voxel is calculated by considering the probabilities for all displacements over all overlapping supervoxel graphs and subsequently seeking the mode of this distribution. We demonstrate the applicability of this concept for two challenging applications: first, for intra-patient motion estimation in lung CT scans; and second, for atlas-based segmentation propagation of MRI brain scans. For lung registration, the voxel-wise mode of displacements is found using the mean-shift algorithm, which enables us to determine continuous valued sub-voxel motion vectors. Finding the mode of brain segmentation labels is performed using a voxel-wise majority voting weighted by the displacement uncertainty estimates. Our experimental results show significant improvements in registration accuracy when using the additional information provided by the registration uncertainty estimates. The multi-layer approach enables fusion of multiple complementary proposals, extending the popular fusion approaches from multi-image registration to probabilistic one-to-one image registration.
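
The weighted majority voting used above to pick a segmentation label for a voxel can be sketched in a few lines. This is only an illustration of the voting rule: the function name `weighted_majority_vote` is hypothetical, and treating each weight as a certainty score (for example, the probability of the displacement that proposed the label) is an assumption, since the abstract does not specify how the uncertainty estimates are turned into weights.

```python
from collections import defaultdict


def weighted_majority_vote(labels, weights):
    """Return the label with the largest total weight.

    `labels[k]` is the label proposed by candidate k for one voxel and
    `weights[k]` is the (assumed) certainty of that candidate.
    Ties go to the label that was proposed first.
    """
    if len(labels) != len(weights) or not labels:
        raise ValueError("labels and weights must be non-empty and equally long")
    totals = defaultdict(float)
    for label, weight in zip(labels, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# One confident proposal can outvote two uncertain ones.
confident_wins = weighted_majority_vote([1, 2, 2], [0.9, 0.3, 0.3])
plain_majority = weighted_majority_vote([1, 2, 2], [1.0, 1.0, 1.0])
```

With equal weights the rule reduces to ordinary majority voting, so the weights only matter where the proposals disagree and differ in certainty.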

18.
Optical coherence tomography (OCT) of the macula has become increasingly important in the investigation of retinal pathology. However, deformable image registration, which is used for aligning subjects for pairwise comparisons, population averaging, and atlas label transfer, has not been well developed and demonstrated on OCT images. In this paper, we present a deformable image registration approach designed specifically for macular OCT images. The approach begins with an initial translation to align the fovea of each subject, followed by a linear rescaling to align the top and bottom retinal boundaries. Finally, the layers within the retina are aligned by a deformable registration using one-dimensional radial basis functions. The algorithm was validated using manual delineations of retinal layers in OCT images from a cohort consisting of healthy controls and patients diagnosed with multiple sclerosis (MS). We show that the algorithm overcomes the shortcomings of existing generic registration methods, which cannot be readily applied to OCT images. A successful deformable image registration algorithm for macular OCT opens up a variety of population-based analysis techniques that are regularly used in other imaging modalities, such as spatial normalization, statistical atlas creation, and voxel-based morphometry. Examples of these applications are provided to demonstrate the potential benefits such techniques can have on our understanding of retinal disease. In particular, included is a pilot study of localized volumetric changes between healthy controls and MS patients using the proposed registration algorithm. OCIS codes: (100.0100) Image processing, (170.4470) Ophthalmology, (170.4500) Optical coherence tomography

19.
For simultaneous positron-emission-tomography and magnetic-resonance-imaging (PET-MRI) systems, while early methods relied on independently reconstructing PET and MRI images, recent works have demonstrated improvement in image reconstructions of both PET and MRI using joint reconstruction methods. The current state-of-the-art joint reconstruction priors rely on fine-scale PET-MRI dependencies through the image gradients at corresponding spatial locations in the PET and MRI images. In the general context of image restoration, compared to gradient-based models, patch-based models (e.g., sparse dictionaries) have demonstrated better performance by modeling image texture better. Thus, we propose a novel joint PET-MRI patch-based dictionary prior that learns inter-modality higher-order dependencies together with intra-modality textural patterns in the images. We model the joint-dictionary prior as a Markov random field and propose a novel Bayesian framework for joint reconstruction of PET and accelerated-MRI images, using expectation maximization for inference. We evaluate all methods on simulated brain datasets as well as on in vivo datasets. We compare our joint dictionary prior with the recently proposed joint priors based on image gradients, as well as independently applied patch-based priors. Our method demonstrates qualitative and quantitative improvement over the state of the art in both PET and MRI reconstructions.

20.