期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

Fully convolutional networks (FCNs), including UNet and VNet, are widely-used network architectures for semantic segmentation in recent studies. However, conventional FCN is typically trained by the cross-entropy or Dice loss, which only calculates the error between predictions and ground-truth labels for pixels individually. This often results in non-smooth neighborhoods in the predicted segmentation. This problem becomes more serious in CT prostate segmentation as CT images are usually of low tissue contrast. To address this problem, we propose a two-stage framework, with the first stage to quickly localize the prostate region, and the second stage to precisely segment the prostate by a multi-task UNet architecture. We introduce a novel online metric learning module through voxel-wise sampling in the multi-task network. Therefore, the proposed network has a dual-branch architecture that tackles two tasks: (1) a segmentation sub-network aiming to generate the prostate segmentation, and (2) a voxel-metric learning sub-network aiming to improve the quality of the learned feature space supervised by a metric loss. Specifically, the voxel-metric learning sub-network samples tuples (including triplets and pairs) in voxel-level through the intermediate feature maps. Unlike conventional deep metric learning methods that generate triplets or pairs in image-level before the training phase, our proposed voxel-wise tuples are sampled in an online manner and operated in an end-to-end fashion via multi-task learning. To evaluate the proposed method, we implement extensive experiments on a real CT image dataset consisting 339 patients. The ablation studies show that our method can effectively learn more representative voxel-level features compared with the conventional learning methods with cross-entropy or Dice loss. And the comparisons show that the proposed method outperforms the state-of-the-art methods by a reasonable margin. 相似文献

Self-supervised driven consistency training for annotation efficient histopathology image analysis

One model is all you need: Multi-task learning enables simultaneous histology image segmentation and classification

Training a neural network with a large labeled dataset is still a dominant paradigm in computational histopathology. However, obtaining such exhaustive manual annotations is often expensive, laborious, and prone to inter and intra-observer variability. While recent self-supervised and semi-supervised methods can alleviate this need by learning unsupervised feature representations, they still struggle to generalize well to downstream tasks when the number of labeled instances is small. In this work, we overcome this challenge by leveraging both task-agnostic and task-specific unlabeled data based on two novel strategies: (i) a self-supervised pretext task that harnesses the underlying multi-resolution contextual cues in histology whole-slide images to learn a powerful supervisory signal for unsupervised representation learning; (ii) a new teacher-student semi-supervised consistency paradigm that learns to effectively transfer the pretrained representations to downstream tasks based on prediction consistency with the task-specific unlabeled data.We carry out extensive validation experiments on three histopathology benchmark datasets across two classification and one regression based tasks, i.e., tumor metastasis detection, tissue type classification, and tumor cellularity quantification. Under limited-label data, the proposed method yields tangible improvements, which is close to or even outperforming other state-of-the-art self-supervised and supervised baselines. Furthermore, we empirically show that the idea of bootstrapping the self-supervised pretrained features is an effective way to improve the task-specific semi-supervised learning on standard benchmarks. Code and pretrained models are made available at: https://github.com/srinidhiPY/SSL_CR_Histo. 相似文献

Thyroid nodule segmentation and classification in ultrasound images through intra- and inter-task consistent learning

The recent surge in performance for image analysis of digitised pathology slides can largely be attributed to the advances in deep learning. Deep models can be used to initially localise various structures in the tissue and hence facilitate the extraction of interpretable features for biomarker discovery. However, these models are typically trained for a single task and therefore scale poorly as we wish to adapt the model for an increasing number of different tasks. Also, supervised deep learning models are very data hungry and therefore rely on large amounts of training data to perform well. In this paper, we present a multi-task learning approach for segmentation and classification of nuclei, glands, lumina and different tissue regions that leverages data from multiple independent data sources. While ensuring that our tasks are aligned by the same tissue type and resolution, we enable meaningful simultaneous prediction with a single network. As a result of feature sharing, we also show that the learned representation can be used to improve the performance of additional tasks via transfer learning, including nuclear classification and signet ring cell detection. As part of this work, we train our developed Cerberus model on a huge amount of data, consisting of over 600 thousand objects for segmentation and 440 thousand patches for classification. We use our approach to process 599 colorectal whole-slide images from TCGA, where we localise 377 million, 900 thousand and 2.1 million nuclei, glands and lumina respectively. We make this resource available to remove a major barrier in the development of explainable models for computational pathology. 相似文献

Context-guided fully convolutional networks for joint craniomaxillofacial bone segmentation and landmark digitization

Thyroid nodule segmentation and classification in ultrasound images are two essential but challenging tasks for computer-aided diagnosis of thyroid nodules. Since these two tasks are inherently related to each other and sharing some common features, solving them jointly with multi-task leaning is a promising direction. However, both previous studies and our experimental results confirm the problem of inconsistent predictions among these related tasks. In this paper, we summarize two types of task inconsistency according to the relationship among different tasks: intra-task inconsistency between homogeneous tasks (e.g., both tasks are pixel-wise segmentation tasks); and inter-task inconsistency between heterogeneous tasks (e.g., pixel-wise segmentation task and categorical classification task). To address the task inconsistency problems, we propose intra- and inter-task consistent learning on top of the designed multi-stage and multi-task learning network to enforce the network learn consistent predictions for all the tasks during network training. Our experimental results based on a large clinical thyroid ultrasound image dataset indicate that the proposed intra- and inter-task consistent learning can effectively eliminate both types of task inconsistency and thus improve the performance of all tasks for thyroid nodule segmentation and classification. 相似文献

《Medical image analysis》2020

Cone-beam computed tomography (CBCT) scans are commonly used in diagnosing and planning surgical or orthodontic treatment to correct craniomaxillofacial (CMF) deformities. Based on CBCT images, it is clinically essential to generate an accurate 3D model of CMF structures (e.g., midface, and mandible) and digitize anatomical landmarks. This process often involves two tasks, i.e., bone segmentation and anatomical landmark digitization. Because landmarks usually lie on the boundaries of segmented bone regions, the tasks of bone segmentation and landmark digitization could be highly associated. Also, the spatial context information (e.g., displacements from voxels to landmarks) in CBCT images is intuitively important for accurately indicating the spatial association between voxels and landmarks. However, most of the existing studies simply treat bone segmentation and landmark digitization as two standalone tasks without considering their inherent relationship, and rarely take advantage of the spatial context information contained in CBCT images. To address these issues, we propose a Joint bone Segmentation and landmark Digitization (JSD) framework via context-guided fully convolutional networks (FCNs). Specifically, we first utilize displacement maps to model the spatial context information in CBCT images, where each element in the displacement map denotes the displacement from a voxel to a particular landmark. An FCN is learned to construct the mapping from the input image to its corresponding displacement maps. Using the learned displacement maps as guidance, we further develop a multi-task FCN model to perform bone segmentation and landmark digitization jointly. We validate the proposed JSD method on 107 subjects, and the experimental results demonstrate that our method is superior to the state-of-the-art approaches in both tasks of bone segmentation and landmark digitization. 相似文献

Unsupervised anomaly localization in high-resolution breast scans using deep pluralistic image completion

Histogram of Oriented Gradients meet deep learning: A novel multi-task deep network for 2D surgical image semantic segmentation

Automated tumor detection in Digital Breast Tomosynthesis (DBT) is a difficult task due to natural tumor rarity, breast tissue variability, and high resolution. Given the scarcity of abnormal images and the abundance of normal images for this problem, an anomaly detection/localization approach could be well-suited. However, most anomaly localization research in machine learning focuses on non-medical datasets, and we find that these methods fall short when adapted to medical imaging datasets. The problem is alleviated when we solve the task from the image completion perspective, in which the presence of anomalies can be indicated by a discrepancy between the original appearance and its auto-completion conditioned on the surroundings. However, there are often many valid normal completions given the same surroundings, especially in the DBT dataset, making this evaluation criterion less precise. To address such an issue, we consider pluralistic image completion by exploring the distribution of possible completions instead of generating fixed predictions. This is achieved through our novel application of spatial dropout on the completion network during inference time only, which requires no additional training cost and is effective at generating diverse completions. We further propose minimum completion distance (MCD), a new metric for detecting anomalies, thanks to these stochastic completions. We provide theoretical as well as empirical support for the superiority over existing methods of using the proposed method for anomaly localization. On the DBT dataset, our model outperforms other state-of-the-art methods by at least 10% AUROC for pixel-level detection. 相似文献

We present our novel deep multi-task learning method for medical image segmentation. Existing multi-task methods demand ground truth annotations for both the primary and auxiliary tasks. Contrary to it, we propose to generate the pseudo-labels of an auxiliary task in an unsupervised manner. To generate the pseudo-labels, we leverage Histogram of Oriented Gradients (HOGs), one of the most widely used and powerful hand-crafted features for detection. Together with the ground truth semantic segmentation masks for the primary task and pseudo-labels for the auxiliary task, we learn the parameters of the deep network to minimize the loss of both the primary task and the auxiliary task jointly. We employed our method on two powerful and widely used semantic segmentation networks: UNet and U2Net to train in a multi-task setup. To validate our hypothesis, we performed experiments on two different medical image segmentation data sets. From the extensive quantitative and qualitative results, we observe that our method consistently improves the performance compared to the counter-part method. Moreover, our method is the winner of FetReg Endovis Sub-challenge on Semantic Segmentation organised in conjunction with MICCAI 2021. Code and implementation details are available at:https://github.com/thetna/medical_image_segmentation. 相似文献

United adversarial learning for liver tumor segmentation and detection of multi-modality non-contrast MRI

Simultaneous segmentation and detection of liver tumors (hemangioma and hepatocellular carcinoma (HCC)) by using multi-modality non-contrast magnetic resonance imaging (NCMRI) are crucial for the clinical diagnosis. However, it is still a challenging task due to: (1) the HCC information on NCMRI is insufficient makes extraction of liver tumors feature difficult; (2) diverse imaging characteristics in multi-modality NCMRI causes feature fusion and selection difficult; (3) no specific information between hemangioma and HCC on NCMRI cause liver tumors detection difficult. In this study, we propose a united adversarial learning framework (UAL) for simultaneous liver tumors segmentation and detection using multi-modality NCMRI. The UAL first utilizes a multi-view aware encoder to extract multi-modality NCMRI information for liver tumor segmentation and detection. In this encoder, a novel edge dissimilarity feature pyramid module is designed to facilitate the complementary multi-modality feature extraction. Secondly, the newly designed fusion and selection channel is used to fuse the multi-modality feature and make the decision of the feature selection. Then, the proposed mechanism of coordinate sharing with padding integrates the multi-task of segmentation and detection so that it enables multi-task to perform united adversarial learning in one discriminator. Lastly, an innovative multi-phase radiomics guided discriminator exploits the clear and specific tumor information to improve the multi-task performance via the adversarial learning strategy. The UAL is validated in corresponding multi-modality NCMRI (i.e. T1FS pre-contrast MRI, T2FS MRI, and DWI) and three phases contrast-enhanced MRI of 255 clinical subjects. The experiments show that UAL gains high performance with the dice similarity coefficient of 83.63%, the pixel accuracy of 97.75%, the intersection-over-union of 81.30%, the sensitivity of 92.13%, the specificity of 93.75%, and the detection accuracy of 92.94%, which demonstrate that UAL has great potential in the clinical diagnosis of liver tumors. 相似文献

Transformer-based unsupervised contrastive learning for histopathological image classification

Efficient and Robust Instrument Segmentation in 3D Ultrasound Using Patch-of-Interest-FuseNet with Hybrid Loss

A large-scale and well-annotated dataset is a key factor for the success of deep learning in medical image analysis. However, assembling such large annotations is very challenging, especially for histopathological images with unique characteristics (e.g., gigapixel image size, multiple cancer types, and wide staining variations). To alleviate this issue, self-supervised learning (SSL) could be a promising solution that relies only on unlabeled data to generate informative representations and generalizes well to various downstream tasks even with limited annotations. In this work, we propose a novel SSL strategy called semantically-relevant contrastive learning (SRCL), which compares relevance between instances to mine more positive pairs. Compared to the two views from an instance in traditional contrastive learning, our SRCL aligns multiple positive instances with similar visual concepts, which increases the diversity of positives and then results in more informative representations. We employ a hybrid model (CTransPath) as the backbone, which is designed by integrating a convolutional neural network (CNN) and a multi-scale Swin Transformer architecture. The CTransPath is pretrained on massively unlabeled histopathological images that could serve as a collaborative local–global feature extractor to learn universal feature representations more suitable for tasks in the histopathology image domain. The effectiveness of our SRCL-pretrained CTransPath is investigated on five types of downstream tasks (patch retrieval, patch classification, weakly-supervised whole-slide image classification, mitosis detection, and colorectal adenocarcinoma gland segmentation), covering nine public datasets. The results show that our SRCL-based visual representations not only achieve state-of-the-art performance in each dataset, but are also more robust and transferable than other SSL methods and ImageNet pretraining (both supervised and self-supervised methods). Our code and pretrained model are available at https://github.com/Xiyue-Wang/TransPath. 相似文献

10.

Automatic brain labeling via multi-atlas guided fully convolutional networks

《Medical image analysis》2019

Multi-atlas-based methods are commonly used for MR brain image labeling, which alleviates the burdening and time-consuming task of manual labeling in neuroimaging analysis studies. Traditionally, multi-atlas-based methods first register multiple atlases to the target image, and then propagate the labels from the labeled atlases to the unlabeled target image. However, the registration step involves non-rigid alignment, which is often time-consuming and might lack high accuracy. Alternatively, patch-based methods have shown promise in relaxing the demand for accurate registration, but they often require the use of hand-crafted features. Recently, deep learning techniques have demonstrated their effectiveness in image labeling, by automatically learning comprehensive appearance features from training images. In this paper, we propose a multi-atlas guided fully convolutional network (MA-FCN) for automatic image labeling, which aims at further improving the labeling performance with the aid of prior knowledge from the training atlases. Specifically, we train our MA-FCN model in a patch-based manner, where the input data consists of not only a training image patch but also a set of its neighboring (i.e., most similar) affine-aligned atlas patches. The guidance information from neighboring atlas patches can help boost the discriminative ability of the learned FCN. Experimental results on different datasets demonstrate the effectiveness of our proposed method, by significantly outperforming the conventional FCN and several state-of-the-art MR brain labeling methods. 相似文献

11.

Instrument segmentation plays a vital role in 3D ultrasound (US) guided cardiac intervention. Efficient and accurate segmentation during the operation is highly desired since it can facilitate the operation, reduce the operational complexity, and therefore improve the outcome. Nevertheless, current image-based instrument segmentation methods are not efficient nor accurate enough for clinical usage. Lately, fully convolutional neural networks (FCNs), including 2D and 3D FCNs, have been used in different volumetric segmentation tasks. However, 2D FCN cannot exploit the 3D contextual information in the volumetric data, while 3D FCN requires high computation cost and a large amount of training data. Moreover, with limited computation resources, 3D FCN is commonly applied with a patch-based strategy, which is therefore not efficient for clinical applications. To address these, we propose a POI-FuseNet, which consists of a patch-of-interest (POI) selector and a FuseNet. The POI selector can efficiently select the interested regions containing the instrument, while FuseNet can make use of 2D and 3D FCN features to hierarchically exploit contextual information. Furthermore, we propose a hybrid loss function, which consists of a contextual loss and a class-balanced focal loss, to improve the segmentation performance of the network. With the collected challenging ex-vivo dataset on RF-ablation catheter, our method achieved a Dice score of 70.5%, superior to the state-of-the-art methods. In addition, based on the pre-trained model from ex-vivo dataset, our method can be adapted to the in-vivo dataset on guidewire and achieves a Dice score of 66.5% for a different cardiac operation. More crucially, with POI-based strategy, segmentation efficiency is reduced to around 1.3 seconds per volume, which shows the proposed method is promising for clinical use. 相似文献

12.

Local contrastive loss with pseudo-label based self-training for semi-supervised medical image segmentation

Supervised deep learning-based methods yield accurate results for medical image segmentation. However, they require large labeled datasets for this, and obtaining them is a laborious task that requires clinical expertise. Semi/self-supervised learning-based approaches address this limitation by exploiting unlabeled data along with limited annotated data. Recent self-supervised learning methods use contrastive loss to learn good global level representations from unlabeled images and achieve high performance in classification tasks on popular natural image datasets like ImageNet. In pixel-level prediction tasks such as segmentation, it is crucial to also learn good local level representations along with global representations to achieve better accuracy. However, the impact of the existing local contrastive loss-based methods remains limited for learning good local representations because similar and dissimilar local regions are defined based on random augmentations and spatial proximity; not based on the semantic label of local regions due to lack of large-scale expert annotations in the semi/self-supervised setting. In this paper, we propose a local contrastive loss to learn good pixel level features useful for segmentation by exploiting semantic label information obtained from pseudo-labels of unlabeled images alongside limited annotated images with ground truth (GT) labels. In particular, we define the proposed contrastive loss to encourage similar representations for the pixels that have the same pseudo-label/GT label while being dissimilar to the representation of pixels with different pseudo-label/GT label in the dataset. We perform pseudo-label based self-training and train the network by jointly optimizing the proposed contrastive loss on both labeled and unlabeled sets and segmentation loss on only the limited labeled set. We evaluated the proposed approach on three public medical datasets of cardiac and prostate anatomies, and obtain high segmentation performance with a limited labeled set of one or two 3D volumes. Extensive comparisons with the state-of-the-art semi-supervised and data augmentation methods and concurrent contrastive learning methods demonstrate the substantial improvement achieved by the proposed method. The code is made publicly available at https://github.com/krishnabits001/pseudo_label_contrastive_training. 相似文献

13.

The detection of nuclei and cells in histology images is of great value in both clinical practice and pathological studies. However, multiple reasons such as morphological variations of nuclei or cells make it a challenging task where conventional object detection methods cannot obtain satisfactory performance in many cases. A detection task consists of two sub-tasks, classification and localization. Under the condition of dense object detection, classification is a key to boost the detection performance. Considering this, we propose similarity based region proposal networks (SRPN) for nuclei and cells detection in histology images. In particular, a customised convolution layer termed as embedding layer is designed for network building. The embedding layer is added into the region proposal networks, enabling the networks to learn discriminative features based on similarity learning. Features obtained by similarity learning can significantly boost the classification performance compared to conventional methods. SRPN can be easily integrated into standard convolutional neural networks architectures such as the Faster R-CNN and RetinaNet. We test the proposed approach on tasks of multi-organ nuclei detection and signet ring cells detection in histological images. Experimental results show that networks applying similarity learning achieved superior performance on both tasks when compared to their counterparts. In particular, the proposed SRPN achieve state-of-the-art performance on the MoNuSeg benchmark for nuclei segmentation and detection while compared to previous methods, and on the signet ring cell detection benchmark when compared with baselines. The sourcecode is publicly available at: https://github.com/sigma10010/nuclei_cells_det. 相似文献

14.

DCAN: Deep contour-aware networks for object instance segmentation from histology images

《Medical image analysis》2017

In histopathological image analysis, the morphology of histological structures, such as glands and nuclei, has been routinely adopted by pathologists to assess the malignancy degree of adenocarcinomas. Accurate detection and segmentation of these objects of interest from histology images is an essential prerequisite to obtain reliable morphological statistics for quantitative diagnosis. While manual annotation is error-prone, time-consuming and operator-dependant, automated detection and segmentation of objects of interest from histology images can be very challenging due to the large appearance variation, existence of strong mimics, and serious degeneration of histological structures. In order to meet these challenges, we propose a novel deep contour-aware network (DCAN) under a unified multi-task learning framework for more accurate detection and segmentation. In the proposed network, multi-level contextual features are explored based on an end-to-end fully convolutional network (FCN) to deal with the large appearance variation. We further propose to employ an auxiliary supervision mechanism to overcome the problem of vanishing gradients when training such a deep network. More importantly, our network can not only output accurate probability maps of histological objects, but also depict clear contours simultaneously for separating clustered object instances, which further boosts the segmentation performance. Our method ranked the first in two histological object segmentation challenges, including 2015 MICCAI Gland Segmentation Challenge and 2015 MICCAI Nuclei Segmentation Challenge. Extensive experiments on these two challenging datasets demonstrate the superior performance of our method, surpassing all the other methods by a significant margin. 相似文献

15.

Rare bioparticle detection via deep metric learning

Shaobo Luo Yuzhi Shi Lip Ket Chin Yi Zhang Bihan Wen Ying Sun Binh T. T. Nguyen Giovanni Chierchia Hugues Talbot Tarik Bourouina Xudong Jiang Ai-Qun Liu 《RSC advances》2021,11(29):17603

Recent deep neural networks have shown superb performance in analyzing bioimages for disease diagnosis and bioparticle classification. Conventional deep neural networks use simple classifiers such as SoftMax to obtain highly accurate results. However, they have limitations in many practical applications that require both low false alarm rate and high recovery rate, e.g., rare bioparticle detection, in which the representative image data is hard to collect, the training data is imbalanced, and the input images in inference time could be different from the training images. Deep metric learning offers a better generatability by using distance information to model the similarity of the images and learning function maps from image pixels to a latent space, playing a vital role in rare object detection. In this paper, we propose a robust model based on a deep metric neural network for rare bioparticle (Cryptosporidium or Giardia) detection in drinking water. Experimental results showed that the deep metric neural network achieved a high accuracy of 99.86% in classification, 98.89% in precision rate, 99.16% in recall rate and zero false alarm rate. The reported model empowers imaging flow cytometry with capabilities of biomedical diagnosis, environmental monitoring, and other biosensing applications.

Conventional deep neural networks use simple classifiers to obtain highly accurate results. However, they have limitations in practical applications. This study demonstrates a robust deep metric neural network model for rare bioparticle detection. 相似文献

16.

Meta multi-task nuclei segmentation with fewer training samples

Volumetric white matter tract segmentation with nested self-supervised learning using sequential pretext tasks

Cells/nuclei deliver massive information of microenvironment. An automatic nuclei segmentation approach can reduce pathologists’ workload and allow precise of the microenvironment for biological and clinical researches. Existing deep learning models have achieved outstanding performance under the supervision of a large amount of labeled data. However, when data from the unseen domain comes, we still have to prepare a certain degree of manual annotations for training for each domain. Unfortunately, obtaining histopathological annotations is extremely difficult. It is high expertise-dependent and time-consuming. In this paper, we attempt to build a generalized nuclei segmentation model with less data dependency and more generalizability. To this end, we propose a meta multi-task learning (Meta-MTL) model for nuclei segmentation which requires fewer training samples. A model-agnostic meta-learning is applied as the outer optimization algorithm for the segmentation model. We introduce a contour-aware multi-task learning model as the inner model. A feature fusion and interaction block (FFIB) is proposed to allow feature communication across both tasks. Extensive experiments prove that our proposed Meta-MTL model can improve the model generalization and obtain a comparable performance with state-of-the-art models with fewer training samples. Our model can also perform fast adaptation on the unseen domain with only a few manual annotations. Code is available at https://github.com/ChuHan89/Meta-MTL4NucleiSegmentation 相似文献

17.

Multi-task recurrent convolutional network with correlation loss for surgical video analysis

《Medical image analysis》2020

Surgical tool presence detection and surgical phase recognition are two fundamental yet challenging tasks in surgical video analysis as well as very essential components in various applications in modern operating rooms. While these two analysis tasks are highly correlated in clinical practice as the surgical process is typically well-defined, most previous methods tackled them separately, without making full use of their relatedness. In this paper, we present a novel method by developing a multi-task recurrent convolutional network with correlation loss (MTRCNet-CL) to exploit their relatedness to simultaneously boost the performance of both tasks. Specifically, our proposed MTRCNet-CL model has an end-to-end architecture with two branches, which share earlier feature encoders to extract general visual features while holding respective higher layers targeting for specific tasks. Given that temporal information is crucial for phase recognition, long-short term memory (LSTM) is explored to model the sequential dependencies in the phase recognition branch. More importantly, a novel and effective correlation loss is designed to model the relatedness between tool presence and phase identification of each video frame, by minimizing the divergence of predictions from the two branches. Mutually leveraging both low-level feature sharing and high-level prediction correlating, our MTRCNet-CL method can encourage the interactions between the two tasks to a large extent, and hence can bring about benefits to each other. Extensive experiments on a large surgical video dataset (Cholec80) demonstrate outstanding performance of our proposed method, consistently exceeding the state-of-the-art methods by a large margin, e.g., 89.1% v.s. 81.0% for the mAP in tool presence detection and 87.4% v.s. 84.5% for F1 score in phase recognition. 相似文献

18.

Using Siamese capsule networks for remote sensing scene classification

Song Zhou Yong Zhou Bing Liu 《Remote sensing letters.》2020,11(8):757-766

ABSTRACT

The convolutional neural network (CNN) is widely used for image classification because of its powerful feature extraction capability. The key challenge of CNN in remote sensing (RS) scene classification is that the size of data set is small and images in each category vary greatly in position and angle, while the spatial information will be lost in the pooling layers of CNN. Consequently, how to extract accurate and effective features is very important. To this end, we present a Siamese capsule network to address these issues. Firstly, we introduce capsules to extract the spatial information of the features so as to learn equivariant representations. Secondly, to improve the classification accuracy of the model on small data sets, the proposed model utilizes the structure of the Siamese network as embedded verification. Finally, the features learned through Capsule networks are regularized by a metric learning term to improve the robustness of our model. The effectiveness of the model on three benchmark RS data sets is verified by different experiments. Experimental results demonstrate that the comprehensive performance of the proposed method surpasses other existing methods. 相似文献

19.

Generalizable multi-task,multi-domain deep segmentation of sparse pediatric imaging datasets via multi-scale contrastive regularization and multi-joint anatomical priors

White matter (WM) tract segmentation based on diffusion magnetic resonance imaging (dMRI) provides an important tool for the analysis of brain development, function, and disease. Deep learning based methods of WM tract segmentation have been proposed, which greatly improve the accuracy of the segmentation. However, the training of the deep networks usually requires a large number of manual delineations of WM tracts, which can be especially difficult to obtain and unavailable in many scenarios. Therefore, in this work, we explore how to perform deep learning based WM tract segmentation when annotated training data is scarce. To this end, we seek to exploit the abundant unannotated dMRI data in the self-supervised learning framework. From the unannotated data, knowledge about image context can be learned with pretext tasks that do not require manual annotations. Specifically, a deep network can be pretrained for the pretext task, and the knowledge learned from the pretext task is then transferred to the subsequent WM tract segmentation task with only a small number of annotated scans via fine-tuning. We explore two designs of pretext tasks that are related to WM tracts. The first pretext task predicts the density map of fiber streamlines, which are representations of generic WM pathways, and the training data can be obtained automatically with tractography. The second pretext task learns to mimic the results of registration-based WM tract segmentation, which, although inaccurate, is more relevant to WM tract segmentation and provides a good target for learning context knowledge. Then, we combine the two pretext tasks and develop a nested self-supervised learning strategy. In the nested self-supervised learning strategy, the first pretext task provides initial knowledge for the second pretext task, and the knowledge learned from the second pretext task with the initial knowledge is transferred to the target WM tract segmentation task via fine-tuning. To evaluate the proposed method, experiments were performed on brain dMRI scans from the Human Connectome Project dataset with various experimental settings. The results show that the proposed method improves the performance of WM tract segmentation when tract annotations are scarce. 相似文献

20.