Abstract
Medical image segmentation is a crucial preliminary step for many downstream diagnosis tasks. As deep convolutional neural networks have successfully advanced computer vision, medical image segmentation can become a semi-automatic procedure: deep convolutional neural networks find the contours of regions of interest, which radiologists then revise. However, supervised learning requires large amounts of annotated data, which are difficult to acquire, especially for medical images. Self-supervised learning can exploit unlabeled data and provide a good initialization to be fine-tuned for downstream tasks with limited annotations. Considering that most self-supervised learning methods, especially contrastive learning methods, are tailored to natural image classification and demand expensive GPU resources, we propose a novel and simple pretext-based self-supervised learning method that exploits the positional information in volumetric medical images. Specifically, we regard spatial coordinates as pseudo labels and pretrain the model by predicting the positions of randomly sampled 2D slices in volumetric medical images. Experiments on four semantic segmentation datasets demonstrate the superiority of our method over other self-supervised learning methods in both semi-supervised learning and transfer learning settings. Code is available at https://github.com/alienzyj/PPos.
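As a rough illustration of the pretext task described in the abstract (this is a minimal sketch under our own assumptions, not the authors' released code), the pseudo-label generation might look like the following: a random 2D slice is drawn from a 3D volume, and its normalized coordinate along the slicing axis becomes the regression target.

```python
import numpy as np


def sample_slice_with_position(volume: np.ndarray, rng: np.random.Generator):
    """Sample a random axial 2D slice from a 3D volume and return it together
    with its normalized position, which serves as the pseudo label.

    volume: array of shape (D, H, W); slicing is along the first axis.
    The normalization to [0, 1] is our assumption for illustration.
    """
    depth = volume.shape[0]
    idx = int(rng.integers(0, depth))   # random slice index in [0, depth)
    slice_2d = volume[idx]              # the 2D slice fed to the network
    position = idx / (depth - 1)        # pseudo label in [0, 1]
    return slice_2d, position


# Example: a toy 10-slice volume yields a (H, W) slice and a label in [0, 1].
rng = np.random.default_rng(0)
volume = np.zeros((10, 64, 64), dtype=np.float32)
slice_2d, position = sample_slice_with_position(volume, rng)
```

During pretraining, the network would then be trained to regress (or classify into bins) this position from the slice alone, before being fine-tuned for segmentation.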
Additional information
Foundation item: the Major Research Plan of the National Natural Science Foundation of China (No. 92059206)
Cite this article
Zhao, Y., Hou, R., Zeng, W. et al. Positional Information is a Strong Supervision for Volumetric Medical Image Segmentation. J. Shanghai Jiaotong Univ. (Sci.) (2023). https://doi.org/10.1007/s12204-023-2614-y