Abstract
Domain adaptation for semantic segmentation requires pixel-level knowledge transfer from a labeled source domain to an unlabeled target domain. Existing approaches typically align source and target features at different levels, but they usually neglect that the different information flows within an image differ in how difficult they are to adapt. In this paper, we focus on combining two main information flows in semantic segmentation, i.e., pixel-level disparate information and image structure information. Specifically, we propose to combine two feature map-based prediction heads, which are thought to focus on pixel-level and structure-level information respectively, to accommodate different complexities by adjusting the attention to adaptation functions of the target domain. We then align the outputs of the two heads through a consistency regularization so that the information they capture becomes complementary. The combined prediction head further enables regularizing the distance between pixel representations of different classes, thereby mitigating the mis-adaptation of similar classes. The proposed method achieves more competitive results than current state-of-the-art methods on two publicly available benchmarks, i.e., SYNTHIA \(\rightarrow \) Cityscapes and GTA5 \(\rightarrow \) Cityscapes.
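To make the idea of a consistency regularization between two prediction heads concrete, the following is a minimal sketch and not the loss used in the paper, whose exact form the abstract does not specify. It assumes each head emits per-pixel class logits and penalizes the mean squared difference between the two heads' softmax probabilities. The names `softmax`, `consistency_loss`, `pixel_head`, and `structure_head` are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def consistency_loss(pixel_logits, structure_logits):
    """Mean squared difference between the two heads' class probabilities.

    Each argument is a list with one list of class logits per pixel.
    The squared-error form is an illustrative assumption, not the paper's loss.
    """
    total, count = 0.0, 0
    for a, b in zip(pixel_logits, structure_logits):
        pa, pb = softmax(a), softmax(b)
        total += sum((x - y) ** 2 for x, y in zip(pa, pb))
        count += len(pa)
    return total / count


# Toy example: two target-domain pixels with three classes each.
# The heads disagree on the first pixel and agree on the second.
pixel_head = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
structure_head = [[1.5, 0.7, -0.5], [0.1, 0.2, 3.0]]
loss = consistency_loss(pixel_head, structure_head)
```

The loss is zero when both heads produce identical distributions and grows as their predictions diverge, so minimizing it on unlabeled target images pulls the two heads toward agreement. A divergence-based penalty such as a symmetric KL term would be another reasonable choice under the same assumptions.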
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (Nos. 61872187, 62072246), in part by the Natural Science Foundation of Jiangsu Province (No. BK20201306), and in part by the “111” Program under Grant No. B13022.
About this article
Cite this article
Bi, X., Chen, D., Huang, H. et al. Combining Pixel-Level and Structure-Level Adaptation for Semantic Segmentation. Neural Process Lett 55, 9669–9684 (2023). https://doi.org/10.1007/s11063-023-11220-5
DOI: https://doi.org/10.1007/s11063-023-11220-5