臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)

Detailed Record

Researcher: 楊岳霖 (YANG, YUE-LIN)
Title: 基於距離引導類別混合用於無監督領域自適應
Title (English): DCMix: Distance Based ClassMix for Unsupervised Domain Adaptation
Advisor: 江振國 (CHIANG, CHEN-KUO)
Committee: 郭至恩 (KUO, CHIH-EN); 江振國 (CHIANG, CHEN-KUO); 邱志義 (CHIU, CHIH-YI); 林維暘 (LIN, WEI-YANG)
Oral Defense Date: 2024-07-26
Degree: Master's
Institution: 國立中正大學 (National Chung Cheng University)
Department: 資訊工程研究所 (Graduate Institute of Computer Science and Information Engineering)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Document Type: Academic thesis
Publication Year: 2024
Graduation Academic Year: 112 (2023–2024)
Language: English
Pages: 48
Keywords (Chinese): 無監督領域自適應; 語義分割; 自學習; 類別混合
Keywords (English): Unsupervised Domain Adaptation; Semantic Segmentation; Self-Training; ClassMix
Usage statistics:
  • Cited: 0
  • Views: 8
  • Rating:
  • Downloads: 1
  • Bookmarked: 0
In Unsupervised Domain Adaptation (UDA), where the data and feature distributions of the two domains differ, the common objective is to train on both source- and target-domain data so that the model transfers its knowledge to the target domain and performs well there. Existing UDA methods often mix data from the two domains; such mixing helps the model adapt to image content from different domains, but it ignores whether the resulting mixed images are plausible. In pixel-level semantic segmentation in particular, implausible mixing distorts the model's understanding of the relationships between classes. This thesis therefore proposes Distance-based ClassMix (DCMix): global class distances are precomputed over the source data and combined with the local class distances of each individual source image to guide where classes are pasted during mixing. Besides producing more plausible training data, this prevents the model from developing a biased view of inter-class relationships. In addition, we design a class-relation computation that helps the model capture dependencies between classes, further improving the plausibility of the mixed images. Preliminary experimental results show that DCMix markedly improves the plausibility of mixed images and achieves competitive results on common UDA benchmarks.
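The record reproduces only the abstract, so the exact DCMix formulation is not available here. As a rough illustration of the underlying ClassMix-style operation with distance-weighted class selection, a minimal NumPy sketch might look like the following; the function name, the `global_dist` weight vector, and the sampling rule are illustrative assumptions, not the thesis's actual algorithm.

```python
import numpy as np

def distance_guided_classmix(src_img, src_label, tgt_img, tgt_pseudo,
                             global_dist, num_classes=19, rng=None):
    """ClassMix-style cross-domain mix: paste the pixels of selected source
    classes onto a target image (labels mixed with target pseudo-labels).
    Class selection here is weighted by a precomputed global class-distance
    vector -- a hypothetical stand-in for DCMix's distance guidance."""
    rng = rng or np.random.default_rng()
    present = np.unique(src_label)
    present = present[present < num_classes]   # drop the ignore index (e.g. 255)
    k = max(1, len(present) // 2)              # ClassMix pastes half the classes
    w = global_dist[present].astype(np.float64)
    w /= w.sum()                               # larger distance -> higher chance
    chosen = rng.choice(present, size=k, replace=False, p=w)
    mask = np.isin(src_label, chosen)          # H x W boolean paste mask
    mixed_img = np.where(mask[..., None], src_img, tgt_img)
    mixed_label = np.where(mask, src_label, tgt_pseudo)
    return mixed_img, mixed_label, mask
```

In the actual method, the paste position is additionally guided by per-image local distances via a sliding-window process (Sections 3.5.1 to 3.5.4 in the table of contents below), which this sketch does not model.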
Chapter 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Chapter 2 Related Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1 Unsupervised Domain Adaptation (UDA) . . . . . . . . . . . . . . . . 6
2.2 Mixing-based Data Augmentation . . . . . . . . . . . . . . . . . . . . . 6
2.3 ClassMix Based UDA Method . . . . . . . . . . . . . . . . . . . . . . . 7
Chapter 3 Proposed Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.2 Data Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.2.1 Segmentation Mask to Bounding Box . . . . . . . . . . . . . . . 11
3.2.2 Class Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.2.3 Class Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.2.4 Class Relation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.3 Supervised Training . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.4 Pseudo-label Generation . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.5 DCMix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.5.1 Calculate Local Distance . . . . . . . . . . . . . . . . . . . . . . 17
3.5.2 Sliding Window Process . . . . . . . . . . . . . . . . . . . . . . 18
3.5.3 Local Mixing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.5.4 Global Mixing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.5.5 Mixed Loss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.6 Masked Training . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.7 Total Loss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Chapter 4 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.1 Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.2 Implementation Details . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.3 Evaluation Metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.4 Visualization of Mixed Results for True Labels . . . . . . . . . . . 27
4.5 Visualization of Mixed Results for Training . . . . . . . . . . . . . 29
4.6 Method Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.6.1 Comparison on UDA Benchmarks . . . . . . . . . . . . . . . . 34
4.6.2 Comparison of Mixing-based Data Augmentation . . . . . . . . 35
4.7 Ablation Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
4.7.1 Evaluation of Global and Local Distance . . . . . . . . . . . . . 37
4.7.2 Reasons for Using Global Distance . . . . . . . . . . . . . . . . 38
4.7.3 Evaluation of Top-K Classes . . . . . . . . . . . . . . . . . . 42
Chapter 5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
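Section 4.3's evaluation metric is not reproduced in this record, but mean Intersection over Union (mIoU) is the standard measure for the UDA semantic-segmentation benchmarks the abstract alludes to. Assuming that convention, per-class IoU can be accumulated from a pixel-level confusion matrix as in this minimal sketch (helper names and the 19-class default are illustrative):

```python
import numpy as np

def confusion_matrix(pred, gt, num_classes=19, ignore_index=255):
    """Accumulate a pixel-level confusion matrix (rows: GT, cols: prediction),
    skipping pixels marked with the ignore index."""
    valid = gt != ignore_index
    idx = num_classes * gt[valid].astype(np.int64) + pred[valid].astype(np.int64)
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def mean_iou(conf):
    """Per-class IoU = TP / (TP + FP + FN); classes with no TP, FP, or FN
    are excluded from the mean via NaN."""
    tp = np.diag(conf)
    fp = conf.sum(axis=0) - tp
    fn = conf.sum(axis=1) - tp
    denom = tp + fp + fn
    iou = np.where(denom > 0, tp / np.maximum(denom, 1), np.nan)
    return np.nanmean(iou), iou
```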

