
National Digital Library of Theses and Dissertations in Taiwan


Detailed Record

Author: 張博超
Author (English): Po-Chao Chang
Title: 自我監督學習之卷積神經網路設計與應用:細粒度圖像識別與圖像去背
Title (English): Self-supervised convolutional neural network design and application: fine-grained image recognition and image matting
Advisor: 葉家宏
Advisor (English): Yeh, Chia-Hung
Degree: Master's
Institution: 國立中山大學 (National Sun Yat-sen University)
Department: Department of Electrical Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Document Type: Academic thesis
Publication Year: 2021
Graduation Academic Year: 109 (ROC calendar)
Language: English
Pages: 60
Keywords (Chinese): 卷積神經網路、自我監督式學習、細粒度圖像識別、圖像去背、電腦視覺、深度學習
Keywords (English): convolutional neural network, self-supervised learning, fine-grained image recognition, image matting, computer vision, deep learning
Usage statistics:
  • Cited: 0
  • Views: 136
  • Downloads: 0
  • Bookmarked: 0
Abstract (Chinese): With the vigorous development of computer hardware, deep learning has undergone explosive, across-the-board growth; at the same time, a large body of computer vision research and applications in the social, entertainment, industrial, and military domains continues to be proposed. Deep learning implementations for image processing applications are maturing by the day and have become the development trend for artificial intelligence. The goal of this thesis is to study the design and application of convolutional neural networks based on self-supervised learning, using novel designs to bring data diversity to the network and avoid the limitations of scarce data. First, this thesis proposes a self-supervised refining and mapping network for fine-grained image recognition: through a specially designed auxiliary network and recomposed images, the network learns the structural and spatial information of objects in an image, improving the recognition network's ability to distinguish fine-grained images. Beyond recognition, this thesis also develops an image matting network based on self-supervised learning. To remove the trimap requirement of existing matting networks, we design an image context integration and progressive granularity module that strengthens feature extraction around object boundary regions and reduces the dependence on the trimap, so that the network truly needs no trimap assistance. Experimental results show that the proposed fine-grained image classification and image matting methods outperform various international benchmarks.
Abstract (English): With the fast-growing development of computer hardware, deep learning has undergone explosive growth. A great deal of computer vision research in the social, entertainment, industrial, and military domains has been proposed with deep learning applied; deep learning-based applications have matured and become a trend. The purpose of this thesis is to study the design and application of convolutional neural networks combined with self-supervised learning. Innovative designs are used to bring data diversity to the network and avoid the limitations of limited data. First, this thesis proposes a self-supervised refining and mapping network for fine-grained image recognition. Through a specially designed auxiliary network and reconstructed images, the network learns the structural spatial information of objects in the image, improving recognition performance. In addition, this thesis develops an image matting network based on self-supervised learning. To overcome the trimap requirement of existing image matting networks, we design an image context integration and progressive granularity module that enhances feature extraction at object boundary regions and reduces the dependence on the trimap. Experimental results show that the proposed methods achieve competitive performance.
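The full text is embargoed, so only the abstracts are available here. As a rough illustration of the jigsaw-style image reconstruction the abstract alludes to (and that Section 4.2, "Jigsaw Generator," in the table of contents names), the sketch below shuffles an image's patches and keeps the permutation as a pretext label. The grid size, shapes, and function names are illustrative assumptions, not the thesis's implementation.

```python
# Minimal sketch of jigsaw-style patch shuffling, the kind of
# self-supervised signal the abstract describes ("reconstructed images").
# Illustration only; grid size and shapes are arbitrary assumptions.
import numpy as np

def jigsaw_shuffle(image: np.ndarray, n: int, rng: np.random.Generator):
    """Split an HxWxC image into an n x n grid of patches, permute them,
    and return the shuffled image plus the permutation (the pretext label
    a network could be trained to recover)."""
    h, w = image.shape[:2]
    ph, pw = h // n, w // n
    image = image[: ph * n, : pw * n]  # crop so the grid divides evenly
    patches = [
        image[i * ph : (i + 1) * ph, j * pw : (j + 1) * pw]
        for i in range(n)
        for j in range(n)
    ]
    perm = rng.permutation(n * n)
    rows = [
        np.concatenate([patches[perm[i * n + j]] for j in range(n)], axis=1)
        for i in range(n)
    ]
    return np.concatenate(rows, axis=0), perm

rng = np.random.default_rng(0)
img = rng.random((224, 224, 3))           # stand-in for a real photo
shuffled, perm = jigsaw_shuffle(img, n=4, rng=rng)
print(shuffled.shape, perm[:5])           # (224, 224, 3) and the pretext label
```

A recognition network trained on such shuffled inputs is forced to attend to part-level structure rather than global layout, which matches the intuition the abstract gives for learning structural spatial information of objects.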
Table of Contents
Thesis Certification
Letter of Authorization
Abstract (Chinese)
Abstract (English)
Contents
List of Figures
List of Tables
Chapter 1
1.1 Overview
1.2 Motivation
1.3 Contribution
1.4 Organization
Chapter 2
2.1 Fine-grained Image Recognition
2.2 Image Matting
Chapter 3
3.1 Network Architecture
3.2 Stage 1. Coarse Recognition
3.3 Stage 2. Self-supervised Refining and Mapping
3.4 Stage 3. Feature Context Learning (FCL)
3.5 Experimental Results
(1) Implementation Details
(2) Comparisons with State-of-the-Art Methods
(3) Ablation Study
Chapter 4
4.1 Network Architecture
4.2 Jigsaw Generator
4.3 Contextual Attention Transfer Module
4.4 Progressive Granularity Module
4.5 Loss Function
4.6 Experimental Results
(1) Portrait Image Dataset
(2) Implementation Details
(3) Comparisons with State-of-the-Art Methods
(4) Ablation Study
Chapter 5
5.1 Conclusions
5.2 Future Works
References
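The matting modules listed under Chapter 4 build on the standard per-pixel composition model of alpha matting, I = alpha * F + (1 - alpha) * B. As a minimal sketch under stated assumptions (the toy losses, shapes, and names below are illustrative, not the thesis's code), the following shows that model together with the alpha and composition losses commonly paired in deep matting:

```python
# Minimal sketch of the composition model behind alpha matting,
# including trimap-free variants like the one outlined in Chapter 4:
# I = alpha * F + (1 - alpha) * B. Shapes and losses are assumptions.
import numpy as np

def composite(fg, bg, alpha):
    """Blend foreground over background with an alpha matte in [0, 1]."""
    a = alpha[..., None]                  # broadcast HxW matte over RGB
    return a * fg + (1.0 - a) * bg

def matting_losses(alpha_pred, alpha_gt, fg, bg, image):
    """Two losses commonly paired in deep matting: an L1 alpha loss and
    a composition loss that re-blends with the predicted matte."""
    alpha_loss = np.abs(alpha_pred - alpha_gt).mean()
    comp_loss = np.abs(composite(fg, bg, alpha_pred) - image).mean()
    return alpha_loss, comp_loss

rng = np.random.default_rng(1)
fg, bg = rng.random((64, 64, 3)), rng.random((64, 64, 3))
alpha_gt = rng.random((64, 64))
image = composite(fg, bg, alpha_gt)       # ground-truth composite
alpha_pred = np.clip(alpha_gt + 0.05 * rng.standard_normal((64, 64)), 0, 1)
print(matting_losses(alpha_pred, alpha_gt, fg, bg, image))
```

A trimap-free network, as the abstract describes, must predict the matte from the image alone rather than from image plus trimap; losses of this shape can then supervise it wherever ground-truth mattes exist.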
Electronic full text: available online from 2026/08/04.