National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

Author: Jin-Yu Huang (黃晉禹)
Title: Learning-Based Segmentation Using Superpixel Pairs (基於學習並使用超像素對之影像分割)
Advisor: Jian-Jiun Ding (丁建均)
Oral committee: Yu-Chiang Wang (王鈺強), Jing-Ming Guo (郭景明), Rong-Ji Zhang (張榮吉)
Oral defense date: 2020-06-24
Degree: Master
Institution: National Taiwan University
Department: Graduate Institute of Communication Engineering
Discipline: Engineering / Electrical Engineering and Computer Science
Document type: Academic thesis
Publication year: 2020
Academic year of graduation: 108 (2019–2020)
Language: English
Pages: 96
Keywords (Chinese): 影像分割, 全卷積神經網路, 超像素
Keywords (English): Image Segmentation, Fully Convolutional Networks, Superpixel
DOI: 10.6342/NTU202001394
Abstract (Chinese, translated): Recently, convolutional neural networks (CNNs) have been widely adopted in image segmentation. However, most existing CNN-based segmentation algorithms make predictions pixel by pixel. Because superpixels have irregular shapes and sizes, it is difficult to apply them directly within a CNN architecture. In this thesis, we propose several transformation mechanisms that let a CNN learn superpixel-based image segmentation. The first proposed algorithm takes a square patch containing two superpixels as the input of a CNN, and the CNN outputs whether the two superpixels should be merged. Moreover, even with only a few training images, our method can derive a large amount of training data from them. Inspired by the first algorithm, we further propose a second algorithm that approaches the problem from a different angle, using a fully convolutional network (FCN). The second algorithm takes as input a multi-channel image that stacks the color image with several feature maps, such as the superpixel boundary map and the edge detection result, and outputs a prediction of the superpixel boundary map indicating whether the boundary between two adjacent superpixels should vanish or be kept, which in turn drives the superpixel merging procedure. That is, by solving all the subproblems of the first algorithm at once, the FCN speeds up the whole segmentation process by a wide margin while achieving higher accuracy. Overall, simulation results show that both proposed algorithms produce highly accurate segmentation results and outperform state-of-the-art image segmentation methods in all evaluation metrics.
Abstract (English): Recently, the CNN has been widely adopted in image segmentation. However, the existing CNN-based segmentation algorithms are pixel-wise. It is hard to apply superpixels to CNN architectures directly due to their irregular shapes and sizes. In this thesis, we propose several transformation techniques that leverage the CNN for learning superpixel-based image segmentation. The first proposed algorithm takes a square patch containing two superpixels as the input of the CNN, and the CNN outputs whether the two superpixels should be merged or not. Additionally, one can obtain a huge amount of training data even if there are only a few training images. Inspired by the first algorithm, we further propose a second algorithm that utilizes a fully convolutional network (FCN) to solve the problem from a different perspective.
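The patch-and-classify loop of the first algorithm can be sketched as follows. This is a minimal illustration, not the thesis code: `adjacent_pairs`, `pair_patch`, and `should_merge` are hypothetical helper names, and the trained binary CNN is replaced by a simple mean-color test so the sketch stays self-contained.

```python
import numpy as np

def adjacent_pairs(labels):
    """Collect unordered pairs of superpixel labels that share a border (4-connectivity)."""
    pairs = set()
    h = labels[:, :-1] != labels[:, 1:]   # horizontal label changes
    v = labels[:-1, :] != labels[1:, :]   # vertical label changes
    for a, b in zip(labels[:, :-1][h], labels[:, 1:][h]):
        pairs.add((int(min(a, b)), int(max(a, b))))
    for a, b in zip(labels[:-1, :][v], labels[1:, :][v]):
        pairs.add((int(min(a, b)), int(max(a, b))))
    return pairs

def pair_patch(image, labels, a, b):
    """Crop the bounding box covering superpixels a and b; zero out all other pixels."""
    mask = (labels == a) | (labels == b)
    ys, xs = np.where(mask)
    y0, y1, x0, x1 = ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
    patch = image[y0:y1, x0:x1].copy()
    patch[~mask[y0:y1, x0:x1]] = 0
    return patch

def should_merge(image, labels, a, b, thresh=30.0):
    """Stand-in for the trained CNN: merge when the mean colors are close."""
    ma = image[labels == a].mean(axis=0)
    mb = image[labels == b].mean(axis=0)
    return bool(np.linalg.norm(ma - mb) < thresh)
```

In the thesis the decision comes from a CNN applied to `pair_patch(...)` (resized to a fixed square); the color test above only marks where that prediction would plug in.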

The second proposed algorithm takes a multi-channel image, consisting of the stacked color image and several feature maps such as the superpixel boundary map and the edge detection result, as the input of a deep neural network. The network outputs a prediction of the superpixel boundary map that indicates whether the boundary between two adjacent superpixels should be kept or not, which in turn drives superpixel merging. That is, by solving all the subproblems in just one forward pass, the FCN speeds up the whole segmentation process by a wide margin while achieving higher accuracy. Overall, simulations show that both proposed algorithms achieve highly accurate segmentation results and outperform state-of-the-art image segmentation methods in all evaluation metrics.
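The second pipeline — stack a 5-channel input, predict a boundary map, then merge — can be illustrated like this. The FCN itself is omitted; `merge_from_prediction` is a hypothetical post-processing step that averages the predicted boundary strength along each shared border and unions (via a small union-find) every superpixel pair whose boundary is predicted to vanish.

```python
import numpy as np

def superpixel_boundary_map(labels):
    """1.0 where a pixel touches a different superpixel (4-connectivity), else 0.0."""
    bmap = np.zeros(labels.shape, dtype=np.float32)
    h = labels[:, :-1] != labels[:, 1:]
    v = labels[:-1, :] != labels[1:, :]
    bmap[:, :-1][h] = 1.0
    bmap[:, 1:][h] = 1.0
    bmap[:-1, :][v] = 1.0
    bmap[1:, :][v] = 1.0
    return bmap

def build_input(rgb, labels, edge_map):
    """Stack RGB (3), superpixel boundary map (1), and edge map (1) into 5 channels."""
    return np.dstack([rgb.astype(np.float32) / 255.0,
                      superpixel_boundary_map(labels),
                      edge_map.astype(np.float32)])

def merge_from_prediction(labels, pred, keep_thresh=0.5):
    """Union superpixel pairs whose average predicted boundary strength is low."""
    parent = {int(l): int(l) for l in np.unique(labels)}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    scores = {}
    def tally(a, b, v):
        key = (min(int(a), int(b)), max(int(a), int(b)))
        s, n = scores.get(key, (0.0, 0))
        scores[key] = (s + float(v), n + 1)
    h = labels[:, :-1] != labels[:, 1:]
    for a, b, v in zip(labels[:, :-1][h], labels[:, 1:][h], pred[:, :-1][h]):
        tally(a, b, v)
    v_ = labels[:-1, :] != labels[1:, :]
    for a, b, v in zip(labels[:-1, :][v_], labels[1:, :][v_], pred[:-1, :][v_]):
        tally(a, b, v)
    for (a, b), (s, n) in scores.items():
        if s / n < keep_thresh:      # boundary predicted to vanish -> merge
            parent[find(a)] = find(b)
    return np.vectorize(lambda l: find(int(l)))(labels)
```

In the full system, `pred` would come from the FCN evaluated on `build_input(...)`; here any array of boundary scores in [0, 1] exercises the merging step.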
Table of Contents:
Abstract
List of Figures
List of Tables
1 Introduction
2 Related Work
2.1 Superpixels
2.1.1 Mean Shift Superpixel
2.1.2 Superpixel Generation with Segmentation-Aware Affinity Loss (SEAL) Using Pixel Affinity Net (PAN)
2.1.3 Superpixel Sampling Network (SSN)
2.2 Classical Segmentation
2.2.1 Segmentation Using Superpixel (SAS)
2.2.2 Hierarchical Image Segmentation
2.3 Deep Learning in Image Segmentation
2.3.1 Fully Convolutional Networks (FCN)
3 Proposed Algorithms: DMMSS
3.1 Two-Superpixel Patch Generation
3.2 Training Architecture
3.3 Superpixel Pairing
3.4 Merging Procedure
3.5 Experiments
3.5.1 Segmentation Evaluation
3.5.2 Ablation Study
4 Proposed Algorithms: DMMSS-FCN
4.1 5-Channel Input Data
4.2 Output and Ground Truth
4.3 Training Architecture
4.4 Inference and Superpixel Merging
4.5 Experiments
4.5.1 Segmentation Evaluation
4.5.2 Run Time Analysis
4.5.3 Ablation Study
5 Simulations
5.1 BSDS500 Test Images
5.2 Real-World Images
5.2.1 Buildings
5.2.2 Animals
5.2.3 Night View
5.2.4 Items and Objects
6 Conclusion
References