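Author: 卓士軒
Author (romanized): Cho, Shih-Hsuan
Thesis title (Chinese): 基於循環收縮聚合技術之室內場景彩色深度影像語意分割
Thesis title (English): Semantic Segmentation of Indoor-Scene RGB-D Images Based on Iterative Contraction and Merging
Advisor: 王聖智
Advisor (romanized): Wang, Sheng-Jyh
Oral defense committee: 辛正和, 蕭旭峰
Oral defense date: 2017-01-19
Degree: Master's
Institution: National Chiao Tung University (國立交通大學)
Department: Institute of Electronics (電子研究所)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2017
Graduation academic year: 105 (ROC calendar)
Language: English
Pages: 55
Keywords (Chinese): 語意分割, 影像分割, 彩色深度影像, 循環收縮聚合
Keywords (English): semantic segmentation, image segmentation, RGB-D images, iteratively contractive merging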
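Abstract (translated from Chinese): For semantic segmentation of indoor scenes, we propose a method that combines a convolutional neural network (CNN) with the iteratively contractive merging (ICM) technique, and that also uses depth images to help analyze the spatial structure of the scene. Two bilateral filters are applied to the depth image: one fills in regions with missing depth values and the other smooths the whole image. ICM is an unsupervised image segmentation technique that preserves boundary information well. Building on this property, we add higher-level information to it, namely the CNN semantic segmentation result, the depth image, and the normal vector map, so that the merging process from high resolution to low resolution moves toward a semantic segmentation. The final output is a semantic hierarchical segmentation tree. We also propose a decision method for indoor-scene semantic segmentation: using the coarse semantic segmentation provided by the CNN, it selects a finer, best-matching semantic segmentation from the hierarchical segmentation tree. In our experiments, our results have better object boundaries than the CNN semantic segmentation results, and we also show that adding more high-level information lets ICM produce better indoor-scene semantic segmentation.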
Abstract (English): For semantic segmentation of indoor-scene images, we propose a method that combines convolutional neural networks (CNNs) with the Iterative Contraction and Merging (ICM) algorithm. We also use depth images to help analyze the 3-D structure of indoor scenes. The raw depth image from the depth camera is processed by two bilateral filters to recover a smoother and more complete depth image. The ICM algorithm is an unsupervised segmentation method that preserves boundary information well. We use the dense prediction from the CNN, the depth image, and the normal vector map as high-level information to guide the ICM process toward more accurate image segments. In other words, we progressively merge regions from high resolution to low resolution and build a hierarchical segmentation tree. We also propose a decision process that selects the final semantic segmentation from the hierarchical segmentation tree, using the dense prediction map as a reference. The proposed method generates more accurate object boundaries than the state-of-the-art methods it is compared with. Our experiments also show that using high-level information improves semantic segmentation performance compared with using RGB information only.
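The depth-recovery step mentioned in the abstract can be illustrated with a minimal sketch. The abstract states only that two bilateral filters are used, one to fill missing depth and one to smooth. Everything else below is an assumption made for illustration and is not the thesis's implementation: the joint (guided) form of the filter, the convention that 0 marks a missing depth value, the function name `bilateral_depth`, and all parameter values.

```python
import math

def bilateral_depth(depth, guide, radius=2, sigma_s=2.0, sigma_r=10.0, fill_only=True):
    """Joint bilateral filter over a depth map (illustrative sketch).

    depth: 2-D list of floats; 0 marks a missing measurement (assumed convention).
    guide: 2-D list of floats (e.g. grayscale intensity) that drives the range weight.
    fill_only: if True, only missing pixels are replaced; otherwise all pixels are smoothed.
    """
    h, w = len(depth), len(depth[0])
    out = [row[:] for row in depth]  # copy so the input is not modified
    for y in range(h):
        for x in range(w):
            if fill_only and depth[y][x] > 0:
                continue  # keep valid measurements untouched in the fill pass
            num = den = 0.0
            for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    d = depth[ny][nx]
                    if d <= 0:
                        continue  # skip neighbours without a depth reading
                    # Spatial term: nearby pixels count more.
                    spatial = ((ny - y) ** 2 + (nx - x) ** 2) / (2 * sigma_s ** 2)
                    # Range term: pixels that look similar in the guide count more.
                    rng = (guide[ny][nx] - guide[y][x]) ** 2 / (2 * sigma_r ** 2)
                    weight = math.exp(-(spatial + rng))
                    num += weight * d
                    den += weight
            if den > 0:
                out[y][x] = num / den
    return out

# Toy example: a hole at the centre, next to a depth edge on the right column.
depth = [[1.0, 1.0, 5.0],
         [1.0, 0.0, 5.0],
         [1.0, 1.0, 5.0]]
guide = [[10.0, 10.0, 200.0],
         [10.0, 10.0, 200.0],
         [10.0, 10.0, 200.0]]

# Pass 1: fill holes, guided by intensity.
filled = bilateral_depth(depth, guide, radius=1)
# Pass 2: smooth every pixel, using the filled depth itself as the guide.
smoothed = bilateral_depth(filled, filled, radius=1, sigma_r=0.5, fill_only=False)
```

In this toy case the hole takes its value from the neighbours that resemble it in the guide image, so the depth edge is not blurred across. In the second pass `sigma_r` is expressed in depth units because the depth map guides itself.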
Chapter 1 Introduction
1.1 Motivation
1.2 Contribution
1.3 Thesis Outline
Chapter 2 Background
2.1 Bilateral filter
2.2 Convolutional Neural Network
2.2.1 Fully convolutional Neural Network
2.2.2 Fully convolutional Neural Network with RGB-D Data
2.3 Hierarchical Segmentation
Chapter 3 Proposed Method
3.1 Depth Recovery
3.1.1 Bilateral filter for recover depth image
3.1.2 Bilateral filter for smooth depth image
3.2 Dense Prediction
3.3 Iterative Contraction and Merging
3.3.1 Part 1: Pixel-wise Contraction and Merging
3.3.2 Part 2: Region-wise Contraction and Merging
3.3.3 Final Decision Process (DP)
Chapter 4 Experimental Results
4.1 Database
4.2 Preprocess of raw depth image
4.3 Implementation Details
4.4 Contribution of feature components to ICM
4.5 Comparison with the state-of-the-art methods
4.6 Failure Cases
Chapter 5 Conclusion and Future Works
5.1 Conclusion
5.2 Future Works
Bibliography