Author: 王亭幼
Author (English): Ting-You Wang
Title: 運用多元卷積類神經網路架構於多光譜遙測影像之物質分類
Title (English): Material Classification in Multispectral Remote Sensing Image Using Multiple Convolutional Neural Network Architectures
Advisor: 林春宏
Degree: Master's
Institution: National Taichung University of Science and Technology
Department: Master's Program, Department of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis Type: Academic thesis
Publication Year: 2019
Graduation Academic Year: 107
Language: Chinese
Pages: 98
Keywords (Chinese): 地形重建、遙測影像、多光譜影像、卷積類神經網路、物質分類
Keywords (English): terrain reconstruction, remote sensing image, multispectral image, convolutional neural network, material classification
Statistics:
  • Cited by: 0
  • Views: 86
  • Downloads: 0
  • Bookmarked: 0

To achieve realistic real-world simulation, a terrain model must combine various material and texture information, so terrain reconstruction plays an important role in the three-dimensional numerical simulation of terrain. However, building such models in the traditional way often costs a great deal of manpower and time. This study therefore uses a convolutional neural network (CNN) architecture to classify materials in multispectral remote sensing images, in order to simplify the construction of future models. The multispectral remote sensing images in this study include RGB visible-light, near-infrared (NIR), normalized difference vegetation index (NDVI), and digital surface model (DSM) images.
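
NDVI is not captured directly; it is derived from the red and near-infrared bands using the standard definition NDVI = (NIR − Red) / (NIR + Red). A minimal sketch of that computation, assuming the bands are available as arrays (the thesis does not specify its preprocessing code):

```python
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Normalized difference vegetation index: (NIR - Red) / (NIR + Red).

    `eps` guards against division by zero on pixels where both bands are
    zero; the band names and dtype handling here are assumptions.
    """
    nir = nir.astype(np.float32)
    red = red.astype(np.float32)
    return (nir - red) / (nir + red + eps)
```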

This paper proposes RUNet, a model that draws on multiple convolutional neural network architectures, for material classification in multispectral remote sensing images. RUNet is based on an improved U-Net architecture and incorporates the shortcut-connection method of the ResNet model to preserve the features extracted by shallow layers. The architecture is divided into an encoder and a decoder: the encoder contains 10 convolution layers and 4 pooling layers, while the decoder contains 4 upsampling layers, 8 convolution layers, and 1 classification convolution layer. The material classification process consists of training and testing the RUNet model. Because remote sensing images are large, the training process randomly crops fixed-size sub-images from the training set and feeds them into the RUNet model. To account for the spatial information of materials, the testing process crops multiple sub-images from each test image using mirror padding and overlap cropping, RUNet classifies each sub-image, and the sub-image results are finally merged back into the original test image.
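
The thesis does not include code, but the two ideas just described — U-Net's concatenation skips between encoder and decoder, plus ResNet-style shortcut connections inside each convolution block — can be sketched in PyTorch. This is a minimal two-level illustration with assumed channel widths, not the thesis's full 10-convolution/4-pooling design:

```python
import torch
import torch.nn as nn

class ResidualDoubleConv(nn.Module):
    """Two 3x3 conv + BN + ReLU layers with a ResNet-style shortcut.

    A 1x1 convolution projects the input when the channel counts differ,
    so shallow features can be added directly to the block output.
    """
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        self.shortcut = (nn.Identity() if in_ch == out_ch
                         else nn.Conv2d(in_ch, out_ch, 1, bias=False))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + self.shortcut(x))

class TinyRUNet(nn.Module):
    """Two-level residual U-Net sketch (the actual RUNet has four levels)."""
    def __init__(self, in_ch: int, n_classes: int):
        super().__init__()
        self.enc1 = ResidualDoubleConv(in_ch, 64)
        self.pool = nn.MaxPool2d(2)
        self.enc2 = ResidualDoubleConv(64, 128)
        self.up = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec1 = ResidualDoubleConv(128, 64)  # 64 skip + 64 upsampled
        self.head = nn.Conv2d(64, n_classes, 1)  # classification convolution

    def forward(self, x):
        s1 = self.enc1(x)                # shallow, full-resolution features
        s2 = self.enc2(self.pool(s1))    # deeper, downsampled features
        u = self.up(s2)                  # upsample back to s1's resolution
        return self.head(self.dec1(torch.cat([u, s1], dim=1)))
```

The concatenation carries encoder detail across to the decoder, while the additive shortcut inside each block preserves shallow features, which is the stated motivation for combining the two architectures.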

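A sketch of the test-time procedure described above: the image borders are mirror-padded so that border pixels receive full spatial context, overlapping tiles are classified, and only each tile's centre is kept when merging results back. Tile size, overlap, and the `model_fn` interface are illustrative assumptions, not the thesis's settings:

```python
import numpy as np

def predict_tiled(image: np.ndarray, model_fn, tile: int = 256, overlap: int = 32):
    """Label a large (H, W, C) image with a model that maps a
    (tile, tile, C) patch to a (tile, tile) integer label map."""
    h, w, _ = image.shape
    step = tile - 2 * overlap                 # stride between tile centres
    ph = (step - h % step) % step             # extra rows to a whole step
    pw = (step - w % step) % step             # extra cols to a whole step
    # Mirror padding: reflect the borders so edge pixels see real context.
    padded = np.pad(image,
                    ((overlap, overlap + ph), (overlap, overlap + pw), (0, 0)),
                    mode="reflect")
    out = np.zeros(padded.shape[:2], dtype=np.int64)
    # Overlap cropping: slide overlapping tiles, keep each tile's centre.
    for y in range(0, h + ph, step):
        for x in range(0, w + pw, step):
            pred = model_fn(padded[y:y + tile, x:x + tile])
            out[y + overlap:y + overlap + step,
                x + overlap:x + overlap + step] = \
                pred[overlap:overlap + step, overlap:overlap + step]
    return out[overlap:overlap + h, overlap:overlap + w]
```
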
To evaluate the effectiveness of the proposed method, material classification experiments were conducted with the RUNet model on the Inria, Inria-2, and ISPRS remote sensing image datasets; the effects of mirror padding and overlap cropping, as well as the impact of sub-image size on material classification, were also analyzed. In the Inria dataset experiment, after morphological optimization of the RUNet output, the overall IoU reached about 70.82% and the accuracy about 95.66%, outperforming the other methods compared. On the Inria-2 dataset, the overall IoU after optimization was about 75.5% and the accuracy about 95.71%; although the improved FCN achieved better results, the RUNet model required less training time. In the ISPRS dataset experiment, combining multispectral, NDVI, and DSM images reached an overall accuracy of about 89.71%, surpassing the classification results obtained with RGB images alone. NIR and DSM provide additional material feature information, effectively reducing the classification confusion caused by identical color, shape, or texture features in RGB images. The experiments show that the proposed method classifies materials in remote sensing images better than the other methods compared, and it is expected to be applied to model construction for simulation systems, land-use monitoring, and disaster assessment in the future.
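
The IoU and accuracy figures above can be reproduced with standard per-pixel metrics; a minimal sketch, with class count and label-map shapes as assumptions:

```python
import numpy as np

def accuracy_and_mean_iou(pred: np.ndarray, target: np.ndarray, n_classes: int):
    """Overall pixel accuracy and mean intersection-over-union for two
    integer label maps of identical shape."""
    acc = float((pred == target).mean())
    ious = []
    for c in range(n_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                    # ignore classes absent from both maps
            ious.append(inter / union)
    return acc, float(np.mean(ious))
```
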
Abstract (Chinese) i
ABSTRACT iii
Acknowledgments v
Table of Contents vi
List of Tables ix
List of Figures xi
Chapter 1 Introduction 1
1.1 Research Background 1
1.2 Research Motivation 2
1.3 Literature Review 4
1.4 Research Objectives 5
1.5 Thesis Organization 6
Chapter 2 Related Work 7
2.1 Convolutional Neural Network (CNN) 7
2.1.1 Convolution Layer 8
2.1.1.1 Padding 9
2.1.1.2 Rectified Linear Unit (ReLU) 10
2.1.1.3 Batch Normalization 10
2.1.2 Pooling Layer 12
2.1.3 Fully Connected Layer 12
2.1.4 Loss Function 13
2.2 VGG Convolutional Neural Network Model 14
2.3 Deep Residual Network (ResNet) 16
Chapter 3 Proposed Method 17
3.1 Material Classification Models 17
3.1.1 Fully Convolutional Networks (FCN) 19
3.1.1.1 FCN Encoder 20
3.1.1.2 FCN Decoder 21
3.1.2 Improved FCN Model 23
3.1.2.1 Improved FCN Encoder 24
3.1.2.2 Improved FCN Decoder 25
3.1.3 SegNet Model 26
3.1.3.1 SegNet Encoder 27
3.1.3.2 SegNet Decoder 27
3.1.4 U-Net Model 28
3.1.4.1 U-Net Encoder 29
3.1.4.2 U-Net Decoder 29
3.1.5 Improved U-Net Model 30
3.1.5.1 Improved U-Net Encoder 31
3.1.5.2 Improved U-Net Decoder 31
3.1.6 RUNet Model 32
3.1.6.1 Shortcut Connections 33
3.1.6.2 RUNet Encoder 34
3.1.6.3 RUNet Decoder 35
3.2 Model Training 36
3.2.1 Image Preprocessing 36
3.2.2 Loss Function 36
3.3 Model Testing 38
3.3.1 Mirror Padding 40
3.3.2 Overlap Cropping 40
Chapter 4 Experimental Results 43
4.1 Experimental Environment 43
4.2 Image Datasets 43
4.2.1 Inria Remote Sensing Image Dataset 44
4.2.1.1 Building Features 46
4.2.1.2 Image Preprocessing 48
4.2.2 ISPRS Dataset 49
4.2.2.1 Material Class Features 52
4.2.2.2 Image Preprocessing 58
4.3 Evaluation Metrics 60
4.4 Experimental Results and Analysis 61
4.4.1 Model Parameter Settings 61
4.4.2 Performance Analysis on the Inria and Inria-2 Datasets 61
4.4.2.1 Preliminary Performance Evaluation on the Inria Dataset 62
4.4.2.2 Preliminary Performance Evaluation on the Inria-2 Dataset 65
4.4.2.3 Optimization and Performance Analysis of Building Classification 69
4.4.2.4 Performance Analysis of Mirror Padding and Overlap Cropping 72
4.4.3 Performance Analysis on the ISPRS Dataset 74
4.4.3.1 Material Classification Results with RGB Images 74
4.4.3.2 Material Classification Results with Multispectral Images 77
4.4.3.3 Material Classification Results with Multispectral and NDVI Images 81
4.4.3.4 Material Classification Results with DSM Images 85
4.4.3.5 Performance Analysis of Image Size 89
Chapter 5 Conclusions and Future Work 92
References 94