Author: Chien-Hung Chen (陳建宏)
Title: Stingray Detection and Recognition of Aerial Videos with Region-based Convolution Neural Network
Advisor: Keng-Hao Liu (劉耿豪)
Degree: Master's
Institution: National Sun Yat-sen University
Department: Department of Mechanical and Electro-Mechanical Engineering
Discipline: Engineering
Field of Study: Mechanical Engineering
Thesis Type: Academic thesis
Year of Publication: 2017
Academic Year of Graduation: 106
Language: Chinese
Pages: 103
Keywords: Deep Learning, Object Detection, Convolution Neural Network, Aerial Imaging, Machine Vision
Citations: 5
Views: 287
Downloads: 75
Bookmarks: 0
In recent years, image processing has advanced dramatically with the rise of deep learning, and many problems that were once intractable now have promising solutions. In ecological research, ecologists often use cameras to collect image or video data, which are then interpreted and tallied manually for analysis. Some of this interpretation is extremely time-consuming for humans, and as the volume of collected data grows, research progress is often slowed. In our case, ecological researchers use unmanned aerial vehicles to capture remote-sensing aerial footage along the coast of the Dongsha Islands in order to conduct a statistical survey of stingrays. Traditionally, once data collection ends, a large amount of human effort is spent watching the videos, locating stingrays by eye, and tallying their sizes and numbers. The researchers hope to replace this manual labor with automated computer interpretation, but because the shape and color of stingrays closely resemble the background, generic object detection techniques cannot capture them effectively. This thesis attempts to develop a deep-learning-based method for automatic stingray detection. The method uses a region-based convolutional neural network as the base single-image object detection model, and adds post-processing that exploits the predictability of object motion trajectories and the consistency of object locations along the time axis, integrating these into a method suited to detecting moving objects in video. The goal of this study is to develop software that automatically detects stingrays in aerial video, in the hope of saving ecological researchers the time they spend organizing data.
In recent years, image processing technology has made major breakthroughs thanks to the emergence of deep learning. Many problems that were difficult to handle in the past can now be resolved with deep learning methods. Image processing has been used as an assistive tool in many research fields. In ecological research, researchers usually use photography equipment to collect image or video data, and then process them for further analysis. Processing some types of data manually is tedious and time-consuming, and as the data grow, research progress can slow down. In this thesis, the situation we face is that biology researchers use an unmanned aerial vehicle (UAV) to record aerial video along the coast of the Dongsha Islands, and they need to locate the stingrays in those videos and count them. Since traditional detection methods have difficulty detecting stingrays, we attempt to develop a deep-learning-based method for automatic stingray detection. It uses a region-based convolutional neural network (CNN) as the basic model for frame-wise detection. To increase its capability, temporal information, such as the predictability of moving trajectories and the consistency of object locations along the time axis, is further integrated into the model. The goal of this study is to develop a system that can automatically handle the detection task for aerial videos. We hope this achievement helps ecological researchers save time in processing video data.
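As a rough illustration of the location-consistency idea described in the abstract, the sketch below re-checks each frame's detections against its neighboring frames and drops boxes that appear in only a single frame. This is a minimal sketch under assumed conventions, not the thesis's actual implementation; the function names and the IoU threshold are hypothetical.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def temporal_filter(frames, iou_thr=0.3, min_support=1):
    """Keep a detection only if it overlaps (IoU >= iou_thr) a detection
    in at least `min_support` of the adjacent frames.

    `frames` is a list of per-frame detections; each frame is a list of
    (x1, y1, x2, y2) boxes, e.g. the output of a frame-wise detector.
    """
    kept = []
    for t, boxes in enumerate(frames):
        # Gather the previous and next frames, where they exist.
        neighbors = []
        if t > 0:
            neighbors.append(frames[t - 1])
        if t + 1 < len(frames):
            neighbors.append(frames[t + 1])
        frame_kept = []
        for box in boxes:
            # Count how many neighboring frames contain an overlapping box.
            support = sum(
                any(iou(box, nb) >= iou_thr for nb in nbs)
                for nbs in neighbors
            )
            if support >= min(min_support, len(neighbors)):
                frame_kept.append(box)
        kept.append(frame_kept)
    return kept
```

A spurious detection that flashes up in a single frame has no overlapping box in either neighbor, so it is removed, while a stingray that drifts slowly across frames keeps overlapping its own previous position and survives; the trajectory-prediction component described in Chapter 4 would go further and interpolate boxes the detector missed.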
Thesis Approval Letter ⅰ
Abstract (Chinese) ⅱ
ABSTRACT ⅲ
Table of Contents ⅳ
List of Figures ⅵ
List of Tables ⅷ
Chapter 1 Introduction 1
1.1 Research Background 1
1.2 Research Motivation 2
1.3 Thesis Organization 3
Chapter 2 Related Work and Literature Review 4
2.1 Rule-based Methods 4
2.2 Machine Learning Methods 5
2.3 Deep Learning Methods 7
2.4 Review of Deep Learning Literature 8
Chapter 3 Object Detection with Region-based Convolutional Networks 12
3.1 Artificial Neural Networks 12
3.1.1 Introduction to Neural Networks 12
3.1.2 Neural Network Operation 14
3.1.3 Error Backpropagation Algorithm 15
3.1.4 Learning Rate and Momentum 20
3.1.5 Mini-batch Gradient Descent 21
3.2 Region-based Convolutional Neural Network: Faster R-CNN 22
3.2.1 Network Layers 23
3.2.2 Multi-layer Convolutional Network 27
3.2.3 Region Proposal Network 31
3.2.4 Region Recognition Network 35
Chapter 4 Stingray Detection in Video Sequences 38
4.1 Video Detection with Faster R-CNN 38
4.2 Faster R-CNN with Temporal Information for Video Detection 40
4.2.1 Moving Trajectory Information 40
4.2.2 Detection Consistency Across Adjacent Frames 44
4.3 Object Detection Evaluation Methods 45
Chapter 5 Experimental Data and Results 48
5.1 Stingray Image Data 48
5.1.1 Experimental Data 48
5.1.2 Data Augmentation 51
5.2 Experimental Settings 52
5.2.1 Computer Hardware and Operating Platform 52
5.2.2 Parameter Settings 52
5.3 Baseline Methods 53
5.3.1 Selective Search 53
5.3.2 Histogram of Oriented Gradients 54
5.3.3 Support Vector Machine 54
5.4 Network Model Training Results 57
5.5 Experimental Results and Analysis 59
Chapter 6 Conclusions and Future Work 92
References 93
[1] J. Canny, “A Computational Approach to Edge Detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-8, no. 6, pp. 679–698, Nov. 1986.
[2] R. O. Duda and P. E. Hart, “Use of the Hough Transformation to Detect Lines and Curves in Pictures,” Commun. ACM, vol. 15, no. 1, pp. 11–15, Jan. 1972.
[3] N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection,” in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), 2005, vol. 1, pp. 886–893.
[4] D. G. Lowe, “Object recognition from local scale-invariant features,” in Proceedings of the Seventh IEEE International Conference on Computer Vision, 1999, vol. 2, pp. 1150–1157.
[5] “Robust real-time face recognition,” in 2013 Africon, 2013, pp. 1–5.
[6] C. H. Lampert, M. B. Blaschko, and T. Hofmann, “Beyond sliding windows: Object localization by efficient subwindow search,” in 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008, pp. 1–8.
[7] J. R. R. Uijlings, K. E. A. van de Sande, T. Gevers, and A. W. M. Smeulders, “Selective Search for Object Recognition,” International Journal of Computer Vision, vol. 104, 2013.
[8] P. F. Felzenszwalb and D. P. Huttenlocher, “Efficient Graph-Based Image Segmentation,” Int. J. Comput. Vision, vol. 59, no. 2, pp. 167–181, Sep. 2004.
[9] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” in Advances in Neural Information Processing Systems 25, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, Eds. Curran Associates, Inc., 2012, pp. 1097–1105.
[10] Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, Nov. 1998.
[11] J. Deng, W. Dong, R. Socher, L. J. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 248–255.
[12] M. D. Zeiler and R. Fergus, “Visualizing and Understanding Convolutional Networks,” arXiv:1311.2901 [cs], Nov. 2013.
[13] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Region-Based Convolutional Networks for Accurate Object Detection and Segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142–158, Jan. 2016.
[14] K. Simonyan and A. Zisserman, “Very Deep Convolutional Networks for Large-Scale Image Recognition,” arXiv:1409.1556 [cs], Sep. 2014.
[15] R. Girshick, “Fast R-CNN,” in 2015 IEEE International Conference on Computer Vision (ICCV), 2015, pp. 1440–1448.
[16] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PP, no. 99, pp. 1–1, 2016.
[17] W. Han et al., “Seq-NMS for Video Object Detection,” arXiv:1602.08465 [cs], Feb. 2016.
[18] K. Kang, W. Ouyang, H. Li, and X. Wang, “Object Detection from Video Tubelets with Convolutional Neural Networks,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 817–825.
[19] L. Wang, W. Ouyang, X. Wang, and H. Lu, “Visual Tracking with Fully Convolutional Networks,” in 2015 IEEE International Conference on Computer Vision (ICCV), 2015, pp. 3119–3127.
[20] S. Haykin, Neural Networks: A Comprehensive Foundation (3rd Edition). Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 2007.
[21] http://book.paddlepaddle.org/02.recognize_digits/
[22] M. Everingham, L. V. Gool, C. K. I. Williams, J. Winn, and A. Zisserman, The PASCAL Visual Object Classes (VOC) Challenge. 2009.
[23] Y. Jia et al., “Caffe: Convolutional Architecture for Fast Feature Embedding,” in Proceedings of the 22nd ACM International Conference on Multimedia, New York, NY, USA, 2014, pp. 675–678.
[24] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin, “LIBLINEAR: A Library for Large Linear Classification,” J. Mach. Learn. Res., vol. 9, pp. 1871–1874, Jun. 2008.