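Graduate student: 廖宜健 (LIAO, YI-CHIEN)
Thesis title: 使用YOLOv3-Reduce深度學習網路實現即時行人偵測系統
Thesis title (English): The Real-Time Pedestrian Detection with YOLOv3-Reduce
Advisor: 張陽郎 (CHANG, YANG-LANG)
Oral defense committee: 余憲政 (YU, XIAN-ZHENG); 林敏青 (LIN, MIN-QING); 張麗娜 (CHANG, LENA); 馬尚智 (MA, SHANG-CHIH); 張陽郎 (CHANG, YANG-LANG)
Oral defense date: 2019-07-17
Degree: Master's
Institution: National Taipei University of Technology (國立臺北科技大學)
Department: Department of Electrical Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2019
Graduation academic year: 107 (ROC calendar, 2018-2019)
Language: Chinese
Number of pages: 62
Keywords (Chinese): 電腦視覺; 行人偵測; Faster R-CNN; SSD; YOLOv2; YOLOv3
Keywords (English): Computer Vision; Pedestrian Detection; Faster R-CNN; SSD; YOLOv2; YOLOv3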
In recent years, as computer vision has grown rapidly, many architectures built on different approaches have emerged in the field of object detection, with deep neural networks as the mainstream direction. This study applies deep learning to pedestrian detection, a sub-task of object detection, and uses the COCO (Common Objects in Context) dataset provided by Microsoft for training and evaluation. The method modifies the YOLOv3 (You Only Look Once version 3) architecture and is tested and evaluated on pedestrian images from different scenes and on pedestrian test videos with different frame rates. Compared with other deep learning methods such as SSD (Single Shot MultiBox Detector), detection speed increases fivefold and average precision rises from 54% to 66%. The resulting architecture is named YOLOv3-Reduce: it first removes 20 convolutional and shortcut layers from the network and then halves the number of convolution kernels to gain speed, which raises detection speed by about 80% on 720p video. In this way, a mainstream deep learning network in the object detection field is optimized for both accuracy and speed in pedestrian detection.
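The effect of halving the number of convolution kernels can be illustrated with a rough operation count. The sketch below is not taken from the thesis: the function name `conv_bflops` and the layer sizes are hypothetical, and the formula is the common approximation of two floating-point operations (one multiply, one add) per weight per output position.

```python
def conv_bflops(kernel, c_in, c_out, h_out, w_out):
    """Approximate cost of one convolution layer in billions of FLOPs.

    Assumes one multiply and one add per weight per output position;
    bias, batch normalization, and activation costs are ignored.
    """
    return 2.0 * kernel * kernel * c_in * c_out * h_out * w_out / 1e9


# Hypothetical interior 3x3 layer: halving the kernel count in every layer
# halves both the input and the output channels of this layer.
interior_full = conv_bflops(3, 256, 512, 52, 52)
interior_half = conv_bflops(3, 128, 256, 52, 52)

# Hypothetical first layer: the 3 input channels (RGB) stay fixed,
# so only the output channels are halved.
first_full = conv_bflops(3, 3, 32, 416, 416)
first_half = conv_bflops(3, 3, 16, 416, 416)

print(f"interior layer reduction: {interior_full / interior_half:.1f}x")
print(f"first layer reduction:    {first_full / first_half:.1f}x")
```

Under these assumptions an interior layer needs about a quarter of the operations and the first layer about half. Measured frame rates usually improve less than the operation count suggests, since memory traffic and fixed per-frame overhead do not shrink in the same proportion, so this count should not be read as a prediction of the speed figures reported in the abstract.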
Abstract
English Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1
1.1 Overview of object detection methods
1.2 Research motivation and objectives
1.3 Thesis outline
Chapter 2 Literature Review
2.1 Convolutional neural networks
2.1.1 Convolution layer
2.1.2 Pooling layer
2.1.3 Fully connected layer
2.2 Branches of deep learning architectures for object detection
2.2.1 R-CNN and Fast R-CNN
2.2.2 Faster R-CNN
2.2.3 SSD: Single Shot MultiBox Detector
2.2.4 YOLO
2.2.4.1 YOLOv1
2.2.4.2 YOLOv2
2.2.4.3 YOLOv3
Chapter 3 Research Method
3.1 COCO dataset
3.2 Data preprocessing
3.2.1 Class extraction and annotation format conversion
3.2.2 Dataset split
3.3 Detection method
3.4 Model evaluation
3.4.1 Accuracy metrics
3.4.2 Speed metrics
Chapter 4 Experimental Results
4.1 Comparison of object detection models
4.2 Inputs at different resolutions and FPS, and BFLOPs
Chapter 5 Conclusion and Future Work
5.1 Conclusion
5.2 Future work
References

Electronic full text (public Internet release date: 2024-07-26)