National Digital Library of Theses and Dissertations in Taiwan

Student: 林民麒 (Min-Chi Lin)
Title: 應用於智慧型自走載具之基於YOLO深度學習網路的人群人臉方向辨識技術設計與實現
Title (English): Design and Implementation of YOLO Deep Learning Network Based Crowd Facial Direction Detection Technology for Intelligent Self-propelled Vehicles
Advisor: 范志鵬 (Chih-Peng Fan)
Committee members: 黃穎聰 (Yin-Tsung Hwang), 賴信志 (Shin-Chi Lai)
Oral defense date: 2019-07-18
Degree: Master
University: National Chung Hsing University (國立中興大學)
Department: Department of Electrical Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Publication year: 2019
Graduation academic year: 107
Language: Chinese
Number of pages: 58
Keywords (Chinese): 深度學習、物件偵測、YOLO、人臉方向辨識
Keywords (English): deep learning; object detection; YOLO; face direction identification
Usage statistics:
  • Cited by: 4
  • Views: 884
  • Downloads: 67
  • Saved to bookshelf: 0
With the rise of the AI boom, training deep learning models to detect and classify objects has become increasingly common, and advances in hardware have enabled ever closer integration of AI with embedded platforms; intelligent self-propelled vehicles are one popular application. The proposed system can run on an intelligent self-propelled vehicle, analyzing the camera's live images to detect pedestrians ahead and identify the direction their faces are turned.

To realize a deep learning model for face direction identification, this thesis selects pedestrian images from three public databases and annotates the faces; to strengthen the training of mirror-image classes, the images are also flipped horizontally to enlarge the dataset. For the model architecture, the darknet deep learning framework is adopted, with YOLOv2 and tiny YOLOv3 serving as base architectures that are extended and compared. The model takes three-channel RGB images as input, extracts features through a stack of convolution and pooling layers, and then computes bounding-box and class predictions from the resulting feature maps.
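The flip-based augmentation described above can be sketched as follows. This is a minimal illustration, not the thesis code: it assumes YOLO-style label lines (`class x_center y_center width height`, normalized to [0, 1]) and hypothetical class ids for the two mirror-image face directions (here 1 = "face left", 2 = "face right"); the actual ids used in the thesis may differ.

```python
# Hypothetical mapping of mirror-image class ids: when a picture is
# flipped horizontally, a left-facing face becomes a right-facing one.
MIRROR_SWAP = {1: 2, 2: 1}

def flip_annotation(line: str) -> str:
    """Flip one YOLO-format label line horizontally.

    The x-center is mirrored about the image midline (x -> 1 - x),
    and mirror-image classes are swapped so the label still matches
    the flipped picture; y, width, and height are unchanged.
    """
    cls, x, y, w, h = line.split()
    cls = str(MIRROR_SWAP.get(int(cls), int(cls)))
    x = f"{1.0 - float(x):.6f}"
    return " ".join([cls, x, y, w, h])

# A right-facing face on the left side of the image becomes a
# left-facing face on the right side after flipping.
print(flip_annotation("2 0.250000 0.500000 0.100000 0.200000"))
```

Swapping the class id together with the geometry is the point of the exercise: without it, every flipped sample would teach the network the wrong direction for the mirror-image classes.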

The experimental results show that replacing the class softmax with a logistic classifier successfully reduces misclassification between mirror-image classes. Under identical parameter settings, YOLOv2_logistic achieves the highest recall, precision, and mAP on the webcam test set, at 85%, 81%, and 86.28% respectively, all above 80%; tiny YOLOv3 (two scale) comes second, with recall, precision, and mAP of 81%, 80%, and 78.83%, a gap attributable to YOLOv2_logistic having more convolution layers than tiny YOLOv3. The larger layer count, however, costs speed: in the fps tests the tiny YOLOv3 variants perform best, reaching 30 fps on Xavier and over 20 fps on TX2, whereas the YOLOv2 variants reach only 20 fps on Xavier and only about 7.5 fps on TX2. As for multi-scale detection, detecting at more scales makes the loss value converge faster.
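The intuition behind the softmax-to-logistic change can be sketched numerically. This is an illustration of the general technique, not the thesis code, and the three-class logit values are made up: a softmax forces all class scores to compete and sum to 1, so two visually similar mirror-image classes split the probability mass between them, while independent logistic (sigmoid) outputs score each class on its own.

```python
import math

def softmax(logits):
    # Mutually exclusive class probabilities: they always sum to 1.
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    # Independent per-class score in (0, 1); classes do not compete.
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical logits for classes [front, left, right], where the
# detector is fairly sure the face is sideways but unsure which side.
logits = [0.0, 2.0, 1.8]

print(softmax(logits))                # left/right split the probability mass
print([sigmoid(z) for z in logits])   # both sideways classes score high independently
```

With independent sigmoids, both mirror-image classes can receive a high score, so a small logit difference no longer suppresses the runner-up class the way a softmax does.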

The best-performing architecture overall is tiny YOLOv3 (two scale): although its accuracy figures are slightly below those of YOLOv2_logistic, its fps results are far ahead, making it very well suited to embedded platforms while still maintaining high accuracy, with strong potential for future deployment on intelligent self-propelled vehicles.
The rise of artificial intelligence has made it increasingly common to use trained deep learning models to detect and classify objects, and improvements in hardware have brought ever closer integration of AI with embedded platforms; intelligent self-propelled vehicles are one of the popular applications. Our system can be applied to intelligent self-propelled vehicles to analyze the images captured by the camera in real time, identifying the pedestrians in front and their facial directions.
To realize the deep learning model for face direction identification, this thesis selects pictures of pedestrians from three public databases and labels the faces; to enhance the training of mirror-image categories, the pictures are also flipped horizontally to expand the database. For the model architecture, the darknet deep learning framework is adopted, with YOLOv2 and tiny YOLOv3 used as base architectures for extension and comparison. The model takes a three-channel RGB image as input, extracts features through a large number of convolution and pooling layers, and then computes bounding-box and class predictions from the feature maps.
The experimental results show that after the class softmax is replaced by a logistic classifier, misjudgment between mirror-image classes is successfully reduced. Under the same parameter settings, YOLOv2_logistic has the highest recall, precision, and mAP on the webcam test set, at 85%, 81%, and 86.28% respectively, all exceeding 80%; second is tiny YOLOv3 (two scale), with recall, precision, and mAP of 81%, 80%, and 78.83%, the gap being caused by YOLOv2_logistic having more convolution layers than tiny YOLOv3. Because of that larger number of convolutional layers, however, the tiny YOLOv3 variants perform best in the fps tests, reaching 30 fps on Xavier and more than 20 fps on TX2, while the YOLOv2 variants reach only 20 fps on Xavier and only about 7.5 fps on TX2. In addition, for multi-scale detection, the more scales are used, the faster the loss value converges.
The best overall model architecture is tiny YOLOv3 (two scale). Although its results are slightly lower than those of YOLOv2_logistic, its fps results are far ahead, making it very suitable for embedded platforms while still maintaining high accuracy. This system has a great opportunity to be applied to smart self-propelled vehicles in the future.
Acknowledgments
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
1. Introduction
1.1. Research Motivation and Objectives
1.2. Challenges of Face Direction Identification
1.3. Thesis Organization
2. Literature Review
2.1. Color Spaces
2.2. Image Filtering and Feature Spaces
2.3. Deep Learning Models for Object Detection
2.4. Region-Based Convolutional Neural Networks (R-CNN)
3. Preliminaries
3.1. Artificial Intelligence (AI)
3.2. Machine Learning
3.3. Deep Learning
3.4. Neural Networks
3.5. Backpropagation (BP)
3.6. Convolutional Neural Networks
3.7. Convolution Layer
3.8. Zero Padding
3.9. Pooling Layer
3.10. Activation Functions
3.11. Fully Connected Layer
3.12. The YOLO (You Only Look Once) Object Detection Architecture
3.13. The YOLOv2 Architecture
3.14. Batch Normalization (BN) [12]
3.15. Anchor Boxes [12]
3.16. Multi-Scale Training [12]
3.17. The tiny YOLOv3 Architecture
4. Algorithm Implementation Flow
4.1. Data Collection and Preprocessing
4.1.1. Datasets
4.1.2. Definition of Face Directions
4.1.3. Labeling
4.1.4. Annotation Preprocessing
4.1.5. Data Augmentation: Horizontal Flipping
4.1.6. Dataset Splitting
4.1.7. Webcam Picture Dataset for Testing
4.2. Building the Model Configuration Files
4.3. Model Training Flow
4.4. FD Model Algorithm Flow
4.5. Effects of Layer Count, Architecture, and Class Classifier on the Model
4.6. Model Performance Evaluation
4.6.1. Recall and Precision
4.6.2. mAP (Mean Average Precision)
5. Experimental Results and Analysis
5.1. Test Set Analysis
5.2. YOLOv2 Model Test Results
5.3. tiny YOLOv3 Model Test Results and Comparison
5.4. tiny YOLOv3 Multi-Scale Detection Test Results and Comparison
5.5. YOLOv2_logistic Test Results and Comparison
5.6. fps Comparison of All Models
6. Conclusions and Future Work
6.1. Conclusions
6.2. Future Work
References
[1] "The relationship among AI, ML, and deep learning" [Online]. Available: https://nccuclement.wordpress.com/2017/07/11/ai-ml-deep-learning/
[2] "Machine learning: neural networks (multilayer perceptron, MLP), with a detailed derivation of backpropagation" [Online]. Available: https://reurl.cc/plLDe
[3] "[Data Analysis & Machine Learning] Lecture 5.1: Introduction to Convolutional Neural Networks" [Online]. Available: https://reurl.cc/q7kdR
[4] "Plots of the Sigmoid, Tanh, and ReLU functions" [Online]. Available: https://www.julyedu.com/question/big/kp_id/26/ques_id/1044
[5] "Everything you should know about deep learning models for image recognition" [Online]. Available: https://medium.com/@syshen/%E7%89%A9%E9%AB%94%E5%81%B5%E6%B8%AC-object-detection-740096ec4540
[6] "Self-propelled warehouse vehicles" [Online]. Available: http://www.digitimes.com.tw/iot/article.asp?cat=158&cat1=20&cat2=&id=0000524789_F0Q8OAXR7J5IWRLUJWTOA&social_share=y
[7] "HSL and HSV color spaces" [Online]. Available: https://zh.wikipedia.org/wiki/HSL%E5%92%8CHSV%E8%89%B2%E5%BD%A9%E7%A9%BA%E9%97%B4
[8] "Histogram of oriented gradients" [Online]. Available: https://zh.wikipedia.org/wiki/%E6%96%B9%E5%90%91%E6%A2%AF%E5%BA%A6%E7%9B%B4%E6%96%B9%E5%9B%BE
[9] "Feature Pyramid Network (FPN)" [Online]. Available: http://wiszhipu.com/?id=113
[10] "Multi-scale detection" [Online]. Available: https://bigdatafinance.tw/index.php/tech/coding/639-pytorch-yolo-v3
[11] "ROC curves and PR curves" [Online]. Available: https://www.cnblogs.com/Allen-rg/p/5821949.html
[12] Redmon, J., & Farhadi, A. (2017). YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 7263-7271).
[13] Dalal, N. (2006). Finding people in images and videos (Doctoral dissertation, Institut National Polytechnique de Grenoble-INPG).
[14] Okubo, J., Sugandi, B., Kim, H., Tan, J. K., & Ishikawa, S. (2008, October). Face direction estimation based on eigenspace technique. In 2008 International Conference on Control, Automation and Systems (pp. 1264-1267). IEEE.
[15] Wang, L., Shi, J., Song, G., & Shen, I. (2007). Object detection combining recognition and segmentation. In Asian Conference on Computer Vision (pp. 189-199). Springer, Berlin, Heidelberg.
[16] Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05) (pp. 886-893). IEEE Computer Society.
[17] Redmon, J., & Farhadi, A. (2018). YOLOv3: An incremental improvement. arXiv preprint arXiv:1804.02767.
[18] Uijlings, J. R. R., van de Sande, K. E. A., Gevers, T., & Smeulders, A. W. M. (2013). Selective search for object recognition. International Journal of Computer Vision, 104(2), 154-171.
[19] Abe, S., Morimoto, M., & Fujii, K. (2010). Estimating face direction from wide-view surveillance camera. In 2010 World Automation Congress (pp. 1-6). IEEE.
[20] "The PASCAL Visual Object Classes Homepage" [Online]. Available: http://host.robots.ox.ac.uk/pascal/VOC/
[21] "INRIA Person Dataset" [Online]. Available: http://pascal.inrialpes.fr/data/human/
[22] "Penn-Fudan Database for Pedestrian Detection and Segmentation" [Online]. Available: https://www.cis.upenn.edu/~jshi/ped_html/
[23] "GitHub: darknet" [Online]. Available: https://github.com/pjreddie/darknet
[24] "Feature visualization" [Online]. Available: https://github.com/jing-vision/lightnet/tree/master/feature-viz
[25] "LabelImg" [Online]. Available: https://github.com/tzutalin/labelImg