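Author: 蕭寧諄
Author (English): Ning-Chun Hsiao
Thesis title: 階層卷積神經網路的人臉偵測與辨識
Thesis title (English): Face detection and recognition based on a cascaded convolutional neural network
Advisor: 曾定章
Advisor (English): Din-Chang Tseng
Degree: Master's
Institution: National Central University (國立中央大學)
Department: Department of Computer Science and Information Engineering (資訊工程學系)
Discipline: Engineering
Field: Electrical and computer engineering
Thesis type: Academic thesis
Year of publication: 2018
Academic year of graduation: 106 (ROC calendar)
Language: Chinese
Number of pages: 71
Keywords: deep learning; convolutional neural network; face recognition; face detection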
In recent years, advances in convolutional neural networks (CNNs) have driven great progress in face detection and face recognition, and many novel network architectures have been proposed for different detection and recognition problems. The appropriate architecture depends on the application. At a customs checkpoint, for example, the system only needs to recognize the single face in the image. A surveillance or access-control system, in contrast, usually has to detect the faces in a large frame first and then recognize each of them.
We propose a CNN architecture that combines face detection and face recognition. For detection, an RPN structure similar to the one in Faster R-CNN proposes candidate regions that may contain faces, and a coarse-to-fine cascaded CNN then checks each candidate region and filters out the regions that are not faces. Using the RPN in place of the original sliding-window proposal method avoids testing every position and every size one by one. With the RPN, detection on a 1920x1080 image takes only 0.08 seconds, compared with 0.18 seconds for the sliding-window method, while detection performance remains nearly the same.
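The cost argument for replacing the sliding window, and the coarse-to-fine filtering idea, can be illustrated with a minimal sketch. This is not the thesis implementation: the window sizes, the stride, and the two toy scoring functions below are hypothetical stand-ins for the CNN stages, chosen only to show how many windows an exhaustive scan of a 1920x1080 image generates and how a cascade lets cheap stages reject candidates before costlier ones run.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)
Stage = Tuple[Callable[[Box], float], float]  # (scoring function, threshold)


def count_sliding_windows(img_w: int, img_h: int,
                          sizes: Sequence[int], stride: int) -> int:
    """Count the square windows an exhaustive sliding-window scan visits."""
    total = 0
    for s in sizes:
        if s > img_w or s > img_h:
            continue  # this window size does not fit in the image
        nx = (img_w - s) // stride + 1
        ny = (img_h - s) // stride + 1
        total += nx * ny
    return total


def cascade_filter(candidates: Sequence[Box],
                   stages: Sequence[Stage]) -> List[Box]:
    """Apply stages in order; a box must pass every stage to survive."""
    survivors = list(candidates)
    for score_fn, threshold in stages:
        survivors = [box for box in survivors if score_fn(box) >= threshold]
    return survivors


# Hypothetical scan settings: three window sizes, stride of 8 pixels.
n_windows = count_sliding_windows(1920, 1080, sizes=(48, 96, 192), stride=8)
# n_windows == 83250


# Toy stand-ins for the coarse and fine CNN stages (assumptions, not real models).
def coarse_score(box: Box) -> float:
    return float(min(box[2], box[3]))  # reject very small regions


def fine_score(box: Box) -> float:
    return min(box[2], box[3]) / max(box[2], box[3])  # prefer near-square boxes


toy_stages = [(coarse_score, 20.0), (fine_score, 0.5)]
toy_candidates = [(0, 0, 10, 10), (0, 0, 100, 30), (5, 5, 60, 50)]
kept = cascade_filter(toy_candidates, toy_stages)
```

Under these assumed settings the exhaustive scan produces tens of thousands of windows per frame, which is the work a proposal network avoids by emitting only a limited set of likely regions for the cascade to verify.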
After face detection, we use FaceNet to extract features for recognition. Because of the way the loss function is defined, the distance between the feature vectors extracted from two facial images directly reflects the similarity of the two faces. We can therefore recognize faces simply by computing the distance between feature vectors, without any complex classifier, so the system does not need to retrain its network parameters even when the recognition targets change. The proposed method reaches a recognition accuracy of 97%, slightly lower than that of methods that must be retrained. Considering the convenience of not retraining, we believe the benefit clearly outweighs the loss in accuracy.
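Distance-based recognition of this kind can be sketched as a nearest-neighbor lookup over stored feature vectors. The sketch below is illustrative only: the three-dimensional vectors, the names, and the threshold of 0.5 are made-up assumptions (real FaceNet embeddings are much longer and come from the trained network), and the thesis may classify differently in detail.

```python
import math
from typing import Dict, Optional, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(query: Sequence[float],
             gallery: Dict[str, Sequence[float]],
             threshold: float) -> Optional[str]:
    """Return the closest enrolled identity, or None if nothing is close enough."""
    best_name: Optional[str] = None
    best_dist = float("inf")
    for name, embedding in gallery.items():
        dist = euclidean(query, embedding)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= threshold else None


# Hypothetical gallery of enrolled embeddings (toy 3-D vectors).
gallery = {
    "alice": [0.0, 1.0, 0.0],
    "bob": [1.0, 0.0, 0.0],
}

# Enrolling a new person only adds a vector; no network weights change.
gallery["carol"] = [0.0, 0.0, 1.0]

match = identify([0.1, 0.9, 0.0], gallery, threshold=0.5)      # closest to "alice"
unknown = identify([0.6, 0.6, 0.6], gallery, threshold=0.5)    # too far from everyone
```

Changing the set of people to recognize amounts to editing the gallery dictionary, which is why no retraining is needed; the threshold controls how readily an unfamiliar face is rejected as unknown.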
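Table of Contents

Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation
1.2 System architecture
1.3 Thesis organization
Chapter 2 Related Work
2.1 Face detection
2.2 Face recognition
Chapter 3 Face Detection
3.1 Proposing candidate regions
3.2 Verifying whether candidate regions are faces
Chapter 4 Face Recognition
4.1 Feature extraction
4.2 Face classification
Chapter 5 Experiments
5.1 Experimental equipment and environment
5.2 Demonstration of experimental results
5.3 Face detection experiments and results
5.4 Face recognition experiments and results
Chapter 6 Conclusions
References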