National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

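Author: Hua-Yi Tzeng (曾華逸)
Title: Implementation of face detection algorithm with parallel extended-MMX instruction set
Title (Chinese): 應用並行MMX延伸指令集實現人臉辨識演算法
Advisor: Jih-Ching Chiu (邱日清)
Degree: Master's
Institution: National Sun Yat-sen University (國立中山大學)
Department: Department of Electrical Engineering (graduate program)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2008
Graduation academic year: 96 (ROC calendar)
Language: Chinese
Pages: 103
Keywords (translated from Chinese): multi-data streaming; multimedia instruction set; face recognition; parallel processing
Keywords (English): multi-data streaming; face detection; extended-MMX instruction set; MMX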
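Face recognition is now widely used, for example in access control, video surveillance, network applications, and student attendance systems. Considering recognition accuracy and the regular arrangement of the data, this thesis adopts "Recognition algorithms using neural network" as the algorithm to implement. The implementation has three parts: the Modified Census Transform (MCT) of the image, computing the hypotheses, and marking faces with bounding boxes. In the first part, the MCT applies a fixed computation to contiguously arranged source data, so it suits SIMD execution; in the other parts the data are irregularly scattered, which makes the corresponding operations hard to parallelize. This thesis implements the image MCT on the multi-data-stream SIMD processor developed in our laboratory. Using its multi-stream capability, the image is divided into four parts that are processed simultaneously (Isolation mode), and the operating mode is switched to match the needs of the algorithm, which gives smooth data access and reduces the burden of data movement. New instructions are added: one transposes a 4×4 matrix of 16-bit data held in four MMX registers, and another performs signed or unsigned extension while the data are being loaded. Both accelerate parallel multi-stream processing. Because the multiple data streams are not always contiguous, a Striping mechanism fetches source data at fixed intervals so that multiple parts can be executed at once. For identity verification, the hypotheses value of the sought target is compared with that of the candidate; if the two differ by no more than 3 per thousand (0.3%), they are treated as the same person. Finally, the implementation was run and verified as correct on the UMVP-2500 development platform. In the performance analysis, comparison with MMX and XScale quantifies the gain from the multi-data-stream SIMD architecture: performance improves by 373% over MMX and by 345% over XScale, showing that the architecture can substantially improve parallel-computation performance compared with other current architectures. As for standardizing fixed computation patterns, the multi-data-stream SIMD architecture and the new matrix-transpose instruction change the row and column positions of the data, which leads to a new way of viewing the computation: from a straight line to a plane, a concept found in neither MMX nor XScale.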
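To make the MCT step concrete, here is a minimal Python sketch. It follows the commonly cited definition from Fröba and Ernst's "Face Detection with the Modified Census Transform" (compare each pixel of a 3×3 window with the window mean and pack the nine results into a 9-bit index), which is an assumption about what the thesis implements: the thesis outline mentions an 8-point average, and this record does not define it. The function names and the choice to split the image into four horizontal bands are also hypothetical; the record does not say how the four parts are formed in Isolation mode.

```python
def mct_index(img, r, c):
    """9-bit MCT index for the 3x3 window centred on (r, c).

    Assumption: Froba-Ernst style MCT (9-pixel mean), not necessarily
    the exact variant used in the thesis.
    """
    window = [img[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
    total = sum(window)
    index = 0
    for pixel in window:
        # pixel > mean  <=>  9 * pixel > total (avoids fractional division)
        index = (index << 1) | (1 if 9 * pixel > total else 0)
    return index


def mct_transform(img):
    """MCT index for every interior pixel of a 2-D list of intensities."""
    rows, cols = len(img), len(img[0])
    return [[mct_index(img, r, c) for c in range(1, cols - 1)]
            for r in range(1, rows - 1)]


def mct_in_four_bands(img):
    """Transform four horizontal bands independently, then concatenate.

    Hypothetical stand-in for processing four image parts at once:
    each band carries one extra row above and below so its windows
    are complete, which makes the bands independent of each other.
    """
    out_rows = len(img) - 2
    bounds = [out_rows * k // 4 for k in range(5)]
    result = []
    for start, stop in zip(bounds, bounds[1:]):
        if stop > start:
            # Output rows start..stop-1 need input rows start..stop+1.
            result.extend(mct_transform(img[start:stop + 2]))
    return result
```

Because every band depends only on its own rows plus the two border rows, the four calls could run in parallel and still reproduce the result of `mct_transform` on the whole image.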
Face detection has many technical applications. Considering the accuracy of face detection and the regular arrangement of its data, we select "Recognition algorithms using neural network" for implementation. The implementation can be divided into three parts: the Modified Census Transform (MCT), computing the hypotheses, and marking faces with square frames. The MCT is a regular computation on regularly arranged data, so it is well suited to SIMD execution; the other parts work on irregularly arranged data and are not easy to execute in parallel. This thesis uses the SIMD processor architecture developed in our laboratory, together with its multi-data streaming property, to implement the MCT. The picture is divided into four parts that execute at the same time, and the operating mode is changed to suit each algorithm, so that data fetching is smooth and data movement is reduced. One new instruction uses the 16-bit data format and four MMX registers to transpose a 4×4 matrix. Another loads data and applies signed or unsigned extension at the same time. Both accelerate parallel execution in multi-data streaming. We also support data streams that are not contiguous: a striping mode fetches data items separated by a fixed distance, so that multiple streams can be computed together. In addition, we use the hypotheses to pick out the one person we are looking for. We compare the hypotheses of two pictures; if the difference is smaller than 0.3%, the two pictures are taken to show the same person. Finally, we verify that the implementation functions correctly on the UMVP-2500 platform. We compare performance with MMX and XScale to analyze the benefits of the multi-data streaming SIMD architecture: the speedup is 373% over MMX and 345% over XScale.
These results show that the multi-data streaming SIMD architecture achieves a speedup over other SIMD architectures. The architecture also adds a new instruction for 4×4 matrix transposition. Because the transpose exchanges rows and columns, it yields a new abstraction: ordinary computation proceeds along a line, whereas the new abstraction works over a plane. Neither MMX nor XScale has this abstraction.
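The effect of a 4×4 transpose on 16-bit lanes can be sketched in Python by modelling each 64-bit register as an integer that holds four 16-bit lanes. This is only an illustration of the data movement: the helper names are hypothetical, the lane order (lane 0 in the low bits) is an assumption, and the thesis's actual instruction encoding and semantics are not given in this record.

```python
LANE_BITS = 16
LANE_MASK = 0xFFFF


def pack_lanes(values):
    """Pack four 16-bit values into one 64-bit integer, lane 0 in the low bits."""
    reg = 0
    for i, v in enumerate(values):
        reg |= (v & LANE_MASK) << (i * LANE_BITS)
    return reg


def unpack_lanes(reg):
    """Split a 64-bit integer into its four 16-bit lanes."""
    return [(reg >> (i * LANE_BITS)) & LANE_MASK for i in range(4)]


def transpose_4x4(regs):
    """Treat four registers as the rows of a 4x4 matrix and return its columns."""
    rows = [unpack_lanes(r) for r in regs]
    return [pack_lanes([rows[r][c] for r in range(4)]) for c in range(4)]


def add_lanes(a, b):
    """Lane-wise 16-bit addition with wraparound."""
    return pack_lanes([(x + y) & LANE_MASK
                       for x, y in zip(unpack_lanes(a), unpack_lanes(b))])


def row_sums(regs):
    """Sum each matrix row: transpose, then add the four column registers."""
    total = 0
    for column in transpose_4x4(regs):
        total = add_lanes(total, column)
    return total


matrix = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
regs = [pack_lanes(row) for row in matrix]
print(unpack_lanes(row_sums(regs)))  # [10, 26, 42, 58]
```

After the transpose, a sum along each row becomes three lane-wise additions across registers instead of a sequence of operations within one register, which is one way to read the "line to plane" remark above.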
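Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1-1 Research Motivation
1-2 Research Objectives
Chapter 2 Related Work
2-1 Adaptive Boosting
2-2 Face Detection Flow and Its Operations
2-3 Research on BMMOC (Basic Multiple Multi-media Operation Cell)
Chapter 3 Implementing Face Recognition and Modeling the Implementation Method
3-1 Face Recognition Implementation Flow
3-1-1 Computing the 8-Point Average
3-1-2 Computing the MCT Index
3-1-3 Marking Face Positions
3-1-4 Drawing Frames at the Marked Positions
3-2 Analysis of Instruction Parallelism
3-2-1 Computing the 8-Point Average
3-2-2 Computing the MCT Index
3-2-3 Marking Face Positions
3-2-4 Drawing Frames at the Marked Positions
3-3 Standardizing the Computation Pattern
3-3-1 Matrix Sums with Overlap
3-3-2 Matrix Sums without Overlap
Chapter 4 Hardware Improvements
4-1 Adding Registers
4-2 Improving XSTR4
4-3 Modifying SAR and Adding SCR
4-4 Index address stream mode
4-5 Data prefetch
Chapter 5 Building the Implementation Platform
5-1 Simulation Platform
5-2 Implementation on the UMVP-2500
Chapter 6 Performance Evaluation
6-1 Computing the 8-Point Average
6-2 Computing the MCT Index
6-3 Computing the Face Marks
6-4 Computing the 8-Point Average with Data Prefetch
6-5 Computing the MCT Index with Data Prefetch
6-6 Analysis of the Proportion of Each Instruction
Chapter 7 Conclusion
References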