National Digital Library of Theses and Dissertations in Taiwan


Detailed record

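Author: Yen-Yu Lin (林彥宇)
Thesis title: Fast Object Detection via Learning (基於學習理論之快速偵測)
Advisors: Tyng-Luh Liu (劉庭祿), Chiou-Shann Fuh (傅楸善)
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2003
Academic year of graduation: 91 (ROC calendar)
Language: English
Number of pages: 78
Keywords (Chinese): 學習, 人臉偵測 (learning, face detection)
Keywords (English): learning, boosting, SVM, face detection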
Usage statistics:
  • Cited by: 0
  • Views: 618
  • Downloads: 56
  • Saved to bibliography lists: 1
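This thesis addresses the problem of fast object detection. We propose a detection method based on learning theory and apply it to face detection. For learning, we adopt boosting algorithms as the basis of the learning mechanism; for detection, we use a cascade as the architecture for fast detection. Combining these two main components yields an efficient detection system.
Our algorithm takes the paper that Viola and Jones published at CVPR 2001 as its starting point. Beyond improving detection speed, we examine in depth two difficult problems in object detection research: overfitting and occlusion. The thesis makes three main contributions: (1) a reinforced training method that improves detection performance; (2) learning with soft-boosting, applied to remove the effects of overfitting; and (3) cascade with evidence, which gives the system the ability to detect occluded faces.
To verify the efficiency and accuracy of the proposed algorithm, we ran several kinds of tests. In terms of performance and efficiency, the resulting system detects faces and occluded faces in real time: on a PC with a 3.06 GHz Pentium 4, it processes 25 images of 320x240 pixels per second. In terms of accuracy, we tested on the benchmark dataset of the face detection field (CMU+MIT) and obtained a detection rate of 91%.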
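As an illustration of the boosting component named in the abstract, the following is a minimal sketch of standard discrete AdaBoost with decision stumps as weak learners. It is not the thesis's implementation: the function names (`train_stump`, `adaboost`, `strong_predict`) and the toy data are assumptions made for this example, and the soft-boosting variant proposed in the thesis is not shown.

```python
import math

def stump_predict(stump, x):
    """A decision stump votes +1 when polarity * (x[f] - thr) >= 0, else -1."""
    f, thr, polarity = stump
    return 1 if polarity * (x[f] - thr) >= 0 else -1

def train_stump(xs, ys, weights):
    """Exhaustively pick the stump with the lowest weighted error."""
    best_err, best_stump = None, None
    for f in range(len(xs[0])):
        for thr in sorted({x[f] for x in xs}):
            for polarity in (1, -1):
                stump = (f, thr, polarity)
                err = sum(w for x, y, w in zip(xs, ys, weights)
                          if stump_predict(stump, x) != y)
                if best_err is None or err < best_err:
                    best_err, best_stump = err, stump
    return best_err, best_stump

def adaboost(xs, ys, rounds):
    """Return a list of (alpha, stump) pairs; labels must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, stump = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)      # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)    # vote weight of this weak learner
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified examples, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def strong_predict(ensemble, x):
    """Sign of the weighted vote of all weak learners."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy 1-D data (hypothetical): no single stump separates it, three rounds do.
xs = [[0], [1], [2], [3], [4], [5]]
ys = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(xs, ys, rounds=3)
predictions = [strong_predict(ensemble, x) for x in xs]
```

The reweighting step is what forces each new stump to concentrate on the examples that earlier stumps got wrong.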
In this thesis, we address the problem of fast object detection. Our approach is based on learning, and the application focuses on face detection. To implement the learning scheme, we use boosting algorithms; to structure the detection mechanism, we deploy a cascade framework. The two elements turn out to be well coupled and lead to an effective detection system.
Our approach is closely related to the work of Viola and Jones. However, while their work emphasizes mainly the speed of detection, our approach goes further and addresses the difficult problems of overfitting and occlusion. In particular, our contributions are: (i) reinforced training, to improve the generality of cascade training; (ii) learning with soft-boosting, to alleviate the effects of overfitting; and (iii) cascade with evidence, to equip the system with the ability to handle occlusion without compromising efficiency. We also include various experimental results that illustrate the advantages of the proposed method. The resulting system detects faces, even under occlusion, in real time.
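The cascade idea referred to in the abstract can be sketched as follows, assuming the general scheme in which each stage is a thresholded weighted vote of weak classifiers and a window is discarded as soon as one stage rejects it. This is a generic illustration, not the thesis's code: the names `threshold_feature`, `stage_score`, and `cascade_classify`, the feature vectors, and all weights and thresholds are hypothetical, and the thesis's "cascade with evidence" extension for occlusion is not shown.

```python
from typing import Callable, List, Sequence, Tuple

WeakClassifier = Callable[[Sequence[float]], int]          # returns +1 or -1
Stage = Tuple[List[Tuple[float, WeakClassifier]], float]   # (weighted voters, threshold)

def threshold_feature(index: int, thr: float) -> WeakClassifier:
    """Hypothetical weak classifier: +1 if feature `index` is at least `thr`."""
    return lambda window: 1 if window[index] >= thr else -1

def stage_score(stage: Stage, window: Sequence[float]) -> float:
    """Weighted vote of the weak classifiers in one stage."""
    voters, _ = stage
    return sum(alpha * h(window) for alpha, h in voters)

def cascade_classify(stages: List[Stage], window: Sequence[float]) -> Tuple[bool, int]:
    """Return (accepted, number of stages evaluated).

    A window is rejected at the first stage whose score falls below that
    stage's threshold, so most non-face windows cost only a few evaluations.
    """
    for i, stage in enumerate(stages, start=1):
        if stage_score(stage, window) < stage[1]:
            return False, i
    return True, len(stages)

# A two-stage cascade over 3-element feature vectors (all values made up).
stages: List[Stage] = [
    ([(1.0, threshold_feature(0, 0.5))], 0.0),
    ([(0.6, threshold_feature(1, 0.3)), (0.4, threshold_feature(2, 0.7))], 0.0),
]

face_like = cascade_classify(stages, [0.9, 0.8, 0.9])       # passes both stages
early_reject = cascade_classify(stages, [0.1, 0.8, 0.9])    # fails stage 1
late_reject = cascade_classify(stages, [0.9, 0.1, 0.9])     # fails stage 2
```

Placing cheap stages first is what keeps the average cost per window low, since the later, more expensive stages run only on windows that survive the earlier ones.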
1 Introduction
1.1 Related Work
1.1.1 Knowledge-Based Methods
1.1.2 Feature Invariant Methods
1.1.3 Template Matching Methods
1.1.4 Appearance-Based Methods
1.2 Our Approach
2 Learning with Boosting
2.1 Background
2.2 Adaboost
2.3 Boosting versus SVM
2.3.1 SVM
2.3.2 Why Use Boosting?
2.4 The Problem of Overfitting
2.5 Adaboost with Soft Margins
3 Fast Detection via Cascade
3.1 Introduction
3.2 Viola and Jones’s Framework
3.2.1 Rectangle Features
3.2.2 Integral Images
3.2.3 Learning Classification Functions
3.2.4 Cascade of Face Detectors
3.3 Possible Improvements
3.4 Some Extensions
4 Our Framework
4.1 Reinforced Training
4.1.1 Boot-Strap
4.1.2 Capabilities of Boot-Strap
4.1.3 Mechanism of Reinforced Training
4.2 Learning with Soft-Boosting
4.2.1 Weight Distribution on Weak Learners
4.2.2 Performance of Strong Classifier
4.3 Cascade with Evidence
4.3.1 An Idea for Detecting Occluded Faces
4.3.2 Cascade with Evidence
4.3.3 Performance Analysis
5 Implementation and Experimental Results
5.1 Training Data
5.2 Normalization of Lighting Conditions
5.3 Implementation
5.3.1 Training Process
5.3.2 Testing Process
5.4 Detection Results
6 Conclusion and Discussion
6.1 Conclusion
6.2 Discussion
[1] B. E. Boser, I. M. Guyon, and V. Vapnik, “A Training Algorithm for Optimal Margin Classifiers,” Proceedings of ACM Workshop Computational Learning Theory, pp. 144–152, Pittsburgh, PA, USA, 1992.
[2] L. Breiman, “Arcing Classifiers,” Annals of Statistics, pp. 801–849, 1998.
[3] L. Breiman, “Prediction Games and Arcing Algorithms,” Neural Computation, vol. 11, no. 7, pp. 1493–1518, 1999.
[4] M. Collins, R. E. Schapire, and Y. Singer, “Logistic Regression, AdaBoost and Bregman Distances,” Computational Learning Theory, pp. 158–169, 2000.
[5] T. M. Cover, “Geometrical and Statistical Properties of Systems of Linear Inequalities with Applications in Pattern Recognition,” IEEE Transactions on Electronic Computers, vol. 14, pp. 326–334, 1965.
[6] M. Dettling and P. Bühlmann, “How to Use Boosting for Tumor Classification with Gene Expression Data,” Tech. Rep., Statistics Dept., ETH Zürich, 2002.
[7] T. G. Dietterich, “An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization,” Machine Learning, vol. 40, no. 2, pp. 139–157, 2000.
[8] C. Ding and I. Dubchak, “Multi-Class Protein Fold Recognition Using Support Vector Machines and Neural Networks,” Bioinformatics, vol. 17, pp. 349–358, 2001.
[9] F. Fleuret and D. Geman, “Coarse-to-Fine Face Detection,” International Journal of Computer Vision, vol. 41, no. 1/2, pp. 85–107, January 2001.
[10] Y. Freund, “Boosting a Weak Learning Algorithm by Majority,” Information and Computation, vol. 121, no. 2, pp. 256–285, 1995.
[11] Y. Freund, “An Adaptive Version of the Boost by Majority Algorithm,” Machine Learning, vol. 43, no. 3, pp. 293–318, 2001.
[12] Y. Freund and R. E. Schapire, “A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting,” Proceedings of European Conference on Computational Learning Theory, pp. 23–37, Barcelona, Spain, 1995.
[13] Y. Freund and R. E. Schapire, “A Short Introduction to Boosting,” Journal of Japanese Society for Artificial Intelligence, vol. 14, pp. 771–780, 1999.
[14] J. Friedman, T. Hastie, and R. Tibshirani, “Additive Logistic Regression: A Statistical View of Boosting,” Tech. Rep., Dept. of Statistics, Stanford University, 1998.
[15] G. Guo, H. J. Zhang, and S. Li, “Boosting for Content-Based Audio Classification and Retrieval: An Evaluation,” Proceedings of International Conference on Multimedia and Expo, Tokyo, Japan, 2001.
[16] C. C. Han, H. Y. Liao, G. J. Yu, and L. H. Chen, “Fast Face Detection via Morphology-Based Pre-Processing,” Pattern Recognition, vol. 33, no. 10, pp. 1701–1712, October 2000.
[17] B. Heisele, P. Ho, and T. Poggio, “Face Recognition with Support Vector Machines: Global versus Component-Based Approach,” Proceedings of International Conference on Computer Vision, vol. 2, pp. 688–694, Vancouver, Canada, 2001.
[18] R. L. Hsu, M. Abdel-Mottaleb, and A. K. Jain, “Face Detection in Color Images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 696–706, May 2002.
[19] K. I. Kim, K. Jung, S. H. Park, and H. J. Kim, “Support Vector Machines for Texture Classification,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 11, pp. 1542–1550, November 2002.
[20] C. Kotropoulos and I. Pitas, “Rule-Based Face Detection in Frontal Views,” Proceedings of International Conference on Acoustics, Speech and Signal Processing, vol. 4, pp. 2537–2540, Munich, Germany, 1997.
[21] G. Lebanon and J. Lafferty, “Boosting and Maximum Likelihood for Exponential Models,” Advances in Neural Information Processing Systems, pp. 447–451, Vancouver, Canada, 2002.
[22] S. Li, X. W. Hou, H. J. Zhang, and Q. S. Cheng, “Learning Spatially Localized, Parts-Based Representation,” Proceedings of Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 207–212, Kauai, HI, USA, 2001.
[23] S. Li, Z. Q. Zhang, H. Shum, and H. J. Zhang, “FloatBoost Learning for Classification,” Advances in Neural Information Processing Systems, Vancouver, Canada, 2002.
[24] S. Li, L. Zhu, Z. Zhang, A. Blake, H. Zhang, and H. Shum, “Statistical Learning of Multi-View Face Detection,” Proceedings of European Conference on Computer Vision, vol. 4, pp. 67–81, Copenhagen, Denmark, 2002.
[25] R. Lienhart and J. Maydt, “An Extended Set of Haar-Like Features for Rapid Object Detection,” Proceedings of International Conference on Image Processing, vol. 1, pp. 900–903, Rochester, NY, USA, 2002.
[26] A. C. Loui, C. N. Judice, and S. Liu, “An Image Database for Benchmarking of Automatic Face Detection and Recognition Algorithms,” Proceedings of International Conference on Image Processing, pp. 146–150, Chicago, USA, 1998.
[27] J. Miao, B. Lin, K. Wang, L. Shen, and X. Chen, “A Hierarchical Multiscale and Multiangle System for Human Face Detection in a Complex Background Using Gravity-Center Template,” Pattern Recognition, vol. 32, no. 7, pp. 1237–1248, 1999.
[28] B. Moghaddam and A. P. Pentland, “Probabilistic Visual Learning for Object Representation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 696–710, July 1997.
[29] E. Osuna, R. Freund, and F. Girosi, “Training Support Vector Machines: An Application to Face Detection,” Proceedings of Conference on Computer Vision and Pattern Recognition, pp. 130–136, San Juan, Puerto Rico, 1997.
[30] C. P. Papageorgiou, M. Oren, and T. Poggio, “A General Framework for Object Detection,” Proceedings of International Conference on Computer Vision, pp. 555–562, Bombay, India, 1998.
[31] G. Rätsch, S. Mika, B. Schölkopf, and K. R. Müller, “Constructing Boosting Algorithms from SVMs: An Application to One-Class Classification,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 9, pp. 1184–1199, 2002.
[32] G. Rätsch, T. Onoda, and K.-R. Müller, “Soft Margins for AdaBoost,” Machine Learning, vol. 42, no. 3, pp. 287–320, 2001.
[33] S. Romdhani, P. Torr, B. Schölkopf, and A. Blake, “Computationally Efficient Face Detection,” Proceedings of International Conference on Computer Vision, vol. 2, pp. 695–700, Vancouver, BC, Canada, 2001.
[34] D. Roth, M. Yang, and N. Ahuja, “A SNoW-Based Face Detector,” Advances in Neural Information Processing Systems, pp. 855–861, Denver, CO, USA, 2000.
[35] H. Rowley, S. Baluja, and T. Kanade, “Rotation Invariant Neural Network-based Face Detection,” Proceedings of Conference on Computer Vision and Pattern Recognition, pp. 38–44, Santa Barbara, CA, USA, 1998.
[36] R. E. Schapire, “The Strength of Weak Learnability,” Machine Learning, vol. 5, no. 2, pp. 197–227, 1990.
[37] R. E. Schapire, Y. Freund, P. Bartlett, and W. S. Lee, “Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods,” Proceedings of International Conference on Machine Learning, pp. 322–330, Nashville, TN, USA, 1997.
[38] H. Schneiderman and T. Kanade, “A Statistical Approach to 3D Object Detection Applied to Faces and Cars,” Proceedings of Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 746–751, Hilton Head, SC, USA, 2000.
[39] H. Schwenk, “Using Boosting to Improve a Hybrid HMM/Neural Network Speech Recognizer,” Proceedings of International Conference on Acoustics, Speech, and Signal Processing, pp. 1009–1012, Phoenix, Arizona, USA, 1999.
[40] H. Schwenk and Y. Bengio, “Adaboosting Neural Network: An Application to On-Line Character Recognition,” Proceedings of International Conference on Artificial Neural Networks, pp. 967–972, Lausanne, Switzerland, 1997.
[41] K. K. Sung and T. Poggio, “Example-Based Learning for View-Based Human Face Detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 1, pp. 39–51, January 1998.
[42] V. Vapnik, Statistical Learning Theory, Wiley, New York, 1998.
[43] P. Viola and M. Jones, “Rapid Object Detection Using a Boosted Cascade of Simple Features,” Proceedings of Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 511–518, Kauai, HI, USA, 2001.
[44] H. Wu, Q. Chen, and M. Yachida, “Face Detection from Color Images Using a Fuzzy Pattern Matching Method,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 6, pp. 557–563, June 1999.
[45] M. H. Yang, N. Ahuja, and D. Kriegman, “Face Detection Using Mixtures of Linear Subspaces,” Proceedings of International Conference on Automatic Face and Gesture Recognition, pp. 70–76, Grenoble, France, 2000.
[46] M. H. Yang, D. J. Kriegman, and N. Ahuja, “Detecting Faces in Images: A Survey,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34–58, January 2002.
[47] A. Yuille, P. Hallinan, and D. Cohen, “Feature Extraction from Faces Using Deformable Templates,” International Journal of Computer Vision, vol. 8, no. 2, pp. 99–111, 1992.