National Digital Library of Theses and Dissertations in Taiwan

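Author: Ching-chia Hsu (許徑嘉)
Thesis title: Face Verification and Lip Reading Systems based on Sparse Representation (基於稀疏表示之人臉驗證與唇語辨識系統)
Advisor: Jia-Ching Wang (王家慶)
Degree: Master's
Institution: National Central University
Department: Department of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2013
Graduation academic year: 101 (ROC calendar)
Language: Chinese
Number of pages: 76
Keywords: sparse representation (稀疏表示)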
Face verification has many applications, and making it work under real-world conditions remains an active research problem. We extract SIFT features from face images, which are invariant to rotation, translation, and scale, and use them to build the dictionary for sparse representation. Using K-means and information theory, we propose two ways to enlarge the dictionary: an extended dictionary and an incremental dictionary. Experiments show that the extended dictionary increases the sparsity of the sparse coefficients and improves both the verification rate and the residual of the reconstructed signal. This thesis uses Bayesian compressive sensing (BCS) to solve the optimization problem. Compared with the orthogonal matching pursuit (OMP) algorithm used previously, BCS not only solves the optimization problem but also yields a covariance that can be used to improve the incremental dictionary and so reduce the uncertainty of the observation vectors. Experiments show that the incremental dictionary does reduce the residual of the reconstructed signal.
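The residual-based decision described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the thesis's method: it uses OMP (the baseline the abstract compares against) rather than BCS, toy 3-dimensional unit vectors stand in for SIFT descriptors, and the names `omp`, `verify`, and the `threshold` value are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve(A, b):
    """Solve a small square system A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def omp(atoms, y, n_nonzero):
    """Orthogonal matching pursuit over a list of unit-norm atoms.

    Returns the sparse coefficient vector and the norm of the final residual.
    """
    residual, support, coef = y[:], [], []
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        scores = [abs(dot(a, residual)) for a in atoms]
        k = max(range(len(atoms)), key=lambda i: scores[i])
        if k in support or scores[k] < 1e-12:
            break
        support.append(k)
        # Least-squares fit on the selected atoms via the normal equations.
        G = [[dot(atoms[i], atoms[j]) for j in support] for i in support]
        rhs = [dot(atoms[i], y) for i in support]
        coef = solve(G, rhs)
        recon = [sum(c * atoms[i][d] for c, i in zip(coef, support))
                 for d in range(len(y))]
        residual = [a - b for a, b in zip(y, recon)]
    x = [0.0] * len(atoms)
    for c, i in zip(coef, support):
        x[i] = c
    return x, math.sqrt(dot(residual, residual))

def verify(claimed_atoms, probe, n_nonzero=2, threshold=0.1):
    """Accept the claim if the claimed identity's atoms reconstruct the probe well.

    The threshold is a hypothetical value chosen for this toy example.
    """
    _, res = omp(claimed_atoms, probe, n_nonzero)
    return res <= threshold

# Toy dictionary for one enrolled identity (assumed data, not from the thesis).
enrolled = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
genuine = [0.6, 0.8, 0.0]    # lies in the span of the enrolled atoms
impostor = [0.0, 0.0, 1.0]   # orthogonal to every enrolled atom
print(verify(enrolled, genuine), verify(enrolled, impostor))
```

A richer dictionary generally lets a genuine probe be reconstructed with fewer atoms and a smaller residual, which is the effect the abstract attributes to the extended and incremental dictionaries.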

Lip reading has traditionally used ASM or AAM to extract the lip shape as features, which may discard useful information. This thesis instead considers the whole lip image and uses SIFT features. Bag-of-Features (BOF) converts the multiple SIFT keypoint descriptors of an image into a single vector, and these vectors are used to train hidden Markov models (HMMs). We test on the English letters A to Z, and the proposed method outperforms the baseline system.
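The Bag-of-Features step can be illustrated with a small sketch: cluster the training descriptors into a codebook, then represent each frame by a histogram of its descriptors' nearest codewords. Everything here is an assumption made for illustration: the 2-dimensional points stand in for 128-dimensional SIFT descriptors, the seeding rule and the names `kmeans` and `bof_histogram` are hypothetical, and the thesis's actual codebook size and normalization are not stated in this record.

```python
def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def nearest(centers, x):
    """Index of the codeword closest to descriptor x."""
    return min(range(len(centers)), key=lambda k: sq_dist(centers[k], x))

def kmeans(points, k, n_iter=20):
    """Plain Lloyd's algorithm; the first k points seed the centers (an assumption)."""
    centers = [p[:] for p in points[:k]]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centers, p)].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the old center when a cluster is empty
                centers[j] = [sum(col) / len(g) for col in zip(*g)]
    return centers

def bof_histogram(centers, descriptors):
    """Map a variable-size set of descriptors to one fixed-length, L1-normalized vector."""
    hist = [0.0] * len(centers)
    for d in descriptors:
        hist[nearest(centers, d)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Toy training descriptors forming two well-separated groups (assumed data).
train = [[0, 0], [10, 10], [0, 1], [10, 11], [1, 0], [11, 10]]
codebook = kmeans(train, k=2)

# One frame with four descriptors: one near the first codeword, three near the second.
frame = [[0, 0.5], [10, 10.5], [9.5, 10], [10.5, 11]]
print(bof_histogram(codebook, frame))  # [0.25, 0.75]
```

Because every frame maps to a vector of the same length regardless of how many keypoints were detected, a video of a spoken letter becomes a sequence of fixed-length observations, which is the form an HMM needs.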

Abstract (Chinese)
Abstract (English)
List of Figures
List of Tables
Table of Contents
Chapter 1 Introduction
1.1 Preface
1.2 Research Motivation and Objectives
1.3 Thesis Organization
Chapter 2 Literature Review
2.1 Eigenface and Fisherface
2.2 Locality Preserving Projection (LPP)
2.3 Histogram of Gabor Phase Pattern (HGPP)
2.4 Local Binary Patterns (LBP)
2.5 Classifier
Chapter 3 Sparse Representation
3-1 The Sparse Representation Problem
3-2 Sparse Representation Applied to Face Recognition
Chapter 4 Methods
4-1 Bayesian Compressive Sensing
4-1-1 Sparseness Prior
4-1-2 Estimating Sparse Coefficients with the Relevance Vector Machine
4-2 SIFT Features
4-2-1 Detect scale-space extrema
4-2-2 Keypoint localization
4-2-3 Orientation assignment and Generate image descriptor
4-3 Building the Dictionary
4-4 Face Verification
4-5 Extended Dictionary
4-5-1 K-Means Clustering Algorithm
4-5-2 Building the Extended Dictionary with K-means
4-6 Face Verification Algorithm
4-7 Incremental Dictionary
Chapter 5 Lip Reading
5-1 Bag-of-Features (BOF)
5-1-1 BOF Applied to SIFT Features
5-2 Hidden Markov Model
5-2-1 Forward Algorithm
5-2-2 EM Algorithm
5-3 Bayesian Sensing Hidden Markov Model
Chapter 6 Experimental Results
6-1 Comparison with Baseline Systems
6-1-1 Extended YaleB Database
6-1-2 LFW Database
6-2 Comparison of Different Numbers of Cluster Centers
6-3 Classifier Performance Comparison
6-4 Sparseness Comparison
6-5 Incremental Dictionary
6-5-1 Residual Comparison for the Incremental Dictionary
6-5-2 Incremental Dictionary versus Random Dictionary
6-5-3 Convergence of the Incremental Dictionary
6-6 Lip Reading
Chapter 7 Conclusions and Future Work
References
Appendix 1 Extended YaleB Database
Appendix 2 LFW Database

