
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)


Detailed Record

Author: 陳詠倫
Author (English): Yung-Lun Chen
Title (Chinese): 基於多主平面分析之動態規劃平台及其在三維人體姿勢分析
Title (English): A Dynamic Programming Framework for Modeling and Recognizing 3D Human Body Gestures through Multiple Principal Plane Analysis
Advisors: 鄭錫齊、張欽圳
Advisors (English): Shyi-Chyi Cheng, Chin-Chun Chang
Degree: Master's
Institution: National Taiwan Ocean University (國立臺灣海洋大學)
Department: Computer Science and Engineering (資訊工程學系)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2013
Graduation academic year: 102 (ROC calendar)
Language: Chinese
Number of pages: 69
Keywords (Chinese): 多主平面分析、三維人體姿勢分析
Keywords (English): Multiple Principal Plane Analysis, 3D human body, Recognizing Gestures
Record statistics:
  • Cited by: 0
  • Views: 330
  • Downloads: 65
  • Bookmarked: 0
Abstract

This thesis proposes a new dynamic programming framework based on multiple principal plane analysis for recognizing 3D human body gestures. The algorithm first approximates a 3D shape using the well-known k-means clustering algorithm together with principal plane analysis. For each principal plane, rotation-, scale-, and translation-invariant plane descriptors are extracted to characterize the content of the 3D shape, and the shape is then described with the bag-of-words (BoW) approach. With the model viewpoint and body size held invariant, the similarity between two 3D shapes is estimated by computing the difference between their BoW histograms. Based on this similarity measure, clustering the training 3D shapes produces a 3D shape codebook, which is used to label the pose sequence of an input 3D data sequence. Using topological sorting, the input 3D data sequence is then represented as a normalized key-pose sequence, so that all sequences belonging to the same action class share the same starting pose. Finally, the template sequence of each action obtained in the training stage and a support vector machine (SVM) classifier with a string kernel are used to classify the action type of an input 3D data sequence from its set of key poses. Experimental results show that the proposed algorithm achieves good classification accuracy.
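To make the shape-modeling and matching steps described in the abstract concrete, the following is a minimal sketch, in Python, of how multiple principal planes could be fitted to the k-means clusters of a 3D point set and how two shapes could then be compared through their BoW histograms. The function names, the use of scikit-learn's KMeans, and the chi-square histogram distance are illustrative assumptions; the thesis's exact descriptors and distance measure are not reproduced here.

import numpy as np
from sklearn.cluster import KMeans

def principal_planes(points, k=8, random_state=0):
    # Cluster the 3D surface points with k-means and fit one principal plane
    # per cluster. Each plane is returned as (centroid, normal), where the
    # normal is the least-variance direction of the cluster, i.e. the normal
    # of the plane that best fits that cluster of points.
    labels = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit_predict(points)
    planes = []
    for c in range(k):
        cluster = points[labels == c]
        centroid = cluster.mean(axis=0)
        _, _, vt = np.linalg.svd(cluster - centroid, full_matrices=False)
        planes.append((centroid, vt[-1]))
    return planes

def bow_histogram(descriptors, codebook):
    # Quantize per-plane descriptors against a visual codebook (rows are
    # codewords) and build a normalized BoW histogram.
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / max(hist.sum(), 1.0)

def bow_distance(h1, h2, eps=1e-10):
    # Chi-square distance between two BoW histograms; one plausible choice of
    # "difference between two BoW histograms", not necessarily the one used
    # in the thesis.
    return 0.5 * float(np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))

Two shapes would then be compared as bow_distance(bow_histogram(d1, codebook), bow_histogram(d2, codebook)), where d1 and d2 are the matrices of per-plane descriptors extracted from each shape.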
Table of Contents

Chapter 1  Introduction
1.1 Research Motivation
1.2 Research Background
1.3 Overview of the Proposed Method
1.4 Thesis Organization
Chapter 2  Related Work
Chapter 3  Principal Plane Analysis
3.1 The k-means Clustering Algorithm
3.2 Principal Plane Analysis of 3D Models
3.3 Building the 3D Shape Approximation Model
3.3.1
3.3.2
Chapter 4  Estimating 3D Shape Distance with BoW
4.1 Surface Descriptor Extraction
4.2 Estimating the Distance between 3D Shapes
Chapter 5  3D Human Body Gesture Recognition
5.1 Key Pose Detection with the BoW Method
5.2 Optimizing the 3D Shape Codebook
Chapter 6  Experimental Results
6.1 Datasets
6.2 Multiple Principal Plane Analysis
6.3 Key Pose Analysis without a Pose-Specific Codebook
6.4 Key Pose Analysis with a Pose-Specific Codebook
6.5 Confusion Matrices
Chapter 7  Conclusions and Future Work
References

