National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

Author: 謝中揚 (Hsieh Chung-Yang)
Title: 基於視覺化訊息之人體動作及手勢辨識之研究 (A Study on Vision-based Human Action and Hand Gesture Recognition)
Advisor: 林維暘 (Lin Wei-Yang)
Committee: 郭景明, 林惠勇, 朱威達, 劉偉名, 林維暘
Oral defense date: 2015-11-13
Degree: Doctoral
Institution: National Chung Cheng University (國立中正大學)
Department: Graduate Institute of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Document type: Academic dissertation
Year of publication: 2015
Graduation academic year: 104
Language: English
Pages: 92
Keywords (Chinese): 手勢辨識, 動作辨識, 軌跡辨識
Keywords (English): hand gesture recognition, action recognition, trajectory recognition
Metrics:
  • Cited by: 0
  • Views: 464
  • Downloads: 12
  • Bookmarked: 0
In this dissertation, we develop three systems for gesture and action recognition. The first system uses hand-motion trajectories as the data source for sign language classification and retrieval. A trajectory is first projected by Kernel Principal Component Analysis (KPCA), which can be viewed as an implicit mapping into a much higher-dimensional feature space; the higher dimensionality effectively improves the accuracy of recognizing motion trajectories. Nonparametric Discriminant Analysis (NDA) is then applied to extract the most discriminative features from the KPCA feature space. The synergy of KPCA and NDA yields better class separability and makes the proposed trajectory representation a more powerful discriminator. We validate the proposed method on the Australian Sign Language (ASL) data set. The results show that our method performs significantly better, in both trajectory classification and retrieval, than state-of-the-art techniques.
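The KPCA projection step can be sketched as follows. This is a minimal illustration in plain NumPy: the RBF kernel, its `gamma`, and the toy "trajectory" vectors are all assumptions made for the sketch, and the NDA stage is omitted.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise RBF kernel values k(x, y) = exp(-gamma * ||x - y||^2).
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * sq)

def kpca(X, n_components=2, gamma=1.0):
    # Kernel PCA: eigendecompose the double-centred kernel matrix and
    # project the training samples onto the leading components.
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one   # centre in feature space
    vals, vecs = np.linalg.eigh(Kc)              # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_components]  # pick the top components
    # Projection of sample i onto component j is sqrt(lambda_j) * v_ij.
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

# Toy data: two classes of flattened trajectories (hypothetical).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (5, 6)),
               rng.normal(1.0, 0.1, (5, 6))])
Z = kpca(X, n_components=2, gamma=2.0)           # (10, 2) embedding
```

In the dissertation the KPCA embedding is then fed to NDA; even in this toy setting, the two classes already separate cleanly in the embedded space.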
The second system takes video clips as input and classifies the action category. It is based on a local-learning boosting algorithm: the idea is to use local learners to form a highly accurate classification rule. The system first extracts a cloud of interest points from the video and uses it to construct discriminative features that encode not only static characteristics, such as the aspect ratio of the human figure, but also the body dynamics of each action class. We then perform efficient local learning on the extracted features to build locally adaptive classifiers; in particular, a local classifier is trained for each training sample. Because each local classifier describes the local data distribution well, combining multiple local classifiers leads to better classification accuracy. We conduct several experiments on the KTH dataset and obtain encouraging results: our approach achieves performance comparable to that of state-of-the-art methods. Moreover, compared with AdaBoost, a popular global-learning method, local learning provides significantly better accuracy with little additional cost in training time.
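The "one local classifier per training sample" idea can be sketched as below. This is a toy illustration, not the dissertation's boosting formulation: each local classifier is a simple nearest-centroid rule fitted on a sample's k nearest neighbours, and a query is routed to the local classifier of its closest training sample. The data and parameter choices are hypothetical.

```python
import numpy as np

def fit_local_classifiers(X, y, k=3):
    # For each training sample, fit a nearest-centroid rule on its k
    # nearest neighbours (the sample's local neighbourhood).
    dists = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    local_rules = []
    for i in range(len(X)):
        nbrs = np.argsort(dists[i])[:k + 1]        # include the sample itself
        centroids = {c: X[nbrs][y[nbrs] == c].mean(0)
                     for c in np.unique(y[nbrs])}
        local_rules.append(centroids)
    return local_rules

def predict(x, X, local_rules):
    # Route the query to the local classifier of its nearest training sample.
    i = int(np.argmin(np.linalg.norm(X - x, axis=1)))
    centroids = local_rules[i]
    return min(centroids, key=lambda c: np.linalg.norm(centroids[c] - x))

# Toy data: two well-separated classes (hypothetical).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.2, (5, 2)),
               rng.normal(3.0, 0.2, (5, 2))])
y = np.array([0] * 5 + [1] * 5)
local_rules = fit_local_classifiers(X, y, k=3)
pred_a = predict(np.array([0.1, 0.0]), X, local_rules)
pred_b = predict(np.array([3.1, 3.0]), X, local_rules)
```

The dissertation combines such local classifiers inside a boosting loop; here the point is only that each rule is trained on one sample's neighbourhood and so adapts to the local data distribution.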
The third system recognizes both hand gestures and actions. For gestures, we use the video clip directly as the data source rather than a trajectory; for actions, we evaluate on a more realistic dataset. The system uses dual complementary tensors: it builds a compact yet discriminative representation by normalizing the input video volume into two tensors, one obtained from the raw video volume and the other from histogram of oriented gradients (HOG) features. Each tensor is factored into matrices, and the similarity between factored matrices is evaluated with canonical correlation analysis (CCA). Furthermore, we propose an information-fusion method to combine the resulting similarity scores; the fusion strategy effectively enhances the discriminability between action categories and leads to better recognition accuracy. We conduct several experiments on two publicly available databases (UCF sports and Cambridge-Gesture). The results show that the proposed method achieves recognition accuracy comparable to state-of-the-art methods.
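The CCA similarity between two factored matrices, and a simple score-fusion step, can be sketched as follows. The canonical correlations are computed as singular values of Q1ᵀQ2 after orthonormalising each factor; the min-max-then-sum fusion rule here is one common choice and is an assumption, not necessarily the dissertation's exact strategy.

```python
import numpy as np

def cca_similarity(A, B, d=2):
    # Sum of the top-d canonical correlations between the column
    # subspaces of A and B (cosines of their principal angles).
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    s = np.linalg.svd(Qa[:, :d].T @ Qb[:, :d], compute_uv=False)
    return float(s.sum())

def fuse_scores(scores_a, scores_b):
    # Min-max normalise each modality's scores, then add them (sum rule).
    def minmax(s):
        s = np.asarray(s, dtype=float)
        span = s.max() - s.min()
        return (s - s.min()) / span if span > 0 else np.zeros_like(s)
    return minmax(scores_a) + minmax(scores_b)

rng = np.random.default_rng(1)
A = rng.normal(size=(10, 3))      # stand-in factor from the raw-video tensor
B = rng.normal(size=(10, 3))      # stand-in factor from the HOG tensor
sim_self = cca_similarity(A, A)   # identical subspaces -> maximal similarity
sim_ab = cca_similarity(A, B)     # random subspaces -> somewhere in between
fused = fuse_scores([0.9, 0.4, 0.1], [0.8, 0.5, 0.2])
```

Identical subspaces give the maximal similarity d, so the measure is bounded in [0, d]; fusing normalised scores from the two tensors then ranks candidate classes on a common scale.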
Chapter 1  Introduction ............................................. 1
  1.1  Overview of Research ......................................... 1
  1.2  Related Works ................................................ 6
    1.2.1  Survey on Trajectory Recognition ........................ 6
    1.2.2  Survey on Human Action and Hand Gesture Recognition with Video Clips ... 9
  1.3  Dissertation Organization ................................... 13
Chapter 2  Kernel-based Representation for 2D/3D Motion Trajectory Retrieval and Classification ... 14
  2.1  Introduction ................................................ 14
  2.2  Kernel-based Trajectory Representation ...................... 16
    2.2.1  Pre-processing .......................................... 17
    2.2.2  Nonlinear Kernel Mapping ................................ 18
    2.2.3  Discriminant Analysis ................................... 21
  2.3  Experimental Results ........................................ 24
    2.3.1  The Benefit of Using 3D Trajectory Data ................. 25
    2.3.2  The Benefit of Using Kernel-space Representation ........ 26
    2.3.3  Comparison with Other Methods ........................... 27
  2.4  Summary ..................................................... 32
Chapter 3  2D Video-based Human Action and Hand Gesture Recognition ... 33
  3.1  Video-based Human Action Recognition Using Local Learning Boosting Method ... 34
    3.1.1  Representation for Action Recognition ................... 37
    3.1.2  Boosting Classifier for Action Recognition .............. 45
    3.1.3  Experimental Results .................................... 47
    3.1.4  Summary ................................................. 53
  3.2  Video-based Human Action and Hand Gesture Recognition by Fusing Factored Matrices of Dual Tensors ... 54
    3.2.1  Methodology ............................................. 56
    3.2.2  Experiments ............................................. 65
    3.2.3  Summary ................................................. 74
Chapter 4  Conclusions ............................................. 76
Reference ......................................................... 78