臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)

Author: 朱陸中
Author (English): Lu-Jong Chu
Title: 多視角三維人體姿態追蹤–利用柔性關節規範之疊代最近點演算法
Title (English): Multiview 3D Human Motion Tracking with Soft-Joint Constrained ICP
Advisor: 洪一平
Advisor (English): Yi-Ping Hung
Degree: Master's
Institution: National Taiwan University (國立臺灣大學)
Department: Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis Type: Academic thesis
Publication Year: 2008
Graduation Academic Year: 96
Language: English
Pages: 66
Keywords (Chinese): 人體姿態追蹤、粒子濾波器、姿勢估測、疊代最近點演算法、多重視角、三維人體模型、容積重建
Keywords (English): human motion tracking, particle filtering, pose estimation, ICP, multiview, 3D human model, volume reconstruction
The goal of this thesis is to track human body poses from image sequences observed by cameras, without restricting the kinds of motion to be tracked (such as walking or running); that is, the subject's movements are not constrained in any way. Because a single viewpoint easily suffers from self-occlusion, and the lack of depth information makes the estimated pose ambiguous, we capture videos from multiple cameras and reconstruct a 3D human volume, which effectively integrates the information from the multiple views.

Because the many joints of the human body give it very high degrees of freedom, we propose a hierarchical human pose tracking method: at each time step the torso pose is estimated first, and the limb poses are estimated afterwards. We adopt the particle filter, which is widely used for high-dimensional tracking, to track the torso pose, the hardest part to estimate, because the particle filter can represent nonlinear and multimodal posterior probability distributions. A drawback of hierarchical tracking, however, is that the accuracy of the torso estimate directly affects the limb estimates. To reduce the impact of torso misestimation on limb pose estimation, we adopt an iterative closest point (ICP) algorithm combined with a soft-joint constraint. The soft-joint constraint frees a limb from a fixed joint and allows it to move within a small region around the joint, reducing the interference caused by an erroneous torso estimate. The ICP algorithm in turn reduces the 7-dimensional particle filter that a soft-jointed limb would otherwise require to a single degree of freedom that determines only the elbow or knee angle. By combining the advantages of the particle filter and the soft-joint constrained ICP for limb tracking, our method obtains effective tracking results even when the limbs move quickly within a short period of time.

We also observe that the orientation of the torso is strongly correlated with the limb poses: when the positions of the limb joints are known, the torso pose can usually be predicted, especially when reliable limb motion information is available. To improve the accuracy of torso tracking, we use the soft-joint positions of the limbs estimated at the previous time step to predict the torso pose at the current time step, giving the particle filter a more reliable basis for estimation. Integrating the torso and limb tracking results, we provide an effective solution to the 3D human pose tracking problem.
In this thesis, we aim to track 3D human motion in image sequences captured by multiple cameras. The target motion is not limited to specific kinds of human motion, such as walking or jogging; that is, no restrictions are imposed on the possible motions. Because self-occlusion and depth ambiguity arise easily when only a single camera is used, we capture videos from multiple viewpoints and reconstruct the 3D shape volume of the target subject, which is an effective way to integrate information from multiple views.
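
The 3D shape volume referred to above is commonly obtained by shape-from-silhouette (visual hull) reconstruction. As an illustration only, the sketch below carves a voxel grid against hypothetical silhouette masks and 3x4 projection matrices; the thesis's actual voxel resolution, calibration, and foreground segmentation are not specified here.

```python
# Minimal voxel-carving sketch of visual hull reconstruction.
# All inputs (silhouette masks, projection matrices, volume bounds) are
# hypothetical placeholders, not the thesis's actual configuration.
import numpy as np

def reconstruct_visual_hull(silhouettes, projections, bounds, resolution=64):
    """Keep the voxels whose projection falls inside every silhouette.

    silhouettes: list of HxW boolean foreground masks, one per camera.
    projections: list of 3x4 camera projection matrices (world -> pixel).
    bounds:      ((xmin, xmax), (ymin, ymax), (zmin, zmax)) working volume.
    """
    axes = [np.linspace(lo, hi, resolution) for lo, hi in bounds]
    X, Y, Z = np.meshgrid(*axes, indexing="ij")
    voxels = np.stack([X, Y, Z, np.ones_like(X)], axis=-1).reshape(-1, 4)
    occupied = np.ones(len(voxels), dtype=bool)

    for mask, P in zip(silhouettes, projections):
        h, w = mask.shape
        pix = voxels @ P.T                                  # homogeneous pixels
        u = np.round(pix[:, 0] / pix[:, 2]).astype(int)
        v = np.round(pix[:, 1] / pix[:, 2]).astype(int)
        inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        in_sil = np.zeros(len(voxels), dtype=bool)
        in_sil[inside] = mask[v[inside], u[inside]]
        occupied &= in_sil                                  # all views must agree

    return voxels[occupied, :3]                             # Nx3 voxel centers
```

The points returned by such a routine stand in for the reconstructed volume to which the torso and limb models are later fitted.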

We propose a hierarchical human motion tracking method that can effectively capture articulated human motion with high degrees of freedom (DOFs). At each time step, the torso motion is estimated first, and then the motion of each limb is estimated individually. Particle filtering, a popular method for high-dimensional tracking, is adopted to track the torso motion because it can handle nonlinear and multimodal posterior probability distributions.
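
As a rough illustration of the torso tracker's core loop, the following sketch implements one generic sampling-importance-resampling step. The 6-dimensional torso state, the Gaussian diffusion model, and the `likelihood` callable are assumptions standing in for the thesis's volume-matching observation model.

```python
# Generic particle filter step (sampling-importance-resampling).
# State dimensionality, motion noise, and the likelihood are hypothetical
# stand-ins for the thesis's torso parameterization and observation model.
import numpy as np

def particle_filter_step(particles, weights, likelihood, motion_noise=0.05, rng=None):
    """particles:  NxD array of pose hypotheses (e.g. D = 6 for a torso).
       weights:    length-N normalized importance weights.
       likelihood: callable mapping a state vector to a non-negative score."""
    rng = rng or np.random.default_rng()
    n, d = particles.shape

    # 1. Resample proportionally to the previous weights (systematic resampling).
    positions = (np.arange(n) + rng.random()) / n
    indices = np.minimum(np.searchsorted(np.cumsum(weights), positions), n - 1)
    particles = particles[indices]

    # 2. Predict: diffuse every particle with Gaussian motion noise.
    particles = particles + rng.normal(scale=motion_noise, size=(n, d))

    # 3. Weight: score how well each hypothesis explains the observation.
    weights = np.array([likelihood(p) for p in particles]) + 1e-12
    weights = weights / weights.sum()

    # The pose estimate is the weighted mean of the particle set.
    return particles, weights, weights @ particles
```

Because the posterior over torso poses can be multimodal, keeping the full weighted particle set rather than a single point estimate is what makes this kind of filter suitable here.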

One disadvantage of hierarchical human motion tracking is that torso tracking errors may degrade limb motion estimation. To reduce the interference caused by inaccurate torso estimates, we propose a soft-joint constrained ICP (Iterative Closest Point) method to estimate limb motions. In contrast to hard joints, limbs with soft joints are allowed to move freely within a small region, so limb motions can still be tracked even when the torso estimate is inaccurate. However, the DOFs of each limb increase from 4 to 7 when the soft-joint constraint is used. The proposed soft-joint constrained ICP efficiently determines 6 of these DOFs, so that only 1 DOF (the elbow or knee angle) is left for the particle filter. By combining the advantages of particle filtering and soft-joint constrained ICP, our method can effectively track limb motions even when large motions occur within a short period of time.
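
One way to read the soft-joint constraint is as an extra, heavily weighted point correspondence inside an otherwise standard least-squares ICP step, solved in closed form with the SVD-based rigid fit of Arun et al. The sketch below follows that reading; the point sets, joint anchor, and stiffness weight are hypothetical, and the remaining elbow/knee angle would be handled by the 1-DOF particle filter described above.

```python
# One soft-joint constrained ICP iteration for a single limb, sketched as a
# weighted rigid fit in which the limb's root joint is an extra correspondence
# pulled toward its anchor on the torso. Inputs are hypothetical placeholders.
import numpy as np
from scipy.spatial import cKDTree

def weighted_rigid_fit(src, dst, weights):
    """Closed-form weighted least-squares rigid transform (SVD method)."""
    w = weights / weights.sum()
    src_c = src - (w @ src)                     # weighted centering
    dst_c = dst - (w @ dst)
    U, _, Vt = np.linalg.svd((src_c * w[:, None]).T @ dst_c)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                    # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = (w @ dst) - R @ (w @ src)
    return R, t

def soft_joint_icp_step(limb_pts, voxel_pts, joint, joint_anchor, stiffness=5.0):
    """limb_pts:     Mx3 points sampled on the limb model.
       voxel_pts:    Kx3 reconstructed voxel centers assigned to this limb.
       joint:        current root-joint position on the limb model.
       joint_anchor: where the torso estimate places that joint.
       stiffness:    how strongly the joint is pulled toward its anchor."""
    _, idx = cKDTree(voxel_pts).query(limb_pts)         # nearest-neighbor pairs
    src = np.vstack([limb_pts, joint])
    dst = np.vstack([voxel_pts[idx], joint_anchor])
    weights = np.concatenate([np.ones(len(limb_pts)), [stiffness * len(limb_pts)]])
    R, t = weighted_rigid_fit(src, dst, weights)
    return limb_pts @ R.T + t, R @ joint + t            # moved limb and joint
```

A small stiffness lets the limb slide away from an erroneous torso estimate, while a large one recovers the behavior of a hard joint.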

Moreover, we find that the torso motion is strongly related to the limb motions. If the states of the four limbs are known, the torso state can usually be predicted without other information, especially when the limb states are reliable. To improve torso motion tracking, the limb motions estimated at the previous time step provide reliable hypotheses of the current torso state, implemented by sampling torso particles from the limb states. We have conducted experiments on multiple video sequences of different motions, and the results show that our method is effective and reliable for 3D human motion tracking.
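
A possible reading of this torso prediction step is sketched below: fit a rigid torso pose to the four limb-derived joint positions in closed form, then scatter particles around it for the next filtering step. The torso-local attachment points, the Euler-angle state, and the noise scale are illustrative assumptions, not the thesis's exact formulation.

```python
# Sketch of seeding torso particles from limb-derived joint positions.
# The torso-local attachment points, the (translation, roll, pitch, yaw) state,
# and the noise scale are hypothetical assumptions for illustration.
import numpy as np

def predict_torso_particles(model_joints, observed_joints,
                            n_particles=200, noise=0.02, rng=None):
    """model_joints:    4x3 shoulder/hip positions in the torso's local frame.
       observed_joints: 4x3 soft-joint positions estimated from the limbs."""
    rng = rng or np.random.default_rng()

    # Closed-form rigid fit of the torso model to the limb-derived joints
    # (same SVD-based solution as in the ICP sketch above).
    src_c = model_joints - model_joints.mean(axis=0)
    dst_c = observed_joints - observed_joints.mean(axis=0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                            # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = observed_joints.mean(axis=0) - R @ model_joints.mean(axis=0)

    # Convert to a 6-D (translation + ZYX Euler angles) torso state and
    # diffuse particles around the prediction for the particle filter.
    yaw = np.arctan2(R[1, 0], R[0, 0])
    pitch = np.arcsin(np.clip(-R[2, 0], -1.0, 1.0))
    roll = np.arctan2(R[2, 1], R[2, 2])
    mean_state = np.concatenate([t, [roll, pitch, yaw]])
    return mean_state + rng.normal(scale=noise, size=(n_particles, 6))
```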
Oral Defense Committee Certification iii
Acknowledgements v
Abstract (Chinese) vii
Abstract ix
List of Figures xiii
1 Introduction 1
1.1 Problem and Challenges 1
1.2 Proposed Method 3
1.3 Overview of Our Method 5
1.4 Outline of the Thesis 6
2 Related Works 7
2.1 Model-Free vs. Model-Based 8
2.2 Single View vs. Multiple View 9
2.3 Image-Based Localization vs. Video-Based Tracking 11
2.3.1 Kalman Filtering vs. Particle Filtering 12
2.3.2 Advanced Particle Filtering 12
2.3.3 Hierarchical Particle Filtering 14
3 Model-Based 3D Human Motion Tracking 15
3.1 3D Human Model 15
3.1.1 Figure Parameters 16
3.1.2 Motion Parameters 18
3.2 3D Volume Reconstruction 20
3.2.1 Introduction to Volume-Based Visual Hull Construction 21
3.2.2 Implementation of the Voxel-Based Approach 23
3.3 Particle Filter Tracking 25
3.3.1 General Particle Filtering 25
3.3.2 Hierarchical Particle Filtering 29
4 Soft-Joint Constrained ICP and Torso Prediction 33
4.1 Introduction to ICP 34
4.2 Soft-Joint Constrained ICP 36
4.3 Voxel Labeling 41
4.4 Torso Prediction with Soft Joint Locations 44
5 Experiments 47
6 Conclusions and Future Works 59
6.1 Conclusions 59
6.2 Future Works 59
Bibliography 61