臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)

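Graduate student: 陳宏典
Graduate student (English name): Hong-Dien Chen
Thesis title: 可置換之語音驅動唇形合成方法
Thesis title (English): Transferable Speech-Driven Lips Synthesis
Advisor: 莊永裕
Degree: Master's
Institution: National Taiwan University (國立臺灣大學)
Department: Graduate Institute of Networking and Multimedia (資訊網路與多媒體研究所)
Discipline: Computer Science
Subfield: Software Development
Thesis type: Academic thesis
Year of publication: 2006
Academic year of graduation: 94 (ROC calendar)
Language: English
Number of pages: 64
Keywords (Chinese): 人臉動畫 (facial animation)
Keywords (English): speech animation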
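Image-based facial animation has reached a high degree of realism; it can be applied to low-bandwidth video conferencing or serve as a virtual teacher in language learning. However, image-based facial animation first requires recording a five-to-ten-minute training video of a specific user, which is then analyzed to build the model that generates the animation, and this limits its applications. We propose a simple method in which a new user only needs to capture a few specific images and, by reusing the model built for the original user, can produce new facial animation.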
Image-based videorealistic speech animation achieves a level of visual realism that makes it potentially useful for creating virtual teachers in language learning, digital characters in movies, or even users' representatives in video conferencing at very low bit rates. However, this realism comes at the cost of collecting a large video corpus of the specific person to be animated. This requirement hinders broad application, since such a corpus, recorded under a controlled setup, may not be easy to obtain. Hence, we adopt a simple method that transfers the original animation model to a novel person using only a few different lip images.
CHAPTER 1 INTRODUCTION
CHAPTER 2 RELATED WORK
2.1. FACIAL CODING
2.2. MODEL-BASED FACIAL VIDEO SYNTHESIS
2.3. IMAGE-BASED FACIAL VIDEO SYNTHESIS
CHAPTER 3 BACKGROUND: TRAINABLE VIDEOREALISTIC SPEECH ANIMATION
3.1. CORPUS
3.2. PRE-PROCESSING
3.3. MULTIDIMENSIONAL MORPHABLE MODELS
3.3.1. MMM Construction
3.3.2. Synthesis
3.3.3. Analysis
3.4. PHONEME MODELS
3.4.1. Phoneme Models Construction
3.4.2. Trajectory Synthesis
3.4.3. Training
3.5. POST-PROCESSING
CHAPTER 4 MODEL TRANSFER
4.1. INITIALIZATION
4.2. FLOW MATCHING
4.3. TEXTURE MATCHING
4.4. ANALYSIS AND SYNTHESIS
CHAPTER 5 EXPERIMENTAL RESULTS
CHAPTER 6 DISCUSSIONS AND FUTURE WORK
CHAPTER 7 AN APPLICATION EXAMPLE
REFERENCE