National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

Detailed Record
Author: Chang Fang-Chen (昌芳騁)
Title: Lipreading System (唇語辨識系統)
Advisor: Liou Cheng-Yuan (劉長遠)
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Document type: Academic thesis
Year of publication: 1999
Academic year of graduation: 87 (1998–99)
Language: Chinese
Pages: 32
Keywords (Chinese): 唇語辨識 (lipreading)
Keywords (English): Lipreading; Lip Extraction; Speech Recognition; Point Distribution Model; Principal Component Analysis; Time-Delay Neural Networks
This thesis constructs two lip-related models. The first is a lip contour model, built with a Point Distribution Model (PDM). The second is a lip color model, built with a method similar to the PDM: principal component analysis (PCA) is applied to the colors around the lips. We use an energy function to measure how well a candidate lip region in an image matches the lips described by the lip color model. Searching for the lips in an image can therefore be cast as minimizing this energy function, and the Simplex Method is used to perform the minimization. For lipreading, Time-Delay Neural Networks are used to perform the recognition.
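The pipeline described in the abstract can be sketched in miniature: PCA over landmark vectors (the Point Distribution Model idea) followed by simplex (Nelder-Mead) minimization of an energy over the model's mode weights. This is an illustration only, not the thesis implementation: the training shapes are synthetic, the energy is a stand-in for the image-based lip-color energy, and all names (`reconstruct`, `energy`, etc.) are invented for the example.

```python
# Sketch: build a PCA shape model from aligned lip landmarks, then fit it
# by minimizing an energy with the simplex method. Data is synthetic.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Training set: 50 aligned lip shapes, each 20 landmark points -> 40-D vectors.
n_points = 20
t = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
base = np.stack([np.cos(t), 0.5 * np.sin(t)], axis=1).ravel()  # ellipse-like lip
shapes = base + 0.05 * rng.standard_normal((50, 2 * n_points))

# PCA on the shape vectors (the Point Distribution Model idea).
mean_shape = shapes.mean(axis=0)
centered = shapes - mean_shape
cov = centered.T @ centered / (len(shapes) - 1)
eigval, eigvec = np.linalg.eigh(cov)          # ascending order
order = np.argsort(eigval)[::-1]              # reorder descending
P = eigvec[:, order][:, :4]                   # keep 4 principal modes

def reconstruct(b):
    """Shape generated by mode-weight vector b."""
    return mean_shape + P @ b

# Toy energy: squared distance between the model shape and an "observed"
# shape (a stand-in for the image-based lip-color energy in the thesis).
observed = shapes[0]
def energy(b):
    return float(np.sum((reconstruct(b) - observed) ** 2))

# Minimize the energy with the simplex (Nelder-Mead) method.
res = minimize(energy, x0=np.zeros(4), method="Nelder-Mead")
print(res.fun <= energy(np.zeros(4)))  # fitting should reduce the energy
```

In the thesis the energy is evaluated against image colors rather than a known target shape, but the structure of the search is the same: the simplex method explores the low-dimensional space of mode weights instead of raw pixel space.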

Abstract (Chinese)
Abstract (English)
1. Introduction
1.1 The McGurk Effect
1.2 Related Work on Lipreading
1.3 System Overview
1.4 Thesis Outline
2. Background
2.1 Feature Extraction Methods
2.2 Types of Classifiers
3. Constructing the Lip Models
3.1 Introduction
3.2 Principal Component Analysis
3.3 Point Distribution Models
3.4 Lip Shape Model
3.4.1 Manually Labeling Lip-Shape Training Examples
3.4.2 Aligning the Lip Shapes in the Training Set
3.4.3 Analyzing the Training Examples with PCA
3.4.4 The Lip Shape Model
3.5 Lip Color Model
3.5.1 Sampling Lip Colors
3.5.2 Analyzing Lip Colors with PCA
3.5.3 The Lip Color Model
4. Lip Localization and Lip-Contour Tracking
4.1 Introduction
4.2 Energy Functions
4.2.1 Energy Function One
4.2.2 Energy Function Two
4.3 Tracking Lip Contours in Images
5. Lipreading
5.1 Time-Delay Neural Network Architecture
5.2 Obtaining the Network Inputs
6. Conclusion and Discussion
7. References
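Chapter 5 performs recognition with Time-Delay Neural Networks. A minimal sketch of the underlying time-delay idea (each hidden unit sees a short window of consecutive feature frames, with the same weights applied at every time step, cf. [Waibel89]) follows; the sizes and names are illustrative, not the thesis network, and the weights are random rather than trained.

```python
# Sketch of a single time-delay layer: a 1-D convolution over time,
# applying shared weights to every window of `delay` consecutive frames.
import numpy as np

rng = np.random.default_rng(1)

n_frames, n_feats = 30, 8   # e.g. 30 lip-shape feature vectors per utterance
delay = 3                   # each hidden unit sees 3 consecutive frames
n_hidden = 5

x = rng.standard_normal((n_frames, n_feats))          # input feature sequence
W = rng.standard_normal((n_hidden, delay * n_feats))  # shared time-delay weights
b = np.zeros(n_hidden)

def tdnn_layer(x, W, b, delay):
    """Apply the same weights to every window of `delay` frames."""
    windows = np.stack([x[i:i + delay].ravel()
                        for i in range(len(x) - delay + 1)])
    return np.tanh(windows @ W.T + b)   # shape: (n_frames - delay + 1, n_hidden)

h = tdnn_layer(x, W, b, delay)
print(h.shape)  # (28, 5): one hidden vector per time window
```

Stacking such layers and summing the final activations over time yields a classifier that is tolerant of small temporal shifts in the lip-feature sequence, which is the property that motivates TDNNs for lipreading.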

1. [Basu98] S. Basu, N. Oliver, A. Pentland, "3D modeling and tracking of human lip motions", Sixth International Conference on Computer Vision, 1998, pp. 337-343.
2. [Bregler93] Christoph Bregler, Hermann Hild, Stefan Manke, and Alex Waibel, “Improving connected letter recognition by Lipreading”, Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, Minneapolis, 1993, Vol. 1, pp.557-560.
3. [Bruce92] Vicki Bruce, “What the human face tells the human mind: Some challenges for the robot-human interface”, IEEE International Workshop on Robot and Human Communication, 1992, pp. 44-51.
4. [Carraro89] A. Carraro, E. Chilton, H. McGurk, “A Telephonic Lipreading Device for the Hearing Impaired”, IEE Colloquium on Biomedical Applications of Digital Signal Processing, 1989, pp. 1-8
5. [Chen95a] Tsuhan Chen, Yao Wang, H. P. Graf, C. Swain, "A new frame interpolation scheme for talking head sequences", Proc. International Conference on Image Processing, 1995, vol. 2, pp. 591-594.
6. [Chen95b] Tsuhan Chen, H. P. Graf, B. Haskell, E. Petajan, Yao Wang, H. Chen, Wu Chou, "Speech-assisted lip synchronization in audio-visual communications", Proc. International Conference on Image Processing, 1995, vol. 2, pp. 579-582.
7. [Chen97] Tsuhan Chen, "Recent development in multimedia signal processing: a review on audio-visual interaction", 13th International Conference on Digital Signal Processing Proceedings (DSP 97), vol. 1, pp. 175-178.
8. [Chen98] Tsuhan Chen, R. R. Rao, "Audio-visual integration in multimodal communication", Proceedings of the IEEE, vol. 86, no. 5, May 1998, pp. 837-852.
9. [Chiou97] Greg I. Chiou and Jenq-Neng Hwang, “Lipreading from Color Video”, IEEE Transactions on Image Processing, Vol. 6, No.8, August 1997, pp.1192-1195
10. [Cootes95] T.F. Cootes, C.J. Taylor, D.H. Cooper, J. Graham, “Active shape models — Their training and application”, Computer Vision and Image Understanding 61, 1995, pp. 38-59.
11. [Edward96] T. Edward, Jr. Auer, E. Bernstein Lynne, “Lipreading Supplemented by Voice Fundamental Frequency: To What Extent Does The Addition of Voicing Increase Lexical Uniqueness for the Lipreader?”, Proceedings of Fourth International Conference on Spoken Language, vol.1, 1996, pp.86-89.
12. [Essa94] Irfan A. Essa, Trevor Darrell and Alex Pentland, “Tracking Facial Motion”, Proceedings of the IEEE Workshop on Nonrigid and Articulate Motion, Austin, Texas, November 1994
13. [Goldschen95] Alan J. Goldschen, Oscar N. Garcia, Eric Petajan, “Continuous Optical Automatic Speech Recognition by Lipreading”, IEEE Signals, Systems and Computers, vol. 1, 1994, pp.572-577.
14. [Grant91] P. M. Grant, “Speech recognition techniques”, Electronics & Communication Engineering Journal, Feb. 1991, pp. 37-48.
15. [Green96] K. P. Green, "Studies of the McGurk effect: implications for theories of speech perception", Proc. Fourth International Conference on Spoken Language, 1996, vol. 3, pp. 1652-1655.
16. [Hampshire90] John B. Hampshire, II, and Alexander H. Waibel, “A Novel Objective Function for Improved Phoneme Recognition Using Time-Delay Neural Networks”, IEEE Transactions on Neural Networks, Vol. 1, No. 2, June 1990.
17. [Huang98] Chung-Lin Huang, Wen-Yi Huang, “Sign language recognition using model-based tracking and a 3D Hopfield neural network”, Machine Vision and Applications (1998), 10, pp. 292-307
18. [Jolliffe86] I. T. Jolliffe, “Principal Component Analysis”, Springer-Verlag, 1986.
19. [Juang91] B. H. Juang, L. R. Rabiner, “Hidden Markov Models for Speech Recognition”, American Statistical Association and the American Society for Quality Control, TECHNOMETRICS, August 1991, Vol.33, No. 3, pp.251-272.
20. [Lavagetto97] Fabio Lavagetto, “Time-Delay Neural Networks for Estimating Lip Movements From Speech Analysis: A Useful Tool in Audio-Video Synchronization”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 7, No. 5, Oct. 1997, pp. 786-800.
21. [Luettin] Juergen Luettin, Neil A. Thacker and Steve W. Beet, “Visual Speech Recognition Using Active Shape Models and Hidden Markov Models”,
22. [Luettin96] Juergen Luettin, Neil A. Thacker, and Steve W. Beet, “Locating and tracking facial speech features”, Proceedings of the International Conference on Pattern Recognition (ICPR'96), 1996
23. [Luettin97] Juergen Luettin, “Towards Speaker Independent Continuous Speechreading”, Proceedings of the European Conference on Speech Communication and Technology (EUROSPEECH'97), 1997
24. [Luettin98] Juergen Luettin, Neil A. Thacker, “Speechreading using Probabilistic Models”, Computer Vision and Image Understanding, Vol 65, No. 2, February, pp.163-178, 1998
25. [Mak94] M. W. Mak, W. G. Allen, “A lip-tracking system based on morphological processing and block matching techniques”, Signal Processing: Image Communication 6 (1994), pp. 335-348.
26. [Mase91] K. Mase and A. Pentland, “Automatic lipreading by optical-flow analysis,” Syst. Comput. Jpn., vol 22, pp.67-76, 1991
27. [McGurk76] H. McGurk, J. MacDonald, “Hearing lips and seeing voices”, Nature, 264, 1976, pp. 746-748.
28. [Moghaddam95] Baback Moghaddam, Alex Pentland, “Probabilistic Visual Learning for Object Detection”, IEEE Proceedings, International Conference on Computer Vision, 1995, pp. 786-793.
29. [Murase96] Hiroshi Murase, Rie Sakai, “Moving object recognition in eigenspace representation: gait analysis and lip reading”, Pattern Recognition Letters 17 (1996), pp.155-162.
30. [Petajan96] E. Petajan, H. P. Graf, "Robust face feature analysis for automatic speechreading and character animation", Proceedings of the Second International Conference on Automatic Face and Gesture Recognition, 1996, pp. 357-362.
31. [Rabi97] Gihad Rabi, Si Wei Lu, “Energy Minimization for Extracting Mouth Curves in a Facial Image”, IEEE Proceedings on Intelligent Information Systems 1997 (IIS '97), pp. 381-385.
32. [Rabiner89] Lawrence R. Rabiner, “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition”, Proceedings of the IEEE, vol. 77, no. 2, Feb. 1989, pp. 257-286.
33. [Rao94] Ram R. Rao, Russel M. Mersereau, “Lip Modeling for Visual Speech Recognition”, IEEE Signals, Systems and Computers, 1994. 1994 Conference Record of the Twenty-Eighth Asilomar Conference, vol. 1, 1994, pp. 587-590.
34. [Silsbee96] Peter L. Silsbee, Alan C. Bovik, “Computer Lipreading for Improved Accuracy in Automatic Speech Recognition”, IEEE Transactions on Speech and Audio Processing, Vol. 4, No. 5, Sep. 1996.
35. [Sozou95] P.D. Sozou, T.F. Cootes, C.J. Taylor, E.C. Di Mauro, “A non-linear generalisation of PDMs using polynomial regression”, Image and Vision Computing 13 (5), 1995, pp. 451-457.
36. [Terzopoulos93] Demetri Terzopoulos, Keith Waters, “Analysis and Synthesis of Facial Image Sequences Using Physical and Anatomical Models”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 15, No. 6, June 1993, pp. 569-579
37. [Waibel89] Alexander Waibel, Toshiyuki Hanazawa, Geoffrey Hinton, Kiyohiro Shikano, and Kevin J. Lang, “Phoneme Recognition Using Time-Delay Neural Networks”, IEEE Transactions on Acoustics, Speech, and Signal Processing. Vol. 37, No. 3, March 1989.
38. [Yang96] J. Yang and A. Waibel, ``A real-time face tracker," Proceedings of WACV'96 , pp. 142-147 (Sarasota, Florida, USA)
39. [Yoshikawa96] H. Yoshikawa, J. Yokosato, S. Tanaka, "Synthesizing human motion in a CG authoring environment for nonprofessionals", Proceedings of the Third IEEE International Conference on Multimedia Computing and Systems, 1996, pp. 40-43.
40. [Yu97] Keren Yu, Xiaoyi Jiang, Horst Bunke, “Lipreading: A classifier combination approach”, Pattern Recognition Letters 18 (1997), pp. 1421-1426.
