National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

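Graduate student: 吳爵全
Graduate student (English): Wu, Chueh-Chuan
Thesis title: 基於前後影格之特徵向量為基礎的視素分析
Thesis title (English): Viseme Clustering Based on the Variation of Features Between Frames
Advisor: 林義凱
Advisor (English): Lin, Yih-Kai
Oral defense committee: 楊政興, 陳建良
Oral defense committee (English): Yang, Cheng-Hsing; Chen, Chien-Liang
Oral defense date: 2014-07-14
Degree: Master's
Institution: National Pingtung University of Education (國立屏東教育大學)
Department: Master's Program, Department of Computer Science (資訊科學系碩士班)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Publication year: 2014
Graduation academic year: 102 (ROC calendar)
Language: Chinese
Number of pages: 33
Keywords (Chinese): 語音辨識, 最長共同子序列, 視素, 隱藏式馬可夫模型
Keywords (English): speech recognition, longest common subsequences, viseme, hidden Markov models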
A viseme is the visual counterpart of a basic unit of pronunciation. Identifying the viseme that corresponds to each spoken syllable is useful for synthesizing mouth animation. Research on visual speech recognition has grown in recent years. This thesis examines the relationship between lip shape and speech and, taking into account the relationship between preceding and following sounds, defines a set of Taiwanese visemes.

The work proceeds in two stages. The first is preprocessing: the prepared audio files are taken as input, phonetically transcribed, and the lip-shape coordinates are extracted. In the second stage, the lip feature points are converted into a character (symbol) representation according to how the lip shape changes between consecutive frames, and the resulting sequences are clustered using the longest common subsequence (LCS) and hierarchical clustering.
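The second stage described in the abstract can be illustrated with a minimal sketch. The abstract does not give the symbol alphabet, the distance definition, or the linkage rule, so everything below is an assumption made for illustration: a three-symbol alphabet (`U`, `D`, `S`) derived from one hypothetical lip measurement per frame, a distance of 1 minus the normalized LCS length, and single-linkage agglomerative clustering with a cutoff. The function names are hypothetical as well.

```python
from itertools import combinations


def encode_frames(values, threshold=1.0):
    """Turn frame-to-frame changes of one lip measurement into symbols.

    Assumed alphabet (not from the thesis): 'U' = increase,
    'D' = decrease, 'S' = change within +/- threshold.
    """
    symbols = []
    for prev, cur in zip(values, values[1:]):
        delta = cur - prev
        if delta > threshold:
            symbols.append("U")
        elif delta < -threshold:
            symbols.append("D")
        else:
            symbols.append("S")
    return "".join(symbols)


def lcs_length(a, b):
    """Length of the longest common subsequence (row-by-row dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def lcs_distance(a, b):
    """Assumed distance: 1 - LCS length / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return 1.0 - lcs_length(a, b) / longest


def cluster(strings, cutoff):
    """Single-linkage agglomerative clustering; stops when the closest
    pair of clusters is farther apart than `cutoff`."""
    clusters = [[s] for s in strings]

    def linkage(c1, c2):
        return min(lcs_distance(x, y) for x in c1 for y in c2)

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[i], clusters[j]) > cutoff:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters
```

For example, `cluster(["UUSD", "UUSS", "DDSU", "DDSS"], cutoff=0.3)` groups the two sequences that start with `UUS` together and the two that start with `DDS` together. A real pipeline would encode several lip feature points per frame and might use a different distance, such as the correlation distance listed in the table of contents.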
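Acknowledgements
Chinese abstract
English abstract
Table of contents
List of figures
List of tables

1. Introduction
1.1 Research background
1.2 Research method
1.3 Organization of the thesis

2. Literature review

3. Preliminaries
3.1 Hidden Markov Model
3.2 Longest Common Subsequences
3.3 Hierarchical Clustering
3.4 Correlation distance

4. Research method
4.1 Overview of the procedure

5. Experimental results

6. Conclusions and future work

References

Appendix A: 624 Taiwanese speech syllables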
[1] J. Jeffers and M. Barley, Speechreading (Lipreading), Charles C Thomas Pub Ltd, 1971.
[2] C. Neti, G. Potamianos, J. Luettin, I. Matthews, H. Glotin, D. Vergyri, S. Sison, A. Mashari, and J. Zhou, "Audio-visual speech recognition," Tech. Rep., October 12, 2000.
[3] T. J. Hazen, K. Saenko, C. H. La, and J. R. Glass, "A segment-based audio-visual speech recognizer: data collection, development, and initial experiments," in Proceedings of the 6th International Conference on Multimodal Interfaces, State College, PA, USA: ACM, 2004, pp. 235-242.
[4] E. Bozkurt, Ç. Eroğlu Erdem, E. Erzin, T. Erdem, and M. Ozkan, "Comparison of phoneme and viseme based acoustic units for speech driven realistic lip animation," in 3DTV Conference, 2007, pp. 1-4.
[5] N. C. Jones and P. A. Pevzner, An Introduction to Bioinformatics Algorithms, The MIT Press, 2004.
[6] L. R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition," Proceedings of the IEEE, vol. 77, no. 2, 1989, pp. 257-286.
[7] A. S. Deshpande, D. S. Richards, and W. R. Pearson, "A Platform for Biological Sequence Comparison on Parallel Computers," Computer Applications in the Biosciences, vol. 7, no. 2, 1991, pp. 237-247.
[8] K. M. Chao, "Dynamic-programming strategies for analyzing biomolecular sequences," Department of Computer Science and Information Engineering, National Taiwan University, February 2003.
[9] W. R. Pearson and W. Miller, "Dynamic Programming Algorithms for Biological Sequence Comparison," Methods in Enzymology, vol. 210, 1992, pp. 575-601.
[10] W. R. Pearson and D. J. Lipman, "Improved Tools for Biological Sequence Comparison," Proc. Natl. Acad. Sci. USA, vol. 85, April 1988, pp. 2444-2448.
[11] A. Apostolico, S. Browne, and C. Guerra, "Fast Linear-Space Computations of Longest Common Subsequences," Theoretical Computer Science, vol. 92, no. 1, January 1992, pp. 3-17.
[12] A. Apostolico and C. Guerra, "A Fast Linear Space Algorithm Computing Longest Common Subsequences," in Proc. of the 23rd Allerton Conf., Monticello, IL, 1985, pp. 76-84.
[13] A. Apostolico, M. J. Atallah, L. L. Larmore, and S. McFaddin, "Efficient Parallel Algorithms for String Editing and Related Problems," SIAM Journal on Computing, vol. 19, no. 5, October 1990, pp. 968-988.
[14] M. Lu and H. Lin, "Parallel Algorithms for the Longest Common Subsequence Problem," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 8, August 1994, pp. 835-848.
[15] Y. C. Lin, "New systolic arrays for the longest common subsequence problem," Parallel Computing, vol. 20, no. 9, September 1994, pp. 1323-1334.
[16] G. Luce and J. F. Myoupo, "An Efficient Linear Systolic Algorithm for Recovering Longest Common Subsequences," IEEE First International Conference on Algorithms and Architectures for Parallel Processing, vol. 1, April 1995, pp. 20-29.
[17] C. W. Wang, "An Implementable and Efficient Systolic Algorithm for the Longest Common Subsequence Problem," National Taiwan University of Science and Technology, 2006.

