National Digital Library of Theses and Dissertations in Taiwan

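Author: 楊永泰
Author (English): Yung-Tai Yang
Thesis title: 隱藏式馬可夫模型應用於中文語音辨識之研究
Thesis title (English): The Research of Hidden Markov Model Applied on Mandarin Recognition
Advisor: 杜筑奎
Advisor (English): Chu-Kuei Tu
Degree: Master's
Institution: Chung Yuan Christian University (中原大學)
Department: Department of Information Engineering (資訊工程學系)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2000
Graduation academic year: 88 (ROC calendar)
Language: Chinese
Pages: 54
Keywords (translated from Chinese): digital signal processing; hidden Markov model; speaker adaptation; Bayesian adaptation
Keywords (English): digital signal processing; HMM; speaker-adaptive; Bayesian adaptation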
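As computers have grown more capable and the Internet has become widespread, many tasks can now be completed easily by computer. Using speech as the input method means users need not memorize any rules, which makes Chinese input simpler and should greatly narrow the gap between people and computers. Computer processing of speech signals has therefore become a focus of research.
A speaker-adaptive recognition system is trained on a small amount of speech data from a new speaker and can reach a recognition rate close to that of a speaker-dependent system. This thesis uses digital signal processing techniques to extract feature parameters from the speech signal, builds its recognition algorithm on the continuous density hidden Markov model (CDHMM), and applies Bayesian adaptation to the training of a speaker-independent system that is then adapted to new speakers.
For feature selection, mel spectral coefficients are the main features, and a separate continuous density hidden Markov model is built for each speech unit. Recognition uses the Viterbi algorithm to find the most probable result. For speaker adaptation, a suitable existing model is chosen as the base reference model, and the new speaker's speech samples are used to adapt it by combining Bayesian adaptation with the segmental k-means training algorithm, which raises the recognition rate for the new speaker.
The system was tested and evaluated on a personal computer running Windows 98, yielding a practical, working speech recognition system.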
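The Bayesian adaptation step described above can be illustrated for a single Gaussian mean. The sketch below is not taken from the thesis. It assumes hard frame-to-state assignments (as a segmental k-means pass produces), a prior mean taken from the reference model, and a hypothetical prior weight `tau`; the function name `map_adapt_mean` is illustrative only.

```python
def map_adapt_mean(prior_mean, frames, tau=10.0):
    """MAP (Bayesian) estimate of one Gaussian mean vector.

    prior_mean: mean from the reference model (list of floats).
    frames:     new-speaker feature vectors assigned to this state.
    tau:        assumed prior weight; larger values trust the
                reference model more (hypothetical default).
    """
    n = len(frames)
    if n == 0:
        # No adaptation data for this state: keep the reference mean.
        return list(prior_mean)
    dim = len(prior_mean)
    # Per-dimension sum of the adaptation frames.
    sums = [sum(frame[d] for frame in frames) for d in range(dim)]
    # Weighted interpolation between the prior mean and the sample mean.
    return [(tau * prior_mean[d] + sums[d]) / (tau + n) for d in range(dim)]


if __name__ == "__main__":
    reference_mean = [0.0, 0.0]
    new_speaker_frames = [[2.0, 4.0], [4.0, 8.0]]
    print(map_adapt_mean(reference_mean, new_speaker_frames, tau=2.0))
```

With few frames the estimate stays near the reference mean; as more frames from the new speaker arrive, it moves toward their sample mean, which is why a small amount of adaptation data can already help.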
As computers grow more capable and networks more prevalent, many jobs can be finished easily by computer. With speech as computer input, users need not memorize any rules to use the computer. This is friendlier for people who are not experienced with computers and lets them communicate with the computer easily; speech signal processing has therefore become an important research topic.
A speaker-adaptive system makes use of the existing knowledge contained in a reliably trained reference system, so that a small amount of training data is sufficient to approach the performance of a speaker-dependent system. This thesis studies the extraction of speech feature parameters using digital signal processing technology. The continuous density hidden Markov model (CDHMM) is used as the basis of the speech recognition system, and the Bayesian adaptation technique is applied to adapt a speaker-independent system.
In the feature extraction stage, the mel spectrum is used to compute the feature parameters of the speech. CDHMMs are then used to build all the Mandarin models. In recognition, the Viterbi algorithm finds the most probable result under each HMM. In the speaker-adaptation stage, a suitable existing model is chosen as the base reference model, and the Bayesian training algorithm is integrated into the segmental k-means training procedure, which improves the performance of the speaker-adaptive system when a new speaker uses it.
Finally, a working speech recognition system was built, trained, and tested under the Windows 98 operating system.
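The Viterbi recognition step mentioned in the abstract can be sketched as follows. This is a minimal log-domain illustration, not the thesis implementation: it assumes the per-frame state log-likelihoods have already been computed (for a CDHMM they would come from the Gaussian mixture densities), and the names `viterbi` and `safe_log` are hypothetical.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Log of a probability, mapping 0 to -inf."""
    return math.log(p) if p > 0 else NEG_INF


def viterbi(log_pi, log_a, log_b):
    """Return (best log-probability, best state path).

    log_pi[i]:   log initial probability of state i.
    log_a[i][j]: log transition probability from state i to state j.
    log_b[t][j]: log-likelihood of frame t in state j
                 (assumed precomputed from the observation densities).
    """
    n_states = len(log_pi)
    # Initialisation with the first frame.
    delta = [log_pi[j] + log_b[0][j] for j in range(n_states)]
    backptr = []
    # Recursion: keep the best predecessor for every state at every frame.
    for t in range(1, len(log_b)):
        new_delta, pointers = [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: delta[i] + log_a[i][j])
            new_delta.append(delta[best_i] + log_a[best_i][j] + log_b[t][j])
            pointers.append(best_i)
        delta = new_delta
        backptr.append(pointers)
    # Termination and backtracking.
    last = max(range(n_states), key=lambda j: delta[j])
    path = [last]
    for pointers in reversed(backptr):
        path.append(pointers[path[-1]])
    path.reverse()
    return delta[last], path


if __name__ == "__main__":
    # Toy two-state left-to-right model with three frames.
    log_pi = [safe_log(1.0), safe_log(0.0)]
    log_a = [[safe_log(0.5), safe_log(0.5)], [safe_log(0.0), safe_log(1.0)]]
    log_b = [[safe_log(p) for p in row]
             for row in [[0.9, 0.1], [0.2, 0.8], [0.1, 0.9]]]
    print(viterbi(log_pi, log_a, log_b))
```

In a recognizer of the kind described, one such score would be computed per candidate model and the model with the highest score chosen; working in the log domain avoids numerical underflow on long utterances.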
Chapter 1 Introduction
1.1 Research Motivation and Objectives
1.2 Research Method
1.3 Overview of Chapters
Chapter 2 Theoretical Foundations of Speech Recognition and Adaptation
2.1 Speech Signal Preprocessing
2.1.1 Energy Measurement
2.1.2 Zero-Crossing Rate
2.1.3 Speech Signal Segmentation
2.2 Speech Signal Features and Feature Parameter Extraction
2.2.1 Mel Cepstral Parameters
2.3 Hidden Markov Models
2.4 Building a Hidden Markov Model
2.4.1 Probability Computation
2.4.2 Forward Procedure
2.4.3 Backward Procedure
2.5 Viterbi Algorithm
2.6 Parameter Reestimation
2.7 Speaker Adaptation
2.7.1 Bayesian Adaptation
Chapter 3 Building the Speech Recognition and Adaptation System
3.1 Speech Feature Parameter Extraction
3.1.1 Mel Spectrum
3.2 Building Hidden Markov Models of Speech Signals
3.3 Hidden Markov Model Recognition Procedure
3.4 Building the Speaker Adaptation System
Chapter 4 Experimental Results and Comparison
4.1 Effect of the Number of Speech Samples on the Recognition System
4.2 Effect of the Number of HMM States on the Recognition System
4.3 Speaker Adaptation System
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References