
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

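Graduate student: 黃柏叡
Graduate student (English name): Bo-Ruei Huang
Thesis title (Chinese): 即時單字音辨識系統之設計
Thesis title (English): Design of On-Line Isolated Word Recognition Systems
Advisor: 陳永平
Advisor (English name): Yo-Ping Chen
Degree: Master's
Institution: National Chiao Tung University (國立交通大學)
Department: Department of Electrical and Control Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2002
Academic year of graduation: 90 (ROC calendar)
Language: Chinese
Number of pages: 53
Keywords (Chinese): 單音辨識; 隱藏式馬可夫模型; 即時
Keywords (English): isolated word recognition; Hidden Markov Model; real time or on-line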
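This thesis focuses on the design and application of an on-line isolated word recognition system. The system is built on the conventional hidden Markov model and combines it with a vector quantization algorithm that has a learning capability, so that the recognition rate rises while the amount of computation drops considerably; this shortens the response time needed for recognition and meets the requirements of a real-time speech recognition system. For speech learning, misrecognized utterances are used to correct the on-line database directly, a fast and effective way to raise the recognition rate. For speech training in the speaker-independent case, the Viterbi algorithm finds the best hidden state sequence; based on the states along that sequence, the k-means algorithm performs clustering to obtain new model parameters, reducing the originally large number of model parameters to an effective size.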
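The correction of the on-line database with misrecognized utterances can be illustrated with the textbook LVQ1 rule. The record does not give the thesis's actual update formula, so the following is only a minimal sketch under stated assumptions: the function name `lvq1_update`, the learning rate `lr`, the two-dimensional vectors, and the labels are all hypothetical.

```python
def lvq1_update(codebook, labels, x, y, lr=0.1):
    """Apply one LVQ1 step in place and return the index of the winning codeword.

    codebook: list of reference vectors (lists of floats)
    labels:   class label of each reference vector
    x, y:     a training vector and its true label
    lr:       learning rate (assumed value, not taken from the thesis)
    """
    # Squared Euclidean distance from x to every codeword.
    dists = [sum((c - v) ** 2 for c, v in zip(code, x)) for code in codebook]
    k = dists.index(min(dists))  # nearest codeword wins
    # Pull the winner toward x if its label is correct, push it away otherwise.
    sign = 1.0 if labels[k] == y else -1.0
    codebook[k] = [c + sign * lr * (v - c) for c, v in zip(codebook[k], x)]
    return k


# Hypothetical example: the nearest codeword carries the wrong label,
# so it is pushed away from the misclassified sample.
codebook = [[0.0, 0.0], [1.0, 1.0]]
labels = ["yes", "no"]
winner = lvq1_update(codebook, labels, x=[0.2, 0.0], y="no", lr=0.5)
```

Only the winning codeword changes on each step, which keeps the cost of an on-line correction small.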
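Besides isolated word recognition, on-line connected digit recognition is also introduced briefly. The connection points in continuous speech are found with a one-state algorithm based on dynamic time warping, from which the recognition result is obtained. When this algorithm is used for connected digit recognition, the model parameters in the database strongly affect the result, so the training corpus must be chosen carefully.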

This thesis addresses the design and application of on-line isolated word recognition systems. Combining the traditional hidden Markov model (HMM) with learning vector quantization (LVQ) both raises recognition accuracy and reduces computation, which in turn shortens the response time of the recognizer. For learning, misclassified patterns are used to adjust the on-line database directly, a fast and effective way to improve recognition accuracy. For training in the speaker-independent case, the Viterbi algorithm is used to find the best state sequence, and the k-means algorithm then clusters the mean and variance vectors in each state in order to reduce the number of models in the database.
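The Viterbi step mentioned in the abstract can be sketched as follows. This is an illustration, not the thesis's implementation: it assumes a discrete-observation HMM for brevity (the table of contents indicates the thesis uses continuous observations), and the function name `viterbi` and the toy two-state left-to-right model are hypothetical.

```python
import math


def viterbi(obs, start_p, trans_p, emit_p):
    """Return (best state path, its log-probability) for a discrete HMM."""
    n_states = len(start_p)

    def log(p):
        # Map zero probabilities to -inf so impossible paths are never chosen.
        return math.log(p) if p > 0 else float("-inf")

    # Initialization with the first observation.
    score = [log(start_p[s]) + log(emit_p[s][obs[0]]) for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log(trans_p[r][s]))
            ptr.append(best)
            score.append(prev[best] + log(trans_p[best][s]) + log(emit_p[s][o]))
        back.append(ptr)

    # Backtrack from the best final state.
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, score[last]


# Hypothetical left-to-right model with two states and two symbols.
start_p = [1.0, 0.0]
trans_p = [[0.5, 0.5], [0.0, 1.0]]
emit_p = [[0.9, 0.1], [0.2, 0.8]]
path, logp = viterbi([0, 0, 1, 1], start_p, trans_p, emit_p)
print(path)  # [0, 0, 1, 1]
```

The returned path assigns every frame to a state, which is the segmentation that a per-state clustering step such as k-means could then operate on.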
In addition, connected digit recognition is introduced conceptually. The one-state algorithm, based on dynamic time warping, is used to recognize connected digits. With this algorithm, however, the models in the database strongly influence the recognition result, so the training data must be chosen carefully.
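Dynamic time warping, on which the connected-digit algorithm is based, can be sketched for the simpler case of two whole sequences. This is not the connected-word algorithm itself. The function name `dtw_distance`, the scalar features, and the symmetric step pattern are assumptions made for illustration.

```python
def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Accumulated cost of the best monotonic alignment between a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            # Allowed predecessors: advance in a only, in b only, or in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# A time-stretched copy aligns at zero cost; a shifted pattern does not.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
print(dtw_distance([1, 2, 3], [2, 3, 4]))     # 2.0
```

In a recognizer the scalars would be replaced by feature vectors and `dist` by a suitable frame distance.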

Contents
Chinese Abstract
English Abstract
Contents
List of Tables
List of Figures
1. Introduction
2. Basic Concepts for On-Line Speech Recognition
2.1 The Framework of Speech Recognition Systems
2.2 The Real Time Speech Recognition Systems
3. Pre-Processing for Speech Recognition
3.1 Speech Detection
3.2 Feature Extraction
3.2.1 The LPC Model
3.2.2 Short-Term Estimates of Autocorrelation
3.2.3 LPC Analysis Equations
3.2.4 LPC Processor for Speech Recognition
4. Related Hybrid Applications of HMM for Isolated Word Recognition Systems
4.1 Hidden Markov Model
4.1.1 Elements of HMM
4.1.2 The Three Basic Problems for HMMs
4.1.3 Continuous Observation of HMM
4.1.4 Procedures of Model-Training and Recognition in HMM
4.2 Combination of HMM and Learning Vector Quantization
4.2.1 Basic Concepts of the Hybrid Algorithm
4.2.2 Procedures of Model-Training and Recognition in HMM-LVQ
4.3 Applications of Speaker Adaptation
4.3.1 Basic Concepts of Speaker Adaptation
4.3.2 Bayesian Adaptation of the Parameters in HMM
4.3.3 Speaker Adaptation Based on the HMM-LVQ Hybrid Algorithm
5. Application of Connected Digits Recognition Systems
5.1 The One-State Algorithm Based on Dynamic Time Warping
5.1.1 Dynamic Time Warping
5.1.2 The One-State Algorithm
6. Experimental Results
6.1 Speaker Dependent Systems for IWR
6.2 Speaker Independent Systems for IWR
6.3 Speaker Dependent Systems for CWR
7. Conclusion and Future Work
BIBLIOGRAPHY

