National Digital Library of Theses and Dissertations in Taiwan

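Author: Kuei-Ting Kuo (郭桂廷)
Thesis title: GA-Based Optimization of Temporal Filter for Robust Speech Recognition
Thesis title (Chinese): 基於基因遺傳演算法最佳時序濾波器之應用強健性語音辨識
Advisor: Gin-Der Wu (吳俊德)
Degree: Master's
Institution: National Chi Nan University (國立暨南國際大學)
Department: Department of Electrical Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2006
Graduation academic year: 95
Language: English
Number of pages: 38
Keywords (Chinese): 基因遺傳演算法; 強健性語音辨識; 梅爾倒頻譜
Keywords: genetic algorithms (GA); robust speech recognition; Mel-frequency cepstral coefficients (MFCC)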
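This thesis proposes a robust speech recognition technique for noisy environments and applies it in a speech recognition system. The system uses Mel-frequency cepstral coefficients (MFCC) as speech features and Hidden Markov Models (HMM) as the recognition algorithm. Because temporal filters can improve the results of a speech recognition system, finding the optimal temporal filters is the key factor that determines recognition quality. We therefore use genetic algorithms (GA) to dynamically select the best temporal filters and thereby obtain robust speech features. In this system, the GA treats each set of 20 temporal filters as one chromosome, with 10 chromosomes in the genetic population; following the rules of genetic evolution, the GA selects the 20 best-suited temporal filters with Mel-spectrum characteristics. Finally, we use the system to recognize the Chinese digits (0-9), with 20 speakers providing 2000 speech files in total; half of the data serve as reference utterances and the other half as test utterances. In the experiments, a recognition accuracy of 44.5% is obtained in a 0 dB noise environment.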
This thesis proposes a new robust speech recognition technique for noisy environments. Feature extraction is based on Mel-frequency cepstral coefficients (MFCC), and template matching employs Hidden Markov Models (HMM). Since the performance of speech recognition can be improved by using temporal filters, we focus on optimizing these filters. We adopt genetic algorithms (GA) to dynamically select suitable temporal filters in order to obtain robust MFCC features. The Mel-scale filter bank has 20 triangular filters in total, so there are 20 corresponding temporal filters, which are encoded into one chromosome. We use 10 chromosomes in the genetic population. Finally, we run experiments on Chinese digit words (0-9) from 20 speakers, each of whom speaks every digit 10 times. Half of the speakers provide the reference data and the other half provide the test data. The recognition rate reaches 44.5% at 0 dB SNR.
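The encoding described in the abstract (20 temporal filters per chromosome, 10 chromosomes per population) can be illustrated with a minimal sketch. Only those two numbers come from the abstract. Everything else is an assumption made for illustration: the FIR form and length of each filter (`NUM_TAPS`), roulette-wheel reproduction, single-point crossover, elitism, and the placeholder `toy_fitness`. In the thesis the fitness would presumably be tied to recognition performance, which is not reproduced here.

```python
import random

NUM_FILTERS = 20  # one temporal filter per Mel band (from the abstract)
POP_SIZE = 10     # chromosomes in the genetic population (from the abstract)
NUM_TAPS = 5      # assumed FIR length; the abstract does not state it


def random_chromosome(rng):
    """One chromosome = 20 temporal filters, each a list of FIR taps."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(NUM_TAPS)]
            for _ in range(NUM_FILTERS)]


def apply_temporal_filters(chromosome, band_energies):
    """Filter each band's time trajectory with that band's own FIR filter.

    band_energies[b][t] is the (log) energy of Mel band b at frame t.
    """
    filtered = []
    for taps, track in zip(chromosome, band_energies):
        out = []
        for t in range(len(track)):
            acc = 0.0
            for k, h in enumerate(taps):
                if t - k >= 0:
                    acc += h * track[t - k]
            out.append(acc)
        filtered.append(out)
    return filtered


def select(population, scores, rng):
    """Roulette-wheel reproduction (assumed); scores must be positive."""
    return rng.choices(population, weights=scores, k=1)[0]


def crossover(a, b, rng):
    """Single-point crossover at a filter boundary (assumed)."""
    cut = rng.randrange(1, NUM_FILTERS)
    return [list(f) for f in a[:cut]] + [list(f) for f in b[cut:]]


def mutate(chromosome, rate, rng):
    """Perturb each tap with probability `rate` (assumed Gaussian noise)."""
    return [[h + rng.gauss(0.0, 0.1) if rng.random() < rate else h
             for h in taps] for taps in chromosome]


def toy_fitness(chromosome):
    """Hypothetical stand-in: closeness to a 5-tap moving-average filter."""
    error = sum((h - 1.0 / NUM_TAPS) ** 2
                for taps in chromosome for h in taps)
    return 1.0 / (1.0 + error)


def evolve(fitness, generations=30, mutation_rate=0.05, seed=0):
    """Run the GA; return the best chromosome and best score per generation."""
    rng = random.Random(seed)
    population = [random_chromosome(rng) for _ in range(POP_SIZE)]
    history = []
    for _ in range(generations):
        scores = [fitness(c) for c in population]
        best = max(range(POP_SIZE), key=scores.__getitem__)
        history.append(scores[best])
        next_pop = [population[best]]  # elitism: keep the current best
        while len(next_pop) < POP_SIZE:
            child = crossover(select(population, scores, rng),
                              select(population, scores, rng), rng)
            next_pop.append(mutate(child, mutation_rate, rng))
        population = next_pop
    scores = [fitness(c) for c in population]
    best = max(range(POP_SIZE), key=scores.__getitem__)
    history.append(scores[best])
    return population[best], history
```

Calling `evolve(toy_fitness)` returns one chromosome of 20 filters together with the best fitness seen in each generation. Because the sketch keeps the best chromosome unchanged from one generation to the next, that sequence never decreases. Swapping `toy_fitness` for a function that scores a chromosome by recognition accuracy on held-out noisy utterances would bring the sketch closer to the setup the abstract describes.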
Acknowledgments
Abstract in Chinese
Abstract in English
Contents
List of figures
List of tables
1. Introduction
1.1 Motivation
1.2 Introduction of noisy speech recognition
1.3 Introduction of GA
1.4 Thesis organization
2. Review Speech Recognition
2.1 Speech Feature Extraction
2.1.1 Pre-emphasis
2.1.2 Frame and Windowing
2.1.3 Fast Fourier Transform
2.1.4 Mel-frequency cepstral coefficients (MFCC)
2.2 Recognition algorithm: Hidden Markov Model
2.2.1 Forward and backward procedure
2.2.2 Viterbi algorithm
3. Robust Speech Feature
3.1 Temporal filter
3.2 Genetic algorithms (GA)
3.2.1 Reproduction
3.2.2 Crossover
3.2.3 Mutation
3.3 Implement
3.3.1 Coding
3.3.2 Initialization
3.3.3 Fitness assignment
3.3.4 Reproduction
3.3.5 Crossover
3.3.6 Mutation
3.3.7 Ending
4. Experiment and Results
4.1 Specification
4.2 Experiments
4.2.1 Experiment 1
4.2.2 Experiment 2
4.3 Result
4.3.1 Result of Experiment 1
4.3.2 Result of Experiment 2
5. Conclusions
Bibliography
[1] Rabiner L. and Juang B.H., Fundamentals of Speech Recognition, Prentice Hall Inc., New Jersey, 1993.
[2] Boll S., "Suppression of acoustic noise in speech using spectral subtraction", IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 27, Issue 2, Apr. 1979, pp. 113-120.
[3] Lockwood P. and Boudy J., "Experiments with a Nonlinear Spectral Subtractor (NSS), Hidden Markov Models and the projection, for Robust Speech Recognition in Cars", Eurospeech 1991.
[4] Neumeyer L. and Weintraub M., "Probabilistic optimum filtering for robust speech recognition", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-94), Vol. 1, April 1994, pp. 417-420.
[5] Mansour D. and Juang B.H., "The short-time modified coherence representation and noisy speech recognition", IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 37, Issue 6, June 1989, pp. 795-804.
[6] Hermansky H. and Morgan N., "RASTA processing of speech", IEEE Transactions on Speech and Audio Processing, Vol. 2, Issue 4, Oct. 1994, pp. 578-589.
[7] Varga A.P. and Moore R.K., "Hidden Markov model decomposition of speech and noise", International Conference on Acoustics, Speech, and Signal Processing (ICASSP-90), Vol. 2, April 1990, pp. 845-848.
[8] Moreno P.J., Raj B. and Stern R.M., "A vector Taylor series approach for environment-independent speech recognition", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-96), Vol. 2, May 1996, pp. 733-736.
[9] Sankar A. and Lee C.H., "Robust speech recognition based on stochastic matching", International Conference on Acoustics, Speech, and Signal Processing (ICASSP-95), Vol. 1, May 1995, pp. 121-124.
[10] Gales M.J.F. and Young S.J., "Robust continuous speech recognition using parallel model combination", IEEE Transactions on Speech and Audio Processing, Vol. 4, Issue 5, Sept. 1996, pp. 352-359.
[11] Goldberg D., Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, Reading, MA, 1989.
[12] Fogel D.B. and Ghozeil A., "Using fitness distributions to design more efficient evolutionary computations", Proceedings of IEEE International Conference on Evolutionary Computation, May 1996, pp. 11-19.
[13] Rabiner L.R. and Juang B.H., Fundamentals of Speech Recognition, Prentice Hall, 1993.
[14] Lin C.T., Nein H.W. and Hwu J.Y., "GA-based noisy speech recognition using two-dimensional cepstrum", IEEE Transactions on Speech and Audio Processing, Vol. 8, Issue 6, Nov. 2000, pp. 664-675.