Student: 李胥瑜
Student (English name): Xu-Yu Li
Title (Chinese): 結合基因演算法及經驗模態分解進行強健性語音辨識與FPGA晶片實現
Title (English): The FPGA Implementation of Robust Speech Recognition System by Combining Genetic Algorithm and Empirical Mode Decomposition
Advisor: 潘欣泰
Advisor (English name): Shing-Tai Pan
Degree: Master's
Institution: National University of Kaohsiung
Department: Department of Computer Science and Information Engineering (Master's Program)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis Type: Academic thesis
Publication Year: 2009
Graduation Academic Year: 98
Language: Chinese
Number of Pages: 69
Keywords (Chinese): Empirical Mode Decomposition, Genetic Algorithm, Integer Fourier Transform, Speech Recognition, Hidden Markov Model, FPGA
Keywords (English): EMD, FPGA, GA, integer fast Fourier transform, speech recognition, HMM
The main goal of this thesis is to implement a robust speech recognition system on an FPGA platform. To speed up speech recognition so that it is suitable for slower embedded platforms, an integer fast Fourier transform is used in place of the floating-point FFT, greatly increasing the speed of the FFT computation at the cost of only a slight drop in recognition accuracy. For noisy speech, the Empirical Mode Decomposition (EMD) method proposed by Norden E. Huang is used to separate the noisy speech signal into several Intrinsic Mode Functions (IMFs). Recognition experiments are performed on the individual IMFs to explore the relationship between the IMFs and the noise in the speech, and a real-coded genetic algorithm is used to find weights for the IMFs. After the separated IMFs are recombined into a speech signal according to these weights, the interference from noise is reduced and the noise robustness of the speech recognition system is improved.
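As a rough illustration of the integer-FFT idea described above, the following Python sketch runs a radix-2 FFT in which the twiddle factors are quantized to fixed-point integers and every butterfly product is rescaled by a bit shift, then compares the result against a floating-point FFT. This is only a minimal sketch of the general fixed-point approach, not necessarily the exact integer-FFT algorithm adopted in the thesis; the fractional-bit count Q, the input scale, and the test frame are arbitrary example choices.

```python
import numpy as np

Q = 14  # fractional bits for the quantized twiddle factors (example choice)


def int_fft(x):
    """Radix-2 decimation-in-time FFT on fixed-point samples.

    Each sample is a (real, imag) pair of Python ints.  Twiddle factors are
    quantized to Q fractional bits, and each butterfly product is shifted
    back by Q bits so every intermediate value stays an integer.
    """
    n = len(x)
    if n == 1:
        return x
    even = int_fft(x[0::2])
    odd = int_fft(x[1::2])
    out = [None] * n
    for k in range(n // 2):
        w = np.exp(-2j * np.pi * k / n)
        wr = int(round(w.real * (1 << Q)))   # quantized twiddle, real part
        wi = int(round(w.imag * (1 << Q)))   # quantized twiddle, imag part
        odd_r, odd_i = odd[k]
        # integer complex multiply (twiddle * odd term), rescaled by Q bits
        tr = (wr * odd_r - wi * odd_i) >> Q
        ti = (wr * odd_i + wi * odd_r) >> Q
        even_r, even_i = even[k]
        out[k] = (even_r + tr, even_i + ti)
        out[k + n // 2] = (even_r - tr, even_i - ti)
    return out


# Compare against the floating-point FFT on a toy "speech frame".
n = 256
t = np.arange(n)
frame = np.sin(2 * np.pi * 0.05 * t) + 0.3 * np.sin(2 * np.pi * 0.2 * t)

scale = 1 << 12                               # input quantization (example choice)
x_int = [(int(round(v * scale)), 0) for v in frame]

approx = np.array([re + 1j * im for re, im in int_fft(x_int)]) / scale
exact = np.fft.fft(frame)
print("max spectrum error vs. floating-point FFT:", np.max(np.abs(approx - exact)))
```

On an embedded target the same structure would use fixed word lengths and hardware shifts instead of arbitrary-precision Python integers, which is where the speed advantage over floating-point arithmetic comes from.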
In this thesis, we propose a robust speech recognition system based on HMMs; the system uses an integer FFT to improve computing performance and has been implemented on an FPGA-based embedded system. In an embedded system, the computing speed is not as fast as on a personal computer, which causes speech recognition to consume more computation time and power; furthermore, the requirement of real-time recognition might not be met. In the proposed method, we use the integer FFT in place of the floating-point FFT, and the experimental results show that much computing time can be saved while only a little recognition accuracy is lost. In addition, we apply Empirical Mode Decomposition to divide speech signals into multiple Intrinsic Mode Functions (IMFs) and analyze the correlation between the speech signal and the noise in each IMF. Finally, we use a genetic algorithm to find the optimal composing weights and then use them to recombine the IMFs into a speech signal for speech recognition.
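The core of the proposed noise-robust front end, recombining the IMFs with weights found by a real-coded genetic algorithm, can be sketched as below. This is a minimal illustration under stated assumptions: the IMFs are assumed to be already given as the rows of a NumPy array (produced by any EMD implementation), the fitness function here scores a reconstruction by its closeness to a clean reference signal rather than by the recognition rate of the HMM recognizer actually used in the thesis, and the population size, crossover rate, and mutation rate are arbitrary example values.

```python
import numpy as np

rng = np.random.default_rng(0)


def reconstruct(weights, imfs):
    """Weighted sum of the IMFs (rows of `imfs`) back into one signal."""
    return weights @ imfs


def fitness(weights, imfs, clean_ref):
    """Stand-in fitness: negative MSE to a clean reference signal.
    The thesis instead evaluates weights by the recognition accuracy
    of the HMM recognizer on the reconstructed speech."""
    err = reconstruct(weights, imfs) - clean_ref
    return -np.mean(err ** 2)


def ga_imf_weights(imfs, clean_ref, pop_size=30, generations=100,
                   crossover_rate=0.8, mutation_rate=0.1):
    """Minimal real-coded GA: tournament selection, arithmetic crossover,
    and Gaussian mutation over the IMF weight vector."""
    n_imf = imfs.shape[0]
    pop = rng.uniform(0.0, 1.0, size=(pop_size, n_imf))   # initial weights in [0, 1]
    for _ in range(generations):
        scores = np.array([fitness(ind, imfs, clean_ref) for ind in pop])
        new_pop = [pop[np.argmax(scores)].copy()]          # elitism: keep the best
        while len(new_pop) < pop_size:
            a, b = rng.integers(pop_size, size=2)          # tournament selection
            p1 = pop[a] if scores[a] > scores[b] else pop[b]
            a, b = rng.integers(pop_size, size=2)
            p2 = pop[a] if scores[a] > scores[b] else pop[b]
            child = p1.copy()
            if rng.random() < crossover_rate:              # arithmetic crossover
                alpha = rng.random()
                child = alpha * p1 + (1.0 - alpha) * p2
            mask = rng.random(n_imf) < mutation_rate       # Gaussian mutation
            child[mask] += rng.normal(0.0, 0.1, size=mask.sum())
            new_pop.append(np.clip(child, 0.0, 1.0))
        pop = np.array(new_pop)
    scores = np.array([fitness(ind, imfs, clean_ref) for ind in pop])
    return pop[np.argmax(scores)]


# Toy demo: two fake "IMFs" (one noise-like, one clean tone); the GA should
# learn a small weight for the noisy component and a large one for the clean one.
t = np.linspace(0, 1, 8000)
clean = np.sin(2 * np.pi * 440 * t)
noise = 0.5 * rng.standard_normal(t.size)
imfs = np.vstack([noise, clean])
best_w = ga_imf_weights(imfs, clean)
print("learned IMF weights:", np.round(best_w, 2))
```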
Chapter 1 Introduction 1
1.1 Research Motivation 1
1.2 Research Methods 2
Chapter 2 Speech Pre-processing 3
2.1 Speech Sampling 3
2.2 Framing 4
2.3 Endpoint Detection 5
2.4 Pre-emphasis 6
2.5 Hamming Window 6
2.6 Integer Fast Fourier Transform 7
2.7 Mel-Frequency Cepstral Coefficients 11
Chapter 3 Hidden Markov Model 14
3.1 Vector Quantization 14
3.2 Hidden Markov Model 15
3.3 Computing Model Probabilities 16
3.4 Training Model Parameters 17
3.5 Recognition 19
Chapter 4 EMD Signal Decomposition and Speech Recognition 21
4.1 Instantaneous Frequency 21
4.2 Intrinsic Mode Functions 22
4.3 Cubic Spline Functions 23
4.4 Empirical Mode Decomposition 26
4.5 Empirical Mode Decomposition and Speech Recognition 28
Chapter 5 Genetic Algorithm 32
5.1 Determining the Variables and the Cost Function 33
5.2 Real-Valued and Discrete Parameters 34
5.3 Natural Selection 36
5.4 Selection 37
5.5 Crossover 38
5.6 Mutation 39
Chapter 6 Experimental Environment and Results 40
6.1 Experimental Parameter Settings and Hardware 40
6.2 Experimental Methods 43
6.3 Experimental Results 46
Chapter 7 Conclusions and Future Work 52
7.1 Conclusions 52
7.2 Future Work 52
References 54