
National Digital Library of Theses and Dissertations in Taiwan


Detailed Record

Author: 朱嘉平
Author (English): Chia-Ping Chu
Title: 總體經驗模態分解及其平行化處理應用在強健性語音辨識
Title (English): Robust Speech Recognition by Ensemble Empirical Mode Decomposition and its Parallel Processing
Advisor: 潘欣泰
Advisor (English): Shing-Tai Pan
Degree: Master's
Institution: 國立高雄大學 (National University of Kaohsiung)
Department: 資訊工程學系碩士班 (Master's Program, Department of Computer Science and Information Engineering)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis Type: Academic thesis
Year of Publication: 2012
Graduation Academic Year: 100 (2011–2012)
Language: Chinese
Number of Pages: 75
Keywords (Chinese): 平行運算、基因演算法、語音辨識、隱藏式馬可夫模型、總體經驗模態分解法
Keywords (English): Speech Recognition; Hidden Markov Model; Ensemble Empirical Mode Decomposition; Parallel Computing; Genetic Algorithms
Usage statistics: cited 0 times; viewed 281 times; downloaded 0 times; added to reading lists 0 times.
Abstract (translated from Chinese): The main goal of this thesis is to improve the noise robustness of speech signals and thereby raise the recognition rate under environmental noise. Ensemble Empirical Mode Decomposition (Ensemble EMD) is applied to decompose a noisy speech signal into a set of Intrinsic Mode Functions (IMFs); a real-coded genetic algorithm then searches for the optimal combination weights of these IMFs, and the speech signal is reconstructed from the weighted IMFs so that the influence of environmental noise on the recognition rate is minimized. In addition, to address the computational cost introduced by Ensemble EMD, this thesis proposes a parallelized implementation: on a multi-core architecture, the Ensemble EMD computation is parallelized with the parallel directives of the OpenMP library to raise its processing speed.
Abstract (English): The main purpose of this study is to improve the recognition rate of speech recognition systems operating under environmental noise. In our research, we use Ensemble Empirical Mode Decomposition (Ensemble EMD) to decompose noisy speech signals into several IMFs and then find the best weight for each IMF with a real-coded genetic algorithm. The speech signals are then recovered by summing the weighted IMFs, which reduces the effect of the noise. Since Ensemble EMD requires considerable computation time, a parallel algorithm for multi-core architectures is proposed to speed up its computation; the algorithm is implemented with parallel directives from the OpenMP library.
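
As a rough sketch of the reconstruction step described in the abstracts, the following C fragment rebuilds a de-noised speech frame as a weighted sum of the IMFs produced by Ensemble EMD, using per-IMF weights such as those found by the real-coded genetic algorithm (Section 3.4). The function name, array layout, and parameter list are illustrative assumptions, not code taken from the thesis.

/* Minimal sketch of the weighted-IMF reconstruction step: the de-noised
 * speech frame is rebuilt as a weighted sum of the IMFs obtained from
 * Ensemble EMD.  Function name and array layout are assumptions made
 * for illustration only. */
#include <stddef.h>

void reconstruct_from_imfs(const double *const *imf, /* imf[k][n]: sample n of IMF k */
                           const double *weight,     /* weight[k]: weight of IMF k   */
                           size_t num_imfs,
                           size_t num_samples,
                           double *out)              /* out[n]: reconstructed frame  */
{
    for (size_t n = 0; n < num_samples; ++n) {
        double sum = 0.0;
        for (size_t k = 0; k < num_imfs; ++k)
            sum += weight[k] * imf[k][n];  /* weighted sum over all IMFs */
        out[n] = sum;
    }
}

In a setup like this, the weight vector would serve as the chromosome optimized by the real-coded genetic algorithm; the fitness criterion and parameter encoding actually used are described in Section 3.4 of the thesis.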
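
The parallelization idea can be sketched in the same spirit: the ensemble trials of Ensemble EMD (each adding an independent white-noise realization and running EMD) are mutually independent, so they can be distributed across cores with an OpenMP parallel-for directive. In the C sketch below, add_white_noise() and emd() are hypothetical placeholders for the thesis' own routines, and NUM_TRIALS, NUM_IMFS, and FRAME_LEN are assumed values; only the #pragma omp directives are actual OpenMP usage.

/* Sketch of one way to parallelize the Ensemble EMD ensemble loop with OpenMP.
 * add_white_noise() and emd() stand in for the thesis' own routines and are
 * only declared here; NUM_TRIALS, NUM_IMFS, and FRAME_LEN are assumed values. */
#include <omp.h>
#include <string.h>

#define NUM_TRIALS 100   /* ensemble size (assumed)             */
#define NUM_IMFS     8   /* IMFs extracted per trial (assumed)  */
#define FRAME_LEN  256   /* samples per speech frame (assumed)  */

/* Placeholders for the noise-injection and EMD routines. */
void add_white_noise(const double *in, double *out, int len, unsigned seed);
void emd(const double *in, double imf[NUM_IMFS][FRAME_LEN], int num_imfs, int len);

void eemd_parallel(const double *frame, double avg_imf[NUM_IMFS][FRAME_LEN])
{
    memset(avg_imf, 0, sizeof(double) * NUM_IMFS * FRAME_LEN);

    /* Each trial adds an independent white-noise realization, runs EMD,
     * and contributes its IMFs to the ensemble average.  Trials do not
     * depend on each other, so the loop is split across the cores. */
    #pragma omp parallel for
    for (int t = 0; t < NUM_TRIALS; ++t) {
        double noisy[FRAME_LEN];
        double imf[NUM_IMFS][FRAME_LEN];

        add_white_noise(frame, noisy, FRAME_LEN, (unsigned)t);
        emd(noisy, imf, NUM_IMFS, FRAME_LEN);

        /* Serialize only the accumulation into the shared average. */
        #pragma omp critical
        for (int k = 0; k < NUM_IMFS; ++k)
            for (int n = 0; n < FRAME_LEN; ++n)
                avg_imf[k][n] += imf[k][n] / NUM_TRIALS;
    }
}

In this sketch only the accumulation into the shared average is serialized, so the per-trial EMD work, which dominates the cost, runs fully in parallel; the level at which the thesis actually partitions the work is described in Section 3.5, and the measured speed-ups appear in Section 5.4.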
Table of Contents
Chapter 1  Introduction ............................................... 1
  1.1  Research Motivation and Objectives ............................. 2
  1.2  Research Methods ............................................... 3
Chapter 2  Speech Signal Preprocessing ................................ 5
  2.1  Speech Frame Extraction ........................................ 6
  2.2  Pre-emphasis ................................................... 7
  2.3  Hamming Windowing .............................................. 7
  2.4  Fast Fourier Transform ......................................... 8
  2.5  MFCC Feature Computation ...................................... 10
    2.5.1  Mel Filter Bank ........................................... 10
    2.5.2  Logarithmic Transform ..................................... 12
    2.5.3  Discrete Cosine Transform ................................. 13
Chapter 3  Speech Signal Decomposition with Ensemble EMD ............. 14
  3.1  Instantaneous Frequency ....................................... 15
  3.2  Empirical Mode Decomposition .................................. 16
  3.3  Ensemble Empirical Mode Decomposition ......................... 22
  3.4  Mode Function Decomposition Combined with Genetic Algorithms .. 25
  3.5  Parallel Computation .......................................... 26
Chapter 4  Hidden Markov Models and HTK .............................. 31
  4.1  Hidden Markov Models .......................................... 31
  4.2  Continuous Hidden Markov Models ............................... 33
  4.3  The HTK Toolkit ............................................... 35
Chapter 5  Experimental Methods and Results .......................... 38
  5.1  Speech Corpus ................................................. 38
  5.2  Experimental Methods and Data ................................. 41
    5.2.1  Baseline Test Results ..................................... 42
    5.2.2  Test Results with EMD-Processed Speech .................... 43
    5.2.3  Test Results with Ensemble-EMD-Processed Speech ........... 45
  5.3  Analysis of Experimental Results .............................. 47
    5.3.1  Average Recognition Rate over SNR 0–20 dB ................. 47
    5.3.2  Analysis at Different SNR Levels in Different Test Environments ... 49
    5.3.3  Recognition Rate Improvement at Different SNR Levels ...... 53
  5.4  Parallel Speed-up Results ..................................... 58
Chapter 6  Conclusions and Future Work ............................... 62
  6.1  Conclusions ................................................... 62
  6.2  Future Work ................................................... 63
References ........................................................... 64