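Graduate student: Meng-kai Lin (林盟凱)
Thesis title: Feature Exponent Adjustment Methods in Robust Speech Recognition (特徵參數之次方調整法應用在強健性語音辨識)
Advisor: Jeih-Weih Hung (洪志偉)
Degree: Master's
Institution: National Chi Nan University (國立暨南國際大學)
Department: Department of Electrical Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2006
Academic year of graduation: 94 (ROC calendar)
Language: Chinese
Pages: 86
Keywords: speech recognition (語音辨識); speech enhancement method (語音強化法); robust speech representation (強健式語音特徵技術); exponent adjustment (次方根調整)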
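Because the development environment and the application environment do not match, the performance of a speech recognition system often degrades. One of the main causes of this mismatch is additive noise. Methods for handling additive noise fall into three classes: speech enhancement, robust speech feature parameters, and speech model adaptation. The methods discussed in this thesis belong to the speech enhancement class.
In this thesis, the methods we examine and the methods we propose share one characteristic: all of them apply an exponentiation (power or root) operation. When the exponentiation is applied to the standard mel-frequency cepstral coefficients, we call the method cepstral exponent adjustment (CEA). When the exponentiation is instead applied to the spectrum after the logarithm, or directly replaces the logarithm, and delta processing is then applied to the resulting mel-frequency cepstral coefficients, we call the methods mel exponent adjustment (ExpoMFCC) and root exponent adjustment (RMFCC), respectively. Each of the three methods therefore yields a new set of speech feature parameters for coping with noisy environments.
The experimental results show that these methods clearly strengthen the speech feature parameters and thereby improve the recognition rate. We also combine them with other robustness methods to obtain further gains in recognition rate.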
The performance of a speech recognition system often degrades because of a mismatch between the development environment and the application environment. One of the major sources of this mismatch is additive noise. Approaches for handling additive noise can be divided into three classes: speech enhancement, robust representation of speech, and compensation of speech models. The methods discussed in this thesis belong to the first class, speech enhancement techniques.
A common characteristic of the approaches studied and proposed in this thesis is the use of exponentiation. When the exponentiation is performed on the original mel-frequency cepstral coefficients (MFCC), the resulting method is called cepstral exponent adjustment (CEA). On the other hand, when the exponentiation is carried out on the logarithmic spectrum, or directly replaces the logarithm operation, during the derivation of MFCC, the resulting algorithms are called Exponentiated log-MelFBS (ExpoMFCC) and root Mel-filter bank spectrum (RMFCC), respectively. The three methods are applied to obtain new speech features for recognition in a noisy environment.
Experimental results show that the three methods enhance the robustness of the speech features and thus improve recognition accuracy. Moreover, they can be integrated with other robustness methods to obtain further improvement.
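The record does not give the formulas behind the three methods, so the following Python sketch only illustrates where the exponentiation sits in the MFCC pipeline for each one. Everything specific in it is an assumption made for illustration: the function names, the sign-preserving power used to keep negative values real, the unnormalized DCT-II, and the exponent values. It should not be read as the thesis's actual implementation.

```python
import numpy as np

def dct_ii(x, num_ceps):
    """Unnormalized DCT-II over the last axis, keeping the first num_ceps terms."""
    n = x.shape[-1]
    k = np.arange(num_ceps)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (m + 0.5) / n)  # shape (num_ceps, n)
    return x @ basis.T

def signed_power(x, alpha):
    """Assumed form of the exponentiation: sign(x) * |x|**alpha."""
    return np.sign(x) * np.abs(x) ** alpha

def mfcc_from_fbank(fbank, num_ceps=13):
    """Baseline MFCC: log of mel filter-bank energies followed by a DCT."""
    return dct_ii(np.log(fbank), num_ceps)

def cea(mfcc, alpha):
    """CEA-style step: exponentiation applied to the finished cepstral coefficients."""
    return signed_power(mfcc, alpha)

def expo_mfcc(fbank, alpha, num_ceps=13):
    """ExpoMFCC-style step: exponentiation applied to the log mel spectrum, then DCT."""
    return dct_ii(signed_power(np.log(fbank), alpha), num_ceps)

def rmfcc(fbank, gamma, num_ceps=13):
    """RMFCC-style step: a root (power gamma) replaces the logarithm, then DCT."""
    return dct_ii(fbank ** gamma, num_ceps)

# Hypothetical input: 5 frames of 23 positive mel filter-bank energies.
fbank = np.random.default_rng(0).uniform(0.1, 10.0, size=(5, 23))
baseline = mfcc_from_fbank(fbank)
features = {
    "CEA": cea(baseline, 0.8),          # exponent 0.8 is an arbitrary example value
    "ExpoMFCC": expo_mfcc(fbank, 0.8),  # same arbitrary exponent
    "RMFCC": rmfcc(fbank, 0.1),         # root 0.1 is an arbitrary example value
}
```

With an exponent of 1, the CEA-style and ExpoMFCC-style functions reduce to the baseline MFCC, which makes the exponent a single tuning knob around the standard features. The delta processing mentioned in the abstract would be computed on these outputs and is omitted here.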
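Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation and Topic
1.2 Overview of Research Methods
1.3 Thesis Organization
Chapter 2 Speech Feature Parameters and Model Construction
2.1 Extraction of Speech Feature Parameters
2.2 Construction of Speech Acoustic Models
Chapter 3 Construction of the Baseline System
3.1 Experimental Speech Corpus
3.2 Evaluation of Recognition Performance
3.3 Training and Results of the Baseline System
3.4 Chapter Summary
Chapter 4 Robust Speech Feature Techniques and Speech Enhancement Techniques
4.1 Robust Speech Feature Parameters
4.1.1 Mel Exponent Adjustment (ExpoMFCC)
4.1.2 Root Exponent Adjustment (RMFCC)
4.2 Speech Enhancement Techniques
4.2.1 Cepstral Exponent Adjustment (CEA)
4.3 Chapter Summary
Chapter 5 Experimental Results of the Robust Speech Techniques and Enhancement Techniques
5.1 Experimental Results for Mel Exponent Adjustment
5.2 Experimental Results for Root Exponent Adjustment
5.3 Experimental Results for Cepstral Exponent Adjustment
5.4 Chapter Summary
Chapter 6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
References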