National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

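Graduate student: 蘇昭宇
Graduate student (English name): Chao-Yu Su
Thesis title: 強健性語音辨認之研究:統計圖正規化演算法在語音上的研究與應用 (A Study of Robust Speech Recognition: Research on and Application of Histogram Equalization Algorithms for Speech)
Thesis title (English): Robust Speech Recognition: Histogram Equalization
Advisor: 洪志偉
Advisor (English name): Jeih-weih Hung
Degree: Master's
Institution: National Chi Nan University (國立暨南國際大學)
Department: Department of Electrical Engineering
Discipline: Engineering
Sub-discipline: Electrical and Information Engineering
Thesis type: Academic thesis
Year of publication: 2004
Academic year of graduation: 92 (ROC academic year)
Language: Chinese
Number of pages: 78
Keywords (Chinese): speech histogram equalization (語音統計圖正規化), quantile (分位差), Gaussian distribution (高斯分佈)
Keywords (English): Histogram Equalization, Quantile, Gaussian distribution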
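Technology products evolve rapidly, and the choice of interface is one indicator of a product's convenience and practicality. The interface closest to everyday life is speech. Under real conditions, however, a speech recognition system is often mismatched with its environment, so the recognition rate drops. Researchers at home and abroad are actively trying to overcome this problem. To make speech recognition more robust against external noise, this thesis studies several histogram-based robustness techniques in three directions: (1) unilateral histogram equalization, (2) bilateral histogram equalization, and (3) multi-pass histogram equalization.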
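In Chapter 4 we adopt unilateral matching, with two main methods: quantile-based histogram equalization and parametric Gaussian histogram equalization. The principle is to take the histogram of the clean training data as the reference, find the correspondence between clean speech and noisy speech, and finally map the noisy speech through that correspondence to obtain the feature values. We first apply quantile-based histogram equalization and observe how much the recognition rate improves after processing. We then discuss two techniques for raising the recognition rate, adjusting the number of quantiles and taking the 10th root, and find that they give a small improvement. In the second half of Chapter 4 we run experiments on parametric Gaussian histogram equalization, applying it in two domains: the log Mel spectrum (FBANK) and the cepstrum (MFCC). Parametric Gaussian histogram equalization on MFCC performs better overall, although for some noise types it performs better on FBANK.
In Chapter 5 we adopt bilateral histogram equalization, namely cepstral feature normalization and Gaussian histogram equalization. Bringing the cepstrally normalized features closer to a Gaussian distribution can raise the recognition rate, though the effect differs between corpora: relative to cepstral feature normalization, the improvement on the Aurora2 corpus is more pronounced than on Num-100. In the second half of Chapter 5 we pass the speech through both unilateral and bilateral histogram equalization, which gives a good improvement under some noise types. When recognition uses models trained with Gaussian histogram equalization, however, the improvement is considerable and the recognition rate is the highest of all the methods.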
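Histogram equalization is one of the methods for robust speech recognition. Judging from the experimental results in this thesis, it consistently delivers a solid improvement and is therefore a method worth studying.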
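As a rough illustration of the idea shared by these methods, histogram equalization can be written as matching cumulative distribution functions (CDFs). The notation below is an assumption made for this note and is not taken from the thesis: $C_{\mathrm{noisy}}$ is the empirical CDF of one feature component in the speech being processed, $C_{\mathrm{ref}}$ is the reference CDF, and $\hat{x}$ is the equalized feature value.

```latex
\hat{x} = C_{\mathrm{ref}}^{-1}\!\bigl( C_{\mathrm{noisy}}(x) \bigr)
% With a Gaussian reference of mean \mu and standard deviation \sigma,
% C_{\mathrm{ref}}(y) = \Phi\bigl((y-\mu)/\sigma\bigr), so the mapping becomes
\hat{x} = \mu + \sigma \, \Phi^{-1}\!\bigl( C_{\mathrm{noisy}}(x) \bigr)
```

Read this way, the unilateral case would estimate $C_{\mathrm{ref}}$ from the clean training data and transform only the noisy speech, while the bilateral case would map both training and test features to the same fixed reference. This reading follows the description in the abstract and is not a statement of the thesis's exact formulation.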
Technology products change rapidly, and the choice of interface is one indicator of a product's convenience and practicality. Of all interfaces, speech is the closest to everyday life. Under real conditions, however, a speech recognition system is often mismatched with its environment, so recognition accuracy degrades. To make speech recognition more robust and less affected by external noise, this thesis studies several histogram-based robustness techniques in three directions: (1) unilateral histogram equalization, (2) bilateral histogram equalization, and (3) multi-pass histogram equalization.
In Chapter 4 we adopt unilateral matching, using two methods: Quantile-Based Histogram Equalization and Parametric Gaussian Histogram Equalization. The principle is to take the histogram of the clean training data as the reference, find the correspondence between clean speech and noisy speech, and then map the noisy speech features through that correspondence. We first apply Quantile-Based Histogram Equalization and observe how much recognition accuracy improves. We then discuss two ways of improving accuracy further: adjusting the number of quantiles and taking the 10th root. In the second half of Chapter 4 we experiment with Parametric Gaussian Histogram Equalization, applying it to the log Mel-scaled spectrum and to the cepstrum. The method performs better overall on the cepstrum, although for some noise types it performs better on the log Mel-scaled spectrum.
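The reference-histogram mapping described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the thesis's implementation: the function names `quantiles` and `equalize` are hypothetical, the code handles a single feature component over one utterance, and the piecewise-linear interpolation between matching quantiles is a simplification chosen for illustration (the transformation function actually used in the thesis is not given in this record).

```python
from bisect import bisect_right


def quantiles(values, num_quantiles):
    """Return num_quantiles + 1 evenly spaced empirical quantiles (min to max)."""
    s = sorted(values)
    n = len(s)
    out = []
    for k in range(num_quantiles + 1):
        pos = k * (n - 1) / num_quantiles  # fractional index into the sorted data
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(s[lo] * (1 - frac) + s[hi] * frac)
    return out


def equalize(noisy, clean_ref_quantiles):
    """Map noisy values so that their quantiles line up with the clean reference.

    Assumption: values between two quantiles are interpolated linearly.
    """
    nq = len(clean_ref_quantiles) - 1
    src = quantiles(noisy, nq)  # quantiles of the utterance being processed
    result = []
    for x in noisy:
        # Locate the quantile interval [src[i], src[i + 1]] that contains x.
        i = min(max(bisect_right(src, x) - 1, 0), nq - 1)
        width = src[i + 1] - src[i]
        t = 0.0 if width == 0 else (x - src[i]) / width
        ref_lo, ref_hi = clean_ref_quantiles[i], clean_ref_quantiles[i + 1]
        result.append(ref_lo + t * (ref_hi - ref_lo))
    return result
```

In use, the reference quantiles would be computed once from clean training features, for example `ref = quantiles(clean_values, 4)`, and `equalize(noisy_values, ref)` would then be applied per utterance. The number of quantiles is the tuning parameter the abstract refers to.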
In Chapter 5 we adopt bilateral histogram equalization, using two methods: Cepstral Normalization and Gaussian Histogram Equalization. Bringing the distribution of the features after Cepstral Normalization closer to a Gaussian distribution can improve recognition accuracy. Compared with Cepstral Normalization, the improvement is larger on the Aurora2 database than on the Num-100 database. In the second half of Chapter 5 we pass the speech through both unilateral and bilateral histogram equalization. When speech is recognized with models trained with Gaussian Histogram Equalization, however, recognition accuracy is the highest of all the methods.
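The two bilateral steps named above can be illustrated with the Python standard library. This is a hedged sketch, not the thesis's procedure: `mean_variance_normalize` assumes that cepstral normalization means removing the mean and scaling to unit variance, and `gaussianize` assumes a simple rank-based mapping onto a standard normal distribution. Both function names are hypothetical, and each works on one feature component over one utterance.

```python
from statistics import NormalDist, fmean, pstdev


def mean_variance_normalize(track):
    """Shift and scale one feature track to zero mean and unit variance."""
    mu = fmean(track)
    sigma = pstdev(track)
    if sigma == 0:
        return [0.0 for _ in track]  # a constant track carries no information
    return [(x - mu) / sigma for x in track]


def gaussianize(track):
    """Replace each value by the standard-normal quantile of its rank.

    The empirical CDF value of the sample with 0-based rank r among n
    samples is taken as (r + 0.5) / n, which stays strictly inside (0, 1).
    """
    n = len(track)
    order = sorted(range(n), key=lambda i: track[i])
    std_normal = NormalDist()  # mean 0, standard deviation 1
    out = [0.0] * n
    for rank, i in enumerate(order):
        out[i] = std_normal.inv_cdf((rank + 0.5) / n)
    return out
```

Because the same function is applied to training and test features alike, any monotonic distortion of a feature track leaves the output of `gaussianize` unchanged. That property is one plausible reason for training the acoustic models on equalized features, as the abstract describes.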
Histogram equalization delivers a solid improvement in speech robustness and is a method worth studying.
Keywords: histogram equalization, Gaussian distribution, quantile-based.
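Chapter 1 Introduction
1.1 Research Motivation
1.2 Classification of Robust Speech Recognition Methods
1.3 Research Directions and Results of This Thesis
1.4 Chapter Overview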
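Chapter 2 Experimental Background and Baseline System Setup
2.1 Overview of the Speech Databases
2.2 Speech Feature Extraction
2.3 Building the Acoustic Models
2.4 Evaluation of Recognition Performance
2.5 Experimental Results of the Baseline System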
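Chapter 3 Introduction to Histogram Equalization for Speech
3.1 Unilateral Histogram Equalization
3.1.1 Quantile-Based Histogram Equalization
3.1.2 Parametric Gaussian Histogram Equalization
3.2 Bilateral Histogram Equalization
3.2.1 Gaussian Histogram Equalization
3.2.2 Multi-Pass Histogram Equalization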
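Chapter 4 Experimental Results of Unilateral Histogram Equalization
4.1 Results of Nonlinear Quantile-Based Histogram Equalization
4.1.1 Experimental Results of Nonlinear Quantile-Based Histogram Equalization on the Log Mel Spectrum
4.1.2 Experimental Results of Nonlinear Quantile-Based Histogram Equalization on the 10th-Root Mel Spectrum
4.2 Parametric Gaussian Histogram Equalization
4.2.1 Experimental Results of Parametric Gaussian Histogram Equalization on the Log Mel Spectrum
4.2.2 Experimental Results of Parametric Gaussian Histogram Equalization on the Cepstrum
4.3 Overall Comparison and Conclusions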
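Chapter 5 Experimental Results of Bilateral Histogram Equalization
5.1 Experimental Results of Cepstral Feature Normalization
5.2 Experimental Results of Gaussian Histogram Equalization
5.3 Experimental Results of Multi-Pass Histogram Equalization
5.4 Overall Comparison and Conclusions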
Chapter 6 Conclusions and Outlook
6.1 Conclusions
6.2 Future Work