National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

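Graduate student: 李安基
Graduate student (foreign-language name): An-Chi Li
Thesis title: 在特定文字語者驗證系統之雜訊消除
Thesis title (foreign language): Noise Reduction for Text Dependent Speaker Verification System
Advisor: 歐陽彥杰
Degree: Master's
Institution: National Chung Hsing University (國立中興大學)
Department: Department of Electrical Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Publication year: 2008
Graduation academic year: 96 (ROC calendar)
Language: Chinese
Number of pages: 58
Chinese keywords: 通道雜訊消除 (channel noise removal); 倒頻譜加權 (cepstral weighting)
Foreign-language keywords: Cepstral Mean Subtraction (CMS); Cepstral Weighting (CW)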
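Speaker verification systems have become increasingly mature and their range of application keeps growing; improving the recognition rate has always been a central goal of speech recognition. In this thesis, we raise the recognition rate by reducing noise in the speech signal. First, preprocessing removes redundant, unneeded information from the speech samples. We then combine the linear prediction cepstral coefficients and their first-order delta cepstrum (LPCC) with the Mel-frequency cepstral coefficients (MFCC) to extract the speech features we need.
Next, we propose two methods for reducing speech noise: channel noise removal (Cepstral Mean Subtraction, CMS) and cepstral weighting (Cepstral Weighting, CW). Both procedures are added to the LPCC and MFCC processing to remove noise from the extracted features. Finally, we compare the system performance obtained with processed and unprocessed features when a hidden Markov model is used.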
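The feature-merging step described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the thesis's implementation: the function names `delta` and `merge_features`, the regression width `k_max`, the edge handling (clamping to the first and last frame), and the concatenation order are all hypothetical. The delta formula is the common regression form, which the record itself does not confirm.

```python
def delta(frames, k_max=2):
    """First-order (delta) coefficients via the common regression formula:
    d_t = sum_k k * (c[t+k] - c[t-k]) / (2 * sum_k k^2), k = 1..k_max.
    Frame indices beyond the utterance are clamped to the edges (assumption)."""
    n = len(frames)
    if n == 0:
        return []
    dim = len(frames[0])
    denom = 2.0 * sum(k * k for k in range(1, k_max + 1))
    out = []
    for t in range(n):
        row = []
        for d in range(dim):
            acc = 0.0
            for k in range(1, k_max + 1):
                nxt = frames[min(t + k, n - 1)][d]
                prv = frames[max(t - k, 0)][d]
                acc += k * (nxt - prv)
            row.append(acc / denom)
        out.append(row)
    return out


def merge_features(lpcc, mfcc, k_max=2):
    """Concatenate, frame by frame: LPCC, delta-LPCC, MFCC (assumed order)."""
    if len(lpcc) != len(mfcc):
        raise ValueError("LPCC and MFCC must have the same number of frames")
    d_lpcc = delta(lpcc, k_max)
    return [a + b + c for a, b, c in zip(lpcc, d_lpcc, mfcc)]


# Usage: three frames, 1 LPCC coefficient and 2 MFCC coefficients per frame.
lpcc = [[0.0], [1.0], [2.0]]
mfcc = [[9.0, 8.0], [9.0, 8.0], [9.0, 8.0]]
merged = merge_features(lpcc, mfcc, k_max=1)
```

Each merged frame here has four values: one LPCC coefficient, its delta, and two MFCC coefficients. The LPCC and MFCC extraction itself is assumed to have been done beforehand.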
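To make the two noise-reduction steps concrete, here is a minimal sketch under stated assumptions. CMS subtracts the per-coefficient mean over an utterance, which removes a constant channel offset in the cepstral domain. For CW, the record does not give the weighting function, so the raised-sine lifter below is an assumed, commonly used form; the function names are hypothetical and this is not presented as the thesis's code.

```python
import math


def cepstral_mean_subtraction(frames):
    """Subtract the mean of each cepstral coefficient over all frames."""
    if not frames:
        return []
    n_frames = len(frames)
    dim = len(frames[0])
    means = [sum(f[k] for f in frames) / n_frames for k in range(dim)]
    return [[f[k] - means[k] for k in range(dim)] for f in frames]


def raised_sine_lifter(dim):
    """Assumed weights w(m) = 1 + (Q/2) * sin(pi * m / Q), m = 1..Q, Q = dim."""
    return [1.0 + (dim / 2.0) * math.sin(math.pi * m / dim)
            for m in range(1, dim + 1)]


def cepstral_weighting(frames, weights=None):
    """Multiply every frame element-wise by the lifter weights."""
    if not frames:
        return []
    if weights is None:
        weights = raised_sine_lifter(len(frames[0]))
    return [[c * w for c, w in zip(f, weights)] for f in frames]


# Usage: apply CMS first, then CW, to a toy utterance of two frames.
cepstra = [[1.0, 2.0], [3.0, 6.0]]
cleaned = cepstral_weighting(cepstral_mean_subtraction(cepstra))
```

Because a fixed channel adds the same offset to every frame's cepstrum, two recordings that differ only by such an offset give identical output after `cepstral_mean_subtraction`. Applying CMS before CW is one possible ordering, chosen here only for illustration.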
Speaker verification systems have matured and their range of applications keeps widening. Raising the recognition rate remains a key goal in speech recognition. In this thesis, we apply noise reduction methods to the speech signal in order to raise the recognition rate.
Two main methods were used in this thesis to reduce noise for text-dependent speaker verification. A speaker verification experiment was conducted with speech signals taken from the MMLab database at NCHU, using 100 speakers (50 male, 50 female). The tests show that the cepstral mean subtraction (CMS) noise reduction method effectively increases the speaker verification rate, and that adding the cepstral weighting (CW) noise reduction method improves verification performance.
Chapter 1 Introduction
1.1 Research Motivation
1.2 Evolution of Speech Recognition Technology
1.3 Overview of Speaker Recognition
1.4 Principles of Speech Production
1.5 Thesis Outline
Chapter 2 Theoretical Foundations of Speech Recognition
2.1 What Is Speech Recognition
2.2 Speaker Verification Models
2.2.1 Hidden Markov Model (HMM)
2.2.2 Gaussian Mixture Model (GMM)
2.2.3 Dynamic Time Warping (DTW)
Chapter 3 Speech Recognition Algorithms
3.1 Front-End Processing of Speech
3.1.1 Speech Signal Sampling
3.1.2 DC-Offset Removal
3.1.3 Band-Pass Filter
3.1.4 Frame Blocking
3.1.5 Endpoint Detection
3.1.6 Volume Normalization
3.1.7 Pre-emphasis
3.1.8 Windowing
3.2 Feature Parameter Extraction
3.2.1 Linear Prediction Coefficients
3.2.2 Cepstrum Coefficients
3.2.3 Delta-Cepstrum Coefficients
3.2.4 Mel-Frequency Cepstral Coefficients
3.2.4.1 Mel-Frequency Triangular Filters
3.2.5 Merging LPCC and MFCC Features
3.3 Hidden Markov Models (HMM)
3.3.1 The Forward Procedure
3.3.2 The Backward Procedure
3.3.3 The Viterbi Algorithm
Chapter 4 Noise Reduction and System Construction
4.1 Cepstral Mean Subtraction (CMS)
4.2 Cepstral Weighting (CW)
4.3 System Architecture
4.4 Test Procedure
Chapter 5 Experimental Results and Analysis
5.1 System Parameter Settings
5.2 Testing for the Optimal Number of States
5.3 Simulation Comparison with Each Parameter Added
5.3.1 Simulation Results with No Parameters Added
5.3.2 Simulation Results with CMS Added
5.3.3 Simulation Results with CW Added
5.3.4 Simulation Results with CMS+CW Added
5.3.5 Overall Comparison
Chapter 6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
References