National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

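Graduate student: 李秋芬
Graduate student (English name): Chiou-Fen Li
Thesis title: 基於隱藏式條件隨機域聲學模型之強健式訓練演算法
Thesis title (English): Robust Training Algorithm for Noisy Speech Recognition with Acoustic Modeling of Hidden Conditional Random Field
Advisor: 洪維廷
Advisor (English name): Wei-Tyng Hong
Degree: Master's
Institution: Yuan Ze University (元智大學)
Department: Department of Communications Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2009
Academic year of graduation: 97 (ROC calendar)
Language: Chinese
Number of pages: 63
Keywords (Chinese, translated): noise robustness; hidden conditional random field; acoustic model; speech recognition; robust training
Keywords (English): HMM; HCRF; noisy recognition; Robust Training Algorithm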
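This thesis proposes training hidden conditional random field (HCRF) acoustic models with the Robust Environment-effects Suppression Training (REST) algorithm to improve HCRF recognition performance in noisy environments, and then training the HCRF models with a discriminative criterion to improve their ability to discriminate between syllables. HCRF and HMM acoustic models were both trained on noise-mixed training data. Under babble (human voice) noise, the HCRF models achieve an error reduction rate of up to 18.1% relative to the HMM models. On the MOSTK test corpus, which includes channel effects, the HCRF models achieve an error reduction rate of 21.2% relative to the HMM models. The main contributions of this thesis are:
1. A noise compensation mechanism based on HCRF acoustic models.
2. Use of the REST algorithm to improve the recognition performance of HCRF models in noisy environments.
3. HCRF models trained with the D-REST algorithm that are both robust to background noise and better at discriminating between syllables.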
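The error reduction rates quoted above are relative figures. A minimal sketch of how such a rate is commonly computed, assuming the usual definition (baseline error minus new error, divided by baseline error); the record itself does not spell out the formula, and the error rates in the usage line are hypothetical, not results from the thesis:

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Return the relative error reduction in percent.

    Assumed definition: 100 * (baseline - new) / baseline.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical numbers for illustration only:
# a baseline (e.g. HMM) error rate of 20.0% and a new (e.g. HCRF) error rate of 16.0%.
print(round(relative_error_reduction(20.0, 16.0), 1))  # prints 20.0
```

Under this definition, an 18.1% error reduction means the HCRF error rate is 81.9% of the HMM error rate, not that accuracy rose by 18.1 percentage points.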
This thesis proposes a novel algorithm that combines robust training with discriminative training to generate a set of compact hidden conditional random field (HCRF)-based acoustic models. The issues and techniques explored include:

1. Deriving compensation operations for HCRF-based models under noise- and channel-bias-distorted conditions.

2. Applying the robust training algorithm to noisy speech recognition with HCRF-based models.

3. Applying discriminative training to HCRF-based models on a multi-condition training database for speech recognition in adverse environments.
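For readers unfamiliar with HCRF-based acoustic models, the sketch below illustrates the general idea of scoring an observation sequence against each class by summing over hidden state sequences and then normalizing across classes. It is an illustration under stated assumptions, not the thesis's model: the function names, the scalar observations, and the per-state feature set (1, o, o^2) are all hypothetical choices made for this example, and no training or noise compensation is shown.

```python
import numpy as np


def class_log_score(obs, init, trans, weights):
    """Log of the sum over all hidden state sequences of exp(score) for one class.

    obs:     (T,)   scalar observations (assumption: 1-D features for brevity)
    init:    (S,)   initial-state weights
    trans:   (S, S) transition weights, trans[i, j] for moving from state i to j
    weights: (S, 3) per-state weights on the assumed features (1, o, o^2)
    """
    feats = np.stack([np.ones_like(obs), obs, obs ** 2], axis=1)  # (T, 3)
    emit = feats @ weights.T                                      # (T, S)
    alpha = init + emit[0]                                        # forward scores at t = 0
    for t in range(1, len(obs)):
        # Log-domain forward recursion: sum over the previous state i.
        alpha = np.logaddexp.reduce(alpha[:, None] + trans, axis=0) + emit[t]
    return np.logaddexp.reduce(alpha)


def hcrf_posterior(obs, models):
    """Posterior over classes: exponentiated class scores normalized across classes."""
    scores = np.array([class_log_score(obs, *m) for m in models])
    return np.exp(scores - np.logaddexp.reduce(scores))


# Toy usage with two hypothetical 2-state class models.
rng = np.random.default_rng(0)
toy_models = [
    (rng.normal(size=2), rng.normal(size=(2, 2)), rng.normal(size=(2, 3)) * 0.1)
    for _ in range(2)
]
toy_obs = np.array([0.5, -0.2, 1.0])
print(hcrf_posterior(toy_obs, toy_models))
```

Unlike an HMM, the weights here are unconstrained real numbers rather than probabilities, which is what allows the model to be trained directly against a conditional (discriminative) objective.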
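Chinese Abstract
English Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation and Literature Review
1.2 Research Overview
1.3 Thesis Outline
Chapter 2 Algorithm Overview
2.1 Hidden Conditional Random Fields
2.2 Minimum Classification Error Algorithm
2.3 Discriminative Robust Algorithm
2.3.1 Signal Bias Removal
2.3.2 State-Based Wiener Filter
2.3.3 Parallel Model Combination
Chapter 3 Training and Model Compensation for Hidden Conditional Random Fields
3.1 Preface
3.2 Discriminative Robust Training Algorithm for HCRF
3.3 HCRF Model Compensation Algorithm
3.3.1 HCRF Model Compensation with […] and […] Held Fixed
3.3.2 HCRF Model Compensation with […] Held Fixed
3.3.3 HCRF Model Compensation with […] Held Fixed (1)
3.3.4 HCRF Model Compensation with […] Held Fixed (2)
Chapter 4 Experimental Analysis
4.1 Experimental Setup
4.1.1 Training Corpus
4.1.2 Test Corpus
4.1.3 Description of the Added Noise
4.2 Training Procedure
4.3 Testing Procedure
4.4 Testing HCRF Model Compensation
4.5 Performance Analysis
4.5.1 Effect of Noise
4.5.2 Effect of the Training Corpus
4.5.3 Effect of the Test Corpus
4.5.4 Overall Recognition Analysis
Chapter 5 Conclusion
References
Appendix 1
Appendix 2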