National Digital Library of Theses and Dissertations in Taiwan

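Author: 游鈞顯
Author (English name): Chun-Hsien Yu
Thesis title: 應用於語音辨認之隱藏式條件隨機域聲學模型研究
Thesis title (English): Acoustic modeling of Hidden Conditional Random Field for Speech Recognition
Advisor: 洪維廷
Degree: Master's
Institution: Yuan Ze University
Department: Department of Communications Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2009
Academic year of graduation: 97 (ROC calendar)
Language: Chinese
Pages: 47
Keywords (Chinese): speech recognition; hidden conditional random field; continuous syllable recognition
Keywords (English): HCRF; ASR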
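This thesis studies the Hidden Conditional Random Field (HCRF) as an acoustic model for speech recognition, compares it with the conventional Hidden Markov Model (HMM), and proposes a novel HCRF training method that incorporates a discriminative criterion. In continuous syllable recognition experiments on the TEST500 speech database, the HCRF achieves a higher recognition rate and a much shorter recognition response time than the HMM, which makes it better suited to real-time recognition. For HCRF training, we compare the discriminative criterion with the conventional maximum likelihood criterion and find that discriminatively trained HCRF models have greater discriminative power. We train an HMM to convergence with the discriminative criterion, convert its parameters into initial HCRF parameters, and then continue training the HCRF with the discriminative criterion. The resulting HCRF acoustic model, the best in our experiments, improves syllable accuracy by 10.7% relative to an HMM trained with the maximum likelihood criterion. We also examine fixed-point feature parameters and acoustic models: in this setting the HCRF again outperforms the HMM in both response time and syllable accuracy, and it also performs well in a personal-name recognition experiment that uses beam search.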
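The HMM-to-HCRF initialization step can be illustrated with a minimal sketch. It assumes each HMM state has a single diagonal-covariance Gaussian and that the HCRF uses a bias, first-order, and second-order feature per state; the thesis's actual feature set and any mixture handling are not given in this record, and the function names below are hypothetical. Under these assumptions the HCRF state score equals the Gaussian log-density exactly, so training can continue from the HMM solution.

```python
import math

def gaussian_to_hcrf_weights(mean, var):
    """Map one diagonal-Gaussian HMM state to HCRF weights (assumed feature set:
    a bias, x_d, and x_d**2 for every feature dimension d)."""
    # Bias collects every term of the log-density that does not depend on x.
    bias = -0.5 * sum(math.log(2.0 * math.pi * v) + m * m / v
                      for m, v in zip(mean, var))
    w1 = [m / v for m, v in zip(mean, var)]   # weights on x_d
    w2 = [-0.5 / v for v in var]              # weights on x_d**2
    return bias, w1, w2

def hcrf_state_score(x, bias, w1, w2):
    """Linear score of one frame x under the converted weights."""
    return (bias
            + sum(a * xi for a, xi in zip(w1, x))
            + sum(b * xi * xi for b, xi in zip(w2, x)))

def gaussian_log_density(x, mean, var):
    """Reference log-density of a diagonal Gaussian, used to check the mapping."""
    return sum(-0.5 * math.log(2.0 * math.pi * v) - (xi - m) ** 2 / (2.0 * v)
               for xi, m, v in zip(x, mean, var))

if __name__ == "__main__":
    mean, var, x = [0.5, -1.0], [2.0, 0.25], [1.0, 0.0]
    bias, w1, w2 = gaussian_to_hcrf_weights(mean, var)
    print(hcrf_state_score(x, bias, w1, w2), gaussian_log_density(x, mean, var))
```

In the same spirit, HMM log transition probabilities would serve directly as initial HCRF transition weights; that part is omitted here for brevity.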
In this thesis, we adopt a Hidden Conditional Random Field (HCRF)-based approach to acoustic modeling for speech recognition and compare its performance with that of a traditional Hidden Markov Model (HMM) of the same structure. We propose a novel HCRF training algorithm that incorporates a discriminative training criterion. In continuous Mandarin syllable recognition on the TEST500 database, the HCRF-based approach outperforms the HMM in both accuracy and response time. Based on a series of related experiments, we consider the HCRF more suitable for real-time speech recognition systems. Next, we compare two methods for training the HCRF: one based on the maximum likelihood criterion and the other on a discriminative criterion. The results indicate that the discriminative approach outperforms maximum likelihood training. Finally, we investigate the HCRF-based system under fixed-point arithmetic and a limited beam size. The experimental results again show the advantages of the HCRF-based approach.
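The limited-beam decoding mentioned above can be sketched as a beam-pruned Viterbi search in the log domain. This is an illustrative sketch only: the decoder used in the thesis is not described in this record, and the function names, the score-threshold pruning rule, and the toy inputs are assumptions. Per-frame state scores could come from either an HMM or an HCRF.

```python
import math

NEG_INF = float("-inf")

def prune(hyps, beam_width):
    """Keep only hypotheses whose score is within beam_width of the best one."""
    if not hyps:
        raise ValueError("no surviving hypotheses")
    best = max(score for score, _ in hyps.values())
    return {s: h for s, h in hyps.items() if h[0] >= best - beam_width}

def viterbi_beam(obs_scores, log_trans, log_init, beam_width):
    """obs_scores[t][s] is the score of state s at frame t.
    Returns (best_score, best_state_path)."""
    n_states = len(log_init)
    # active maps each surviving state to (score, path so far).
    active = {s: (log_init[s] + obs_scores[0][s], [s])
              for s in range(n_states) if log_init[s] > NEG_INF}
    active = prune(active, beam_width)
    for t in range(1, len(obs_scores)):
        nxt = {}
        for s, (score, path) in active.items():
            for s2 in range(n_states):
                if log_trans[s][s2] == NEG_INF:
                    continue  # forbidden transition
                cand = score + log_trans[s][s2] + obs_scores[t][s2]
                if s2 not in nxt or cand > nxt[s2][0]:
                    nxt[s2] = (cand, path + [s2])
        active = prune(nxt, beam_width)
    best_state = max(active, key=lambda s: active[s][0])
    return active[best_state]

if __name__ == "__main__":
    # Toy two-state left-to-right model (made-up numbers for illustration).
    log_init = [0.0, NEG_INF]
    log_trans = [[math.log(0.5), math.log(0.5)], [NEG_INF, 0.0]]
    obs = [[-1.0, -5.0], [-4.0, -1.0], [-5.0, -1.0]]
    print(viterbi_beam(obs, log_trans, log_init, beam_width=1.0))
```

A narrower beam discards more hypotheses per frame, trading possible search errors for speed; on the toy input both a wide and a narrow beam recover the same path.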
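Chinese Abstract
English Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation
1.2 Literature Review
1.3 Research Objectives
1.4 Overview
Chapter 2 Algorithms
2.1 Hidden Markov Model
2.2 Hidden Conditional Random Field
Chapter 3 Parameter Estimation
3.1 Parameter Estimation
3.2 Maximum Likelihood Estimation
3.3 Minimum Classification Error Algorithm
Chapter 4 Experimental Analysis
4.1 Experimental Setup
4.2 Mandarin Speech Database (MAT2000)
4.3 System Flow
4.4 Continuous Syllable Recognition
4.4.1 System Performance Comparison
4.4.2 Recognition Response Time
4.5 Fixed-Point Experiments
4.5.1 Fixed-Point Model Parameters
4.5.2 Laplacian Distribution
4.5.3 Continuous Syllable Recognition
4.5.4 Personal-Name Recognition
Chapter 5 Conclusion
References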