National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

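Author: 許順翔 (Shun-Siang Syu)
Title: Mixed-Lingual Acoustic Modeling of Hidden Conditional Random Field for Resource-constrained Voice Command System
Title (Chinese): 基於隱藏式條件隨機域聲學模型之資源受限裝置語音命令系統
Advisor: 洪維廷
Degree: Master's
Institution: Yuan Ze University (元智大學)
Department: Department of Communications Engineering
Field: Engineering
Subfield: Electrical and Information Engineering
Thesis type: Academic thesis
Year of publication: 2009
Academic year of graduation: 97 (ROC calendar, i.e. 2008-2009)
Language: Chinese
Number of pages: 39
Keywords (translated from Chinese): hidden conditional random field; resource-constrained; voice command system
Keywords (English): Mixed-Lingual; Hidden Conditional Random; Resource-constrained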
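This thesis aims to develop mixed-language keyword recognition, combining Mandarin and English, that is suited to resource-constrained computing environments. Because the Mandarin and English acoustic models of a hidden conditional random field (HCRF) differ, their conditional likelihoods show a gap; a weight bias compensation method is proposed to address it. Beam search is used to reduce the memory occupied during the recognition search while keeping the recognition rate at a given level.
The main contributions of this thesis are: (1) an in-depth study of fixed-point algorithms suitable for HCRFs; (2) a fixed-point algorithm that integrates search bias compensation; (3) a two-stage weight bias compensation method. Experiments show that, under resource constraints with different beam widths, the proposed method is more robust than the conventional HMM approach.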
This thesis presents implementation techniques for a Mandarin/English mixed-lingual speech recognition kernel on resource-constrained platforms. To address the HCRF-based conditional-likelihood gap between Mandarin and English, a two-stage compensation procedure is proposed to correct this bias in beam search. The issues and techniques explored include: (1) deriving a fixed-point approach for HCRF-based speech recognition; (2) integrating search offset compensation into the fixed-point algorithm; (3) comparing the memory consumption of beam search in HMM-based and HCRF-based speech recognition.
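As a rough illustration of the ideas named in the abstract, the sketch below combines fixed-point scores, a per-language score offset, and beam pruning. It is not the thesis's algorithm: the Q-format width `Q`, the function names `to_fixed` and `prune`, the offset values, and the keyword labels are all hypothetical and chosen only for this example.

```python
from typing import Dict, List, Tuple

Q = 8  # hypothetical number of fractional bits; not taken from the thesis

Hypothesis = Tuple[str, str, int]  # (keyword, language, fixed-point score)


def to_fixed(x: float, q: int = Q) -> int:
    """Quantize a floating-point score to a signed fixed-point integer."""
    return int(round(x * (1 << q)))


def prune(hyps: List[Hypothesis], offsets: Dict[str, int], beam: int) -> List[Hypothesis]:
    """Add a per-language offset to each score, then keep only the
    hypotheses whose adjusted score lies within `beam` of the best one."""
    adjusted = [(kw, lang, score + offsets[lang]) for kw, lang, score in hyps]
    best = max(score for _, _, score in adjusted)
    return [h for h in adjusted if h[2] >= best - beam]


# Illustrative scores only: the English hypotheses score lower overall,
# standing in for the likelihood gap between the two languages.
hyps = [
    ("zh_keyword", "zh", to_fixed(-10.0)),
    ("en_keyword_a", "en", to_fixed(-14.0)),
    ("en_keyword_b", "en", to_fixed(-20.0)),
]
beam = to_fixed(2.0)

no_offset = prune(hyps, {"zh": 0, "en": 0}, beam)
with_offset = prune(hyps, {"zh": 0, "en": to_fixed(3.5)}, beam)
print([h[0] for h in no_offset])
print([h[0] for h in with_offset])
```

Without an offset, a narrow beam discards every English hypothesis in this toy data; with the offset, the stronger English hypothesis survives pruning. All arithmetic is on integers, which is the point of a fixed-point formulation on devices without floating-point hardware.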
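Contents
Abstract (Chinese)
Abstract
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation and Literature Review
1.1.1 Research Motivation: Resource-constrained Systems
1.1.2 Research Motivation: HCRF Acoustic Models
1.1.3 Literature Review and Background
1.2 Research Objectives and Thesis Structure
1.3 Thesis Outline
Chapter 2 Introduction to the Hidden Conditional Random Field Algorithm
2.1 Hidden Conditional Random Fields
2.2 Advantages of HCRFs
Chapter 3 Fixed-point Algorithms for HCRFs
3.1 Fixed-point Feature Extraction
3.2 Fixed-point Conversion of HCRF Model Parameters
3.3 Fixed-point Algorithm with Integrated Search Bias Compensation
3.3.1 General Bias Compensation for HCRFs
3.4 Weight Bias Compensation for Mixed Mandarin/English Keywords
3.4.1 First-stage Weight Bias Compensation
3.4.2 Second-stage Weight Bias Compensation
Chapter 4 Experimental Analysis
4.1 Experimental Equipment
4.2 Feature Parameters
4.3 Mandarin and English Models
4.4 Training Corpus
4.5 Test Corpus
4.6 Experimental Analysis
4.6.1 Experiment 1: Comparison of HCRF Bias Compensation Mechanisms
4.6.2 Experiment 2: Comparison of HCRF and HMM Mixed-language Recognition Performance
4.6.3 Experiment 3: Comparison of HCRF and HMM Mixed-language Recognition Performance
Chapter 5 Conclusion
References
Appendix A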