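Graduate student: 曹智欣 (Zhi-Hsin Tsao)
Thesis title: 使用說話速度與耦合效應分類之華語連續音節辨識
Thesis title (English): The Use of Speaking Rate and Coarticulation Classifications in Continuous Mandarin Speech Recognition
Advisor: 陳信宏 (Sin-Horng Chen)
Degree: Master's
Institution: National Chiao Tung University (國立交通大學)
Department: Department of Communication Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2003
Academic year of graduation: 91 (ROC calendar)
Language: Chinese
Number of pages: 41
Keywords: speaking rate; coarticulation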
This thesis proposes two signal-classification-based acoustic modeling methods for continuous Mandarin speech recognition. The first classifies speech by the speaker's speaking rate into fast, normal, and slow classes and builds, for each class, a separate set of 100 right-context-dependent (RCD) initial and 40 context-independent (CI) final HMM models. The second classifies final-initial pairs by the degree of coarticulation between syllables into highly coarticulated, normal, and non-coarticulated classes; it builds 219 contracted, integrated final-initial HMM models for the highly coarticulated class and, for the other two classes, two separate sets of 100 RCD initial and 40 CI final HMM models. Experiments on the MAT-4500 telephone-speech database show that both methods slightly improve the overall syllable recognition rate. Detailed analysis shows larger improvements for the fast and slow classes and for the highly coarticulated and non-coarticulated classes, so both are promising acoustic modeling methods.

Chapter 1 Introduction
1.1 Research Motivation
1.2 Research Direction
1.3 Chapter Overview
Chapter 2 Basic System for Speaker Recognition
2.1 Definition of Speaking Rate
2.2 Construction and Evaluation of HMM Models
2.2.1 Building Rate-Classified HMMs with the Sentence as the Unit
2.2.2 Building Rate-Classified HMMs with the Syllable as the Unit
2.3 Experimental Results and Analysis
2.3.1 Basic Speech Recognition Framework
2.3.2 Speaking Rate Recognition System
2.3.3 Experimental Results
2.3.4 Experimental Conclusions
Chapter 3 Coarticulation Effects
3.1 Selection and Construction of Coarticulation Models
3.1.1 Selection of Coarticulated and Non-Coarticulated Models
3.1.2 Construction of Coarticulated and Non-Coarticulated Models
3.2 Training and Analysis of Coarticulation Models
3.3 Experimental Results and Analysis
3.3.1 Experimental Results
3.3.2 Experimental Analysis and Conclusions
Chapter 4 Conclusions and Future Work
References
Appendix 1
Appendix 2

