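Graduate student: 江佩璇
Graduate student (English name): Pei-Shiuan Jiang
Thesis title: 三重混合模型和倒頻譜正規化技術之強健語音辨識
Thesis title (English): Triple Hybrid Model And Cepstral Statistics Normalization Techniques For Robust Speech Recognition
Advisor: 吳俊德
Advisor (English name): Gin-Der Wu
Oral defense committee: 洪志偉, 莊家峰
Oral defense date: 2011-07-06
Degree: Master's
Institution: National Chi Nan University (國立暨南國際大學)
Department: Department of Electrical Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Publication year: 2011
Graduation academic year: 99 (ROC academic year)
Language: English
Number of pages: 44
Keywords (Chinese): 碼簿 (codebook), 線性鑑別分析 (LDA), 主成分分析 (PCA), 最小錯誤鑑別式 (MCE)
Keywords (English): codebook, Linear Discriminant Analysis (LDA), Principal Components Analysis (PCA), Minimum Classification Error (MCE)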
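This thesis makes speech recognition more robust in two ways: first, by combining the strengths of different models to reduce the dimensionality of the cepstral features; second, by normalizing the statistical properties of the speech features to reduce the effect of noise.
The first part combines three methods, linear discriminant analysis (LDA), principal components analysis (PCA), and minimum classification error (MCE), and uses the strengths of each to train a model that makes the feature parameters of the speech data more robust.
The second part builds on cepstral mean subtraction (CMS) and cepstral mean and variance normalization (CMVN). A codebook is constructed from features of the speech-only portions, and each codeword is assigned a weight. The statistics obtained from the codebook-based and utterance-based approaches are then integrated to develop a combined feature normalization method. Finally, Gaussian mixture models (GMM) are used to build the speech models, and the results are compared and discussed.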

This thesis investigates two ways to improve speech recognition rates: combining different models that have complementary advantages, and applying cepstral statistics normalization techniques to reduce the effect of noise.
The first part of the thesis combines Linear Discriminant Analysis (LDA), Principal Components Analysis (PCA), and Minimum Classification Error (MCE) to train a model that makes the speech features more robust.
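To make the dimensionality-reduction idea concrete, here is a minimal NumPy sketch of a PCA projection followed by an LDA projection. It is an illustration under stated assumptions, not the thesis's implementation: the PCA-then-LDA order, the function names `pca_fit` and `lda_fit`, the synthetic 12-dimensional data, and the target dimensions are all hypothetical, and the MCE training stage is omitted entirely.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return (mean, W); the columns of W are the top principal directions."""
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = Xc.T @ Xc / (len(X) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]

def lda_fit(X, y, n_components):
    """Return W whose columns maximize between-class over within-class scatter."""
    overall_mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))                           # within-class scatter
    Sb = np.zeros((d, d))                           # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - overall_mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Eigenvectors of Sw^{-1} Sb give the discriminant directions.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvecs[:, order].real

# Synthetic stand-in for cepstral feature vectors (assumption: 3 classes, 12 dims).
rng = np.random.default_rng(0)
n_per_class, dim = 100, 12
class_means = np.zeros((3, dim))
class_means[0, 0] = class_means[1, 1] = class_means[2, 2] = 5.0
X = np.vstack([m + rng.normal(size=(n_per_class, dim)) for m in class_means])
y = np.repeat(np.arange(3), n_per_class)

mean, W_pca = pca_fit(X, n_components=6)            # 12 -> 6 dimensions
X_pca = (X - mean) @ W_pca
W_lda = lda_fit(X_pca, y, n_components=2)           # 6 -> 2 dimensions
Z = X_pca @ W_lda

# Nearest-centroid classification in the reduced space as a quick sanity check.
centroids = np.stack([Z[y == c].mean(axis=0) for c in range(3)])
pred = np.argmin(((Z[:, None, :] - centroids[None]) ** 2).sum(axis=2), axis=1)
accuracy = (pred == y).mean()
```

PCA here only discards low-variance directions, while LDA uses the class labels to pick directions that separate the classes; a discriminative stage such as MCE would then adjust the parameters to reduce classification errors directly.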
The second part of the thesis takes cepstral mean subtraction (CMS) and cepstral mean and variance normalization (CMVN) as its basis. The normalization techniques use speech-only feature parameters to construct a codebook and assign a weight to every codeword. Codebook and utterance information are then combined in the cepstral statistics used for normalization. Template matching employs a Gaussian Mixture Model (GMM). Finally, the results are compared and discussed for tests under several background noises in different conditions.
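The two baseline normalizations named above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a feature matrix of shape (frames, coefficients); the function names `cms`, `cmvn`, and `codebook_stats` are hypothetical. The weighted codebook statistics are only one plausible reading of "a weight for every codeword" and are not taken from the thesis.

```python
import numpy as np

def cms(features):
    """Utterance-based CMS: subtract each coefficient's mean over all frames."""
    return features - features.mean(axis=0, keepdims=True)

def cmvn(features, eps=1e-8):
    """Utterance-based CMVN: zero mean and unit variance per coefficient."""
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    return (features - mean) / (std + eps)          # eps guards against division by zero

def codebook_stats(codewords, weights):
    """Weighted mean and variance over codewords (assumed form, for illustration)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                                 # normalize weights to sum to 1
    mean = (w[:, None] * codewords).sum(axis=0)
    var = (w[:, None] * (codewords - mean) ** 2).sum(axis=0)
    return mean, var

# Synthetic cepstral features with a constant offset standing in for channel bias.
rng = np.random.default_rng(1)
feats = rng.normal(loc=3.0, scale=2.0, size=(200, 13))

feats_cms = cms(feats)
feats_cmvn = cmvn(feats)

# Toy codebook: two codewords with equal weight.
cb_mean, cb_var = codebook_stats(np.array([[0.0, 0.0], [2.0, 4.0]]), [0.5, 0.5])
```

In the utterance-based variants the statistics come from all frames of the utterance being normalized; a codebook-based variant would instead plug statistics such as `cb_mean` and `cb_var` into the same subtraction and scaling.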


Acknowledgments
Abstract in Chinese
Abstract in English
Contents
List of figures
List of tables
Chapter 1 Introduction
1.1 Motivation
1.2 Introduction of feature extraction
1.3 Introduction of hybrid model
1.4 Introduction of normalization techniques
1.5 Thesis organization
Chapter 2 Modified two-dimensional Cepstrum
2.1 Introduction
2.2 Pre-emphasis
2.3 Hamming Window
2.4 Discrete Fourier Transform (DFT)
2.5 Mel Filter Banks
2.6 IIR High Pass Filter and Full Wave Rectifier
2.7 Logarithm Transform
2.8 Discrete Cosine Transform (DCT)
2.9 IDFT and Take Real Part Coefficients
Chapter 3 Recognition algorithm
3.1 Gaussian Mixture Model [14]
3.2 Expectation Maximization (EM) [15]
3.3 Speech Recognition
Chapter 4 Triple hybrid model for PCA, LDA and MCE
4.1 Principal Component Analysis (PCA)
4.2 Linear Discriminant Analysis (LDA)
4.3 Minimum Classification Error (MCE)
4.4 Combination of PCA, LDA and MCE
Chapter 5 Cepstral Statistics Normalization Techniques
5.1 Utterance-based CMS (U-CMS) and Utterance-based CMVN (U-CMVN)
5.2 Codebook-based CMS (C-CMS) and Codebook-based CMVN (C-CMVN)
5.3 Hybrid-based cepstral statistics normalization techniques [16]
Chapter 6 Experiments and Results
6.1 System Specification
6.2 The Experiment Results of Hybrid Model (LDA+PCA+MCE)
6.3 The Experiment Results of Normalization Techniques
6.3.1 The Experiment Results of baseline
6.3.2 The Experiment Results of Utterance-based Normalization Techniques
6.3.3 The Experiment Results of Codebook-based Normalization Techniques
6.3.4 The Experiment Results of Hybrid-based Normalization Techniques
6.3.5 The Experiment Results of U-CMS, C-CMS and CU-CMS
6.3.6 The Experiment Results of U-CMVN, C-CMVN and CU-CMVN
Chapter 7 Conclusion and Future Work
Bibliography

[1] Wu, G.D. “Chip Design of LPC-cepstrum for Speech Recognition,” 6th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007).
[2] Ricotti, L.P. “Multitapering and a wavelet variant of MFCC in speech recognition,” IEE Proc.-Vis. Image Signal Process., Vol. 152, No. 1, pp. 29-35, February 2005.
[3] Pai, H.F. “Two-Dimensional Cepstral Distance Measure for Speech Recognition,” 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-93), Vol. 2, pp. 672-675, 1993.
[4] Lin, C.T. “GA-Based Noisy Speech Recognition Using Two-Dimensional Cepstrum,” IEEE Transactions on Speech and Audio Processing, Vol. 8, No. 6, pp. 664-675, Nov. 2000.
[5] R.C.T. Lee, “Application of Principal Component Analysis to Multikey Searching,” IEEE Transactions on Software Engineering, Vol. SE-2, No. 3, pp. 185-193, 1976.
[6] J.-W. Hung, “Optimization of temporal filters for constructing robust features in speech recognition,” IEEE Transactions on Audio, Speech, and Language Processing, Vol. 14, No. 3, pp. 808-832, May 2006.
[7] J. Ye, “A Two-Stage Linear Discriminant Analysis via QR-Decomposition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 27, No. 6, pp. 929-941, June 2005.
[8] Fu, M.Q., Juang, B.H., Zhou, J.L., and Soong, F.K. “Generalization of the minimum classification error (MCE) training based on maximizing generalized posterior probability (GPP),” in Proceedings of the International Conference on Speech and Language Processing, Pittsburgh, PA, Sep. 2006.
[9] S. Furui, “Cepstral Analysis Technique for Automatic Speaker Verification,” IEEE Trans. Acoust., Speech, Signal Process., Vol. ASSP-29, No. 2, Apr. 1981.
[10] C.-P. Chen, K. Filali, and J.A. Bilmes, “Frontend Post-processing and Backend Model Enhancement on the Aurora 2.0/3.0 Databases,” in Proc. Int. Conf. Spoken Lang. Process. (ICSLP), 2002.
[11] F. Hilger and H. Ney, “Quantile Based Histogram Equalization for Noise Robust Speech Recognition,” in Eur. Conf. Speech Communication and Technology (Eurospeech), 2001.
[12] C.-W. Hsu and L.-S. Lee, “Higher Order Cepstral Moment Normalization (HOCMN) for Robust Speech Recognition,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2004.
[13] Pai, H.F. “Two-Dimensional Cepstrum and its application to Mandarin speech recognition,” Ph.D. dissertation, Inst. Elect. Eng., Nat. Tsing Hua Univ., Taiwan, R.O.C., 1993.
[14] Reynolds, D.A. “Robust text-independent speaker identification using Gaussian mixture speaker models,” IEEE Transactions on Speech and Audio Processing, Vol. 3, No. 1, pp. 72-83, Jan. 1995.
[15] Pai, H.F. “Two-Dimensional Cepstrum and its application to Mandarin speech recognition,” Ph.D. dissertation, Inst. Elect. Eng., Nat. Tsing Hua Univ., Taiwan, R.O.C., 1993.
[16] J.-W. Hung, “Incorporating Codebook and Utterance Information in Cepstral Statistics Normalization Techniques for Robust Recognition in Additive Noise Environments,” IEEE Signal Processing Letters, Vol. 16, No. 6, June 2009.

