
National Digital Library of Theses and Dissertations in Taiwan

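Graduate student: 李文森 (Wen-Sen Lee)
Thesis title: 基於Total-Variability之語言辨認系統NIST LRE 2011
Thesis title (English): Language Recognition System Based on Total Variability: NIST LRE 2011
Advisor: 廖元甫
Oral defense committee: 王逸如, 蔡偉和
Oral defense date: 2012-07-30
Degree: Master's
Institution: National Taipei University of Technology (國立臺北科技大學)
Department: Graduate Institute of Computer and Communication Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2012
Graduation academic year: 100 (ROC calendar)
Language: Chinese
Number of pages: 40
Keywords (Chinese): language verification; Total Variability; NIST LRE 2011; Linear Discriminant Analysis; Within-Class Covariance Normalization
Keywords (English): language recognition; Joint Factor Analysis; Total Variability; NIST LRE 2011; Linear Discriminant Analysis; Within-Class Covariance Normalization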
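The goal of this thesis is to build a language verification system for multilingual environments that can recognize different languages and separate target languages from non-target languages.
We propose a language verification system based on the Total Variability space. The Total Variability model is first used to estimate the basis of the language space and to obtain each utterance's projection onto that space. These projections are then processed with Linear Discriminant Analysis (LDA) and Within-Class Covariance Normalization (WCCN) to extract robust features, and the language models are trained with a support vector machine (SVM).
The proposed Total Variability language verification system is evaluated on the NIST LRE 2011 corpus, and its performance is compared across test-segment durations. In the LRE 2009 Closed-Set evaluation, the EERs for 3-, 10-, and 30-second segments are 23.90%, 13.01%, and 6.28%, respectively. In the LRE 2011 Language-Pairs evaluation, the average minimum decision costs for 3-, 10-, and 30-second segments are 0.24201, 0.13380, and 0.073, respectively. In the LRE 2011 evaluation, our system still has considerable room for improvement compared with the fused systems submitted by other organizations.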
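The WCCN step named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the synthetic vectors stand in for Total Variability projections, the function names are hypothetical, and NumPy is assumed to be available. The sketch estimates the average within-class covariance W, takes the Cholesky factor B of its inverse (so that B Bᵀ = W⁻¹), and maps each vector x to Bᵀx, after which the within-class covariance becomes the identity.

```python
import numpy as np


def within_class_covariance(vectors, labels):
    """Average of the per-class covariance matrices (hypothetical helper)."""
    classes = np.unique(labels)
    dim = vectors.shape[1]
    total = np.zeros((dim, dim))
    for c in classes:
        x = vectors[labels == c]
        centered = x - x.mean(axis=0)
        total += centered.T @ centered / len(x)
    return total / len(classes)


def wccn_projection(vectors, labels):
    """Return B with B @ B.T equal to the inverse within-class covariance."""
    w = within_class_covariance(vectors, labels)
    return np.linalg.cholesky(np.linalg.inv(w))


# Synthetic stand-in data (an assumption): 3 "languages", 200 vectors each,
# 4 dimensions, with a different noise scale on every dimension.
rng = np.random.default_rng(0)
means = rng.normal(size=(3, 4)) * 3.0
scale = np.array([5.0, 1.0, 0.5, 2.0])
vectors = np.vstack([m + rng.normal(size=(200, 4)) * scale for m in means])
labels = np.repeat(np.arange(3), 200)

B = wccn_projection(vectors, labels)
normalized = vectors @ B  # row-vector form of x -> B.T @ x

# After WCCN the within-class covariance is (numerically) the identity.
print(np.round(within_class_covariance(normalized, labels), 6))
```

In a pipeline like the one described, the normalized vectors would then be passed to the classifier, so that directions with large within-language variation no longer dominate the kernel.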


In this thesis, we propose a language recognition system based on the Total Variability space, using Linear Discriminant Analysis (LDA) and Within-Class Covariance Normalization (WCCN) for robust feature extraction.
The proposed language recognition system is evaluated on the NIST LRE 2011 database at several test-segment lengths. For the LRE 2009 Closed-Set evaluation, the EERs for 3-, 10-, and 30-second segments are 23.90%, 13.01%, and 6.28%, respectively. For the LRE 2011 Language-Pair evaluation, the average minimum decision costs (MCV) for 3-, 10-, and 30-second segments are 0.24201, 0.1338, and 0.073, respectively.
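For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how an equal error rate and a minimum decision cost can be computed from detection scores. The scores are made up for illustration, the function names are hypothetical, and the cost parameters (unit miss and false-alarm costs, target prior 0.5) are an assumption rather than the exact cost definition of the NIST evaluation plans, which this record does not reproduce.

```python
def error_rates(threshold, target_scores, nontarget_scores):
    """Miss and false-alarm rates when accepting scores >= threshold."""
    p_miss = sum(s < threshold for s in target_scores) / len(target_scores)
    p_fa = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
    return p_miss, p_fa


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping the threshold over all observed scores."""
    best_gap, eer = None, None
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        p_miss, p_fa = error_rates(t, target_scores, nontarget_scores)
        gap = abs(p_miss - p_fa)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (p_miss + p_fa) / 2
    return eer


def min_decision_cost(target_scores, nontarget_scores,
                      c_miss=1.0, c_fa=1.0, p_target=0.5):
    """Smallest weighted error over all thresholds (assumed cost parameters)."""
    costs = []
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        p_miss, p_fa = error_rates(t, target_scores, nontarget_scores)
        costs.append(c_miss * p_target * p_miss + c_fa * (1 - p_target) * p_fa)
    return min(costs)


# Made-up scores: higher means "more likely the target language".
targets = [2.0, 1.5, 0.9, 0.3]
nontargets = [1.0, 0.2, -0.5, -1.2]

eer = equal_error_rate(targets, nontargets)
min_cost = min_decision_cost(targets, nontargets)
print(eer, min_cost)  # 0.25 0.125
```

The EER is the operating point where miss and false-alarm rates coincide, while the minimum decision cost picks whichever threshold gives the lowest weighted error, so the two values generally differ for the same score set.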


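Chinese Abstract
English Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Motivation
1.2 Background
1.3 Methodology
1.4 Chapter Overview
Chapter 2 Total Variability and the Language Verification System
2.1 Total Variability
2.1.1 Total Variability Model
2.1.2 Hyperparameter Estimation
2.1.2.1 Total Variability Space Estimation
2.1.2.2 Baum-Welch Statistics
2.1.3 Robust Feature Extraction
2.2 Language Verification System
2.2.1 System A
2.2.2 System B
Chapter 3 Experimental Setup and Result Analysis
3.1 Experimental Corpora
3.2 Robust Acoustic Feature Processing
3.3 Performance Evaluation and Metrics
3.3.1 Language Verification: Closed-Set Evaluation Overview
3.3.2 Language Verification: Language-Pairs Evaluation Overview
3.4 Experimental Results
3.4.1 Closed-Set Results
3.4.2 Language-Pairs Results
Chapter 4 Conclusions and Future Work
4.1 Conclusions
4.1.1 LRE Closed-Set
4.1.2 LRE 2011 Language-Pairs
4.2 Future Work
References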




