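Author: Hua Su (蘇樺)
Thesis title: PSO Algorithm for Speaker Verification Systems (粒子群演算法之語者確認系統)
Advisor: Yau-tarng Juang (莊堯棠)
Degree: Master's
Institution: National Central University (國立中央大學)
Department: Department of Electrical Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2014
Graduation academic year: 102 (ROC calendar)
Language: Chinese
Number of pages: 79
Keywords (Chinese): 粒子群演算法; 語者確認
Keywords (English): particle swarm optimization algorithm; speaker verification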
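This thesis focuses on the back end of speaker verification: given a test utterance, the goal is the best possible recognition performance on it, so the main research direction is how the test speech is processed against each enrolled speaker model. First, the system adopts normalized scoring and adds the particle swarm optimization (PSO) algorithm to optimize the model parameters. PSO is an optimization algorithm that searches for the best solution by simulating how flocks of birds or schools of fish look for food. It is a swarm-intelligence method whose particles have memory, and it is computationally simple and converges quickly. It is therefore applied to modeling the speaker verification data, using its optimizing property to build more accurate speaker models and make the system more discriminative. Second, the thesis applies simple linear regression analysis to the speaker verification system. Simple linear regression is an important analysis method in statistics, commonly used to analyze the correlation between data. Here a simple linear regression model is built on the speaker verification results. Through ordinary least squares estimation and analysis of the coefficient of determination, the verification results are combined, which makes recognition of the test speech more accurate and thereby improves the system's recognition performance.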
This thesis focuses on the back end of speaker verification, that is, on scoring a test utterance against the registered speaker models. First, the thesis introduces score normalization approaches to the speaker verification system. Then the Particle Swarm Optimization (PSO) algorithm is applied to optimize the model parameters. PSO searches for an optimum by imitating the foraging behavior of a school of fish. Every particle has memory, and the algorithm is computationally simple and converges quickly. Using it to build more accurate speaker models makes the system more discriminative.
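The PSO update described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not state the objective function or parameter settings, so the toy `sphere` cost, the function name `pso_minimize`, and the values of `w`, `c1`, `c2`, swarm size, and iteration count are all assumptions chosen for illustration.

```python
import random


def pso_minimize(cost, dim, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `cost` with a basic global-best PSO using inertia weight w.

    All parameter defaults are illustrative assumptions, not thesis values.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions inside the search range, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position (the particle "memory").
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia term + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search range.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


def sphere(x):
    """Toy cost function (assumption) with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best, best_cost = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
print(best, best_cost)
```

In a speaker verification setting the cost function would presumably score a candidate set of model parameters, but the abstract does not specify how, so the sketch stops at the generic optimizer.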
In addition, the thesis introduces a regression analysis method to the speaker verification system. Regression analysis is a widely used statistical method. A regression model is built for each speaker by ordinary least squares estimation and analysis of the coefficient of determination. Experiments show that the proposed method improves the performance of the speaker verification system.
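As a rough illustration of the regression step, the sketch below fits a simple linear regression by ordinary least squares and computes the coefficient of determination. The abstract does not say which verification scores serve as predictor and response, so the function names and the toy data here are assumptions made up for the example.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Made-up toy data standing in for two sets of verification scores.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_simple_ols(x, y)
r2 = r_squared(x, y, intercept, slope)
print(intercept, slope, r2)
```

A coefficient of determination close to 1 means the fitted line explains most of the variation in the response. How the thesis uses this value when combining scores is not described in the abstract.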
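Table of Contents
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation
1.2 Overview of the Speaker Recognition Architecture
1.3 Overview of Speaker Adaptation
1.4 Research Direction
1.5 Literature Review
1.6 Chapter Overview
Chapter 2 Techniques of the Speaker Verification System
2.1 Feature Extraction
2.2 Gaussian Mixture Model
2.3 Speaker Model Training
2.3.1 Vector Quantization
2.3.2 EM Algorithm
2.4 Speaker Model Adaptation
2.4.1 Bayesian Adaptation
Chapter 3 Speaker Verification
3.1 GMM-UBM
3.2 Speaker Verification with KL Distance
3.3 Test Normalization
Chapter 4 Particle Swarm Optimization
4.1 Concept of Particle Swarm Optimization
4.2 Inertia Weight
4.3 Particle Swarm Optimization Applied to Speaker Verification
Chapter 5 Regression Analysis
5.1 Concept of Regression Analysis
5.2 Ordinary Least Squares
5.3 Coefficient of Determination
5.4 Regression Analysis of Speaker Verification Scores
5.5 Combination of Speaker Verification Scores
Chapter 6 Experiments and Discussion
6.1 Speech Corpus
6.2 Performance Evaluation of Speaker Verification
6.2.1 Equal Error Rate
6.2.2 Decision Cost Function
6.3 Experimental Results
6.3.1 Experiment 1: Comparison of Three Verification Systems
6.3.2 Experiment 2: Regression Analysis Applied to Speaker Verification
6.3.3 Experiment 3: Particle Swarm Optimization Applied to Speaker Verification
6.3.4 Experiment 4: Regression Analysis Combined with Particle Swarm Optimization
7.1 Conclusion
7.2 Future Work
References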