National Digital Library of Theses and Dissertations in Taiwan
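Graduate student: 高志杰
Graduate student (English): Chih-Chieh Kao
Thesis title: 粒子群演算法應用於梅爾濾波器組之研究 (A Study on Applying the Particle Swarm Optimization Algorithm to the Mel Filterbank)
Thesis title (English): PSO Algorithm for Mel-Filterbank
Advisor: 莊堯棠
Advisor (English): Y.-T. Juang
Degree: Master's
Institution: National Central University
Department: Department of Electrical Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2013
Academic year of graduation: 101 (ROC calendar)
Language: Chinese
Pages: 61
Keywords (translated from Chinese): Mel filterbank; particle swarm optimization; Mel-frequency cepstral coefficients; keyword spotting
Keywords (English): Mel-Filterbank; PSO; MFCC; keyword spotting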
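Abstract (translated from Chinese): This thesis studies the Mel filterbank used in Mel-frequency cepstral coefficient (MFCC) feature extraction. Particle swarm optimization (PSO) is used to optimize the center frequencies and boundary frequencies of the filterbank. Unlike the common approach of using the recognition rate as the fitness function, the proposed method uses the similarity between a statistical curve and the envelope of the filterbank as the fitness function. Based on the characteristics of speech signals in the energy spectrum, an energy statistics chart and an energy-difference statistics chart each serve as the reference curve, yielding two optimized filterbanks. Both are evaluated on keyword recognition and under three common noise environments. The experimental results show that the method improves feature extraction and raises the recognition rate of the keyword spotting system, and that it also provides noise robustness in specific environments.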
Abstract (English): This thesis studies the filterbank used in Mel-frequency cepstral coefficient (MFCC) feature extraction. We propose using particle swarm optimization (PSO) to optimize the parameters of the MFCC filterbank, namely the center and side frequencies. The proposed PSO algorithm uses the similarity between a statistical curve and the filterbank's envelope as the fitness function. Based on energy and energy-difference statistical charts, which reflect the characteristics of the speech signal in the energy spectrum, we obtain two optimized filterbanks with PSO. These are then tested on keyword recognition and in three noisy environments. The experimental results show that the proposed method improves the recognition rate of the keyword spotting system and its robustness in the tested noisy environments.
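The abstract describes the optimization only at a high level, so the following is a minimal sketch under stated assumptions, not the thesis's actual method. It runs a PSO with a linearly decreasing inertia weight over the center frequencies of a triangular filterbank and scores each candidate by the cosine similarity between the filterbank envelope and a reference curve. The frequency band, the number of filters, the area-normalized triangle shape, the exponential `TARGET` curve (a stand-in for the statistical curve), the cosine-similarity measure, the mel formula variant, and every PSO constant are assumptions chosen for illustration.

```python
import math
import random

F_MIN, F_MAX = 0.0, 4000.0  # assumed analysis band in Hz (hypothetical)
N_FILTERS = 8               # assumed number of triangular filters (hypothetical)
GRID = [F_MIN + i * (F_MAX - F_MIN) / 199 for i in range(200)]


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_centers(n):
    """Centers equally spaced on the mel scale (a common MFCC layout)."""
    lo, hi = hz_to_mel(F_MIN), hz_to_mel(F_MAX)
    return [mel_to_hz(lo + (hi - lo) * k / (n + 1)) for k in range(1, n + 1)]


def envelope(edges, f):
    """Upper envelope at f of area-normalized triangular filters.

    Filter m peaks at edges[m] and falls to zero at edges[m - 1] and edges[m + 1].
    """
    best = 0.0
    for m in range(1, len(edges) - 1):
        left, mid, right = edges[m - 1], edges[m], edges[m + 1]
        if not (left < f < right) or mid <= left or right <= mid:
            continue
        if f <= mid:
            shape = (f - left) / (mid - left)
        else:
            shape = (right - f) / (right - mid)
        best = max(best, shape * 2.0 / (right - left))
    return best


# Hypothetical stand-in for a statistical curve: more weight at low frequencies.
TARGET = [math.exp(-f / 1500.0) for f in GRID]


def fitness(centers):
    """Cosine similarity between the filterbank envelope and TARGET."""
    edges = [F_MIN] + sorted(centers) + [F_MAX]
    env = [envelope(edges, f) for f in GRID]
    dot = sum(e * t for e, t in zip(env, TARGET))
    norm = math.sqrt(sum(e * e for e in env) * sum(t * t for t in TARGET))
    return dot / norm if norm > 0.0 else 0.0


def clip(x, lo, hi):
    return min(max(x, lo), hi)


def pso(n_particles=15, n_iter=40, w_start=0.9, w_end=0.4,
        c1=2.0, c2=2.0, v_max=400.0, seed=0):
    rng = random.Random(seed)
    base = mel_centers(N_FILTERS)
    # Particle 0 starts at the mel-spaced layout; the others are perturbed copies.
    pos = [base[:]] + [
        [clip(c + rng.uniform(-200.0, 200.0), F_MIN + 1.0, F_MAX - 1.0) for c in base]
        for _ in range(n_particles - 1)
    ]
    vel = [[0.0] * N_FILTERS for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for it in range(n_iter):
        # Linearly decreasing inertia weight.
        w = w_start - (w_start - w_end) * it / max(n_iter - 1, 1)
        for i in range(n_particles):
            for d in range(N_FILTERS):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = clip(v, -v_max, v_max)
                pos[i][d] = clip(pos[i][d] + vel[i][d], F_MIN + 1.0, F_MAX - 1.0)
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return sorted(gbest), gbest_fit


baseline = fitness(mel_centers(N_FILTERS))
best_centers, best_fit = pso()
print(f"mel-spaced fitness: {baseline:.4f}, PSO fitness: {best_fit:.4f}")
```

Because one particle starts at the mel-spaced layout and the global best is only replaced by a strictly better candidate, the returned fitness can never fall below the mel-spaced baseline. Scoring against a curve rather than a recognition rate means no recognizer has to be trained inside the optimization loop, which is the practical appeal of a similarity-based fitness function.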
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Appendix
Chapter 1 Introduction
1.1 Research Motivation
1.2 Literature Review
1.3 Thesis Organization
Chapter 2 Background
2.1 Feature Extraction
2.1.1 MFCC
2.1.2 LPCC
2.2 Feature Compensation
2.2.1 Cepstral Mean Subtraction (CMS)
2.2.2 Cepstral Mean and Variance Normalization (CMVN)
2.3 Hidden Markov Model
2.4 Acoustic Model
Chapter 3 Particle Swarm Optimization Applied to the Filterbank
3.1 Particle Swarm Optimization
3.1.1 PSO Model
3.1.2 Inertia Weight
3.2 PSO for Filterbank Optimization
3.2.1 Variable Settings
3.2.2 Fitness Function
Chapter 4 Experimental Results
4.1 Keyword Spotting
4.1.1 Keyword Spotting Architecture
4.1.2 Recognition Procedure
4.2 Experimental Setup
4.3 Channel Effect Experiments
4.4 PSO-Optimized Filterbank Experiments
4.5 Noisy Environment Experiments
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References