National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

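Author: 林家興 (Chia-Hsing Lin)
Title: 基於鑑別式特徵參數求取之強健性聲音事件分類
Title (English): Discriminative Feature Extraction for Robust Audio Event Classification
Advisor: 廖元甫
Committee members: 蔡偉和、王逸如
Oral defense date: 2010-07-30
Degree: Master's
Institution: National Taipei University of Technology (國立臺北科技大學)
Department: Graduate Institute of Computer and Communication Engineering (電腦與通訊研究所)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2010
Language: Chinese
Keywords: audio event classification (聲音事件分類), Gabor filter (賈柏濾波器), data-driven (語料驅動)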

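Abstract (translated from the Chinese original):
Non-speech event sounds carry meaningful information in certain environments. This thesis investigates whether, for non-speech audio event classification, there are audio features beyond the widely used Mel-frequency cepstral coefficients (MFCCs) that better capture the distinctive information in non-speech audio, and which feature combinations improve recognition performance in noisy environments. We therefore extract audio features using concepts from time-frequency analysis and pattern features: we use Gabor filter features, or data-driven filter features derived with principal component analysis (PCA) and linear discriminant analysis (LDA), as new types of audio features. Finally, we fine-tune the resulting features with the minimum classification error (MCE) criterion in order to obtain more discriminative audio features.
The experimental corpus consists of the 105 clean event sounds in RWCP (Real World Computing Partnership). After adding noise following the Aurora 2 multi-condition setup, we train models and run tests with the features we designed. The experiments show that the new features reduce the classification error rate from 4.13% with conventional audio features to 3.17%. The system architecture built on the new feature types therefore does achieve robust audio event classification, and the results confirm that the new features are suitable for non-speech signals.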
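The noise-adding step mentioned above can be illustrated with a minimal sketch. It assumes the simplest possible procedure: scale a noise segment so that the clean-to-noise power ratio hits a target SNR, then add it sample by sample. The record does not describe the thesis's actual mixing procedure, and the function names `mix_at_snr` and `measured_snr_db` are hypothetical.

```python
import math

def mix_at_snr(clean, noise, target_snr_db):
    """Scale `noise` so that clean power / noise power equals the target SNR, then add it."""
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    p_clean = sum(x * x for x in clean) / len(clean)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_noise == 0.0:
        raise ValueError("noise has zero power")
    # Required noise power is p_clean / 10^(SNR/10); gain is the amplitude scale.
    gain = math.sqrt(p_clean / (p_noise * 10.0 ** (target_snr_db / 10.0)))
    return [x + gain * n for x, n in zip(clean, noise)]

def measured_snr_db(clean, noisy):
    """SNR in dB, treating (noisy - clean) as the noise component."""
    p_signal = sum(x * x for x in clean)
    p_resid = sum((y - x) ** 2 for x, y in zip(clean, noisy))
    return 10.0 * math.log10(p_signal / p_resid)

# Toy signals (assumptions for illustration only, not RWCP or Aurora 2 data).
clean = [math.sin(0.1 * i) for i in range(1000)]
noise = [((i * 7919) % 13 - 6) / 6.0 for i in range(1000)]
noisy = mix_at_snr(clean, noise, 10.0)
print(round(measured_snr_db(clean, noisy), 6))
```

In a multi-condition setup, the same routine would be applied across several noise types and SNR levels to build the training set.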

English abstract:
Traditionally, audio event classification has relied heavily on MFCC (Mel-frequency cepstral coefficient) features. However, MFCCs were originally designed for automatic speech recognition, and it is unclear whether they are still the best features for audio event classification. MFCCs are also often not robust in noisy environments. This thesis therefore proposes several new feature extraction methods, with the aim of achieving better performance and robustness than MFCCs in noisy conditions.
The proposed feature extraction methods are based mainly on the concept of matched filters in the spectro-temporal domain. Several ways of designing the set of matched filters are proposed: hand-designed Gabor filters and three data-driven filter designs using PCA (principal component analysis), LDA-based (linear discriminant analysis) eigenspace analysis, and MCE (minimum classification error) training.
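As a rough illustration of a spectro-temporal matched filter, the sketch below builds a real-valued 2-D Gabor kernel (a Gaussian envelope multiplied by a cosine carrier) and takes its inner product with a time-frequency patch. The kernel size, modulation frequencies, and envelope widths are assumptions chosen for the example, not values from the thesis, and `gabor_kernel`, `filter_response`, and `temporal_patch` are hypothetical names.

```python
import math

def gabor_kernel(n_freq, n_time, omega_f, omega_t, sigma_f, sigma_t):
    """Real 2-D Gabor kernel: Gaussian envelope times a cosine carrier.

    Rows index frequency channels, columns index time frames; both axes
    are centered on zero.
    """
    kernel = []
    for k in range(n_freq):
        f = k - (n_freq - 1) / 2.0
        row = []
        for n in range(n_time):
            t = n - (n_time - 1) / 2.0
            envelope = math.exp(-(f * f / (2.0 * sigma_f ** 2) + t * t / (2.0 * sigma_t ** 2)))
            carrier = math.cos(omega_f * f + omega_t * t)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def filter_response(patch, kernel):
    """Inner product of a spectro-temporal patch with a kernel of the same shape."""
    return sum(p * w for p_row, k_row in zip(patch, kernel) for p, w in zip(p_row, k_row))

# Toy check: a kernel tuned to a purely temporal modulation of pi/4 rad per frame.
N_FREQ, N_TIME, OMEGA_T = 5, 9, math.pi / 4.0
kernel = gabor_kernel(N_FREQ, N_TIME, omega_f=0.0, omega_t=OMEGA_T, sigma_f=1.0, sigma_t=2.0)

def temporal_patch(wave):
    """Patch whose rows all follow wave(OMEGA_T * t), with t centered on zero."""
    return [[wave(OMEGA_T * (n - (N_TIME - 1) / 2.0)) for n in range(N_TIME)]
            for _ in range(N_FREQ)]

matched = filter_response(temporal_patch(math.cos), kernel)     # in phase with the carrier
quadrature = filter_response(temporal_patch(math.sin), kernel)  # 90 degrees out of phase
print(matched, quadrature)
```

The in-phase patch gives a large positive response while the quadrature patch gives a response near zero, which is the matched-filter behavior that makes such kernels usable as pattern detectors. A data-driven variant would replace the hand-set kernel with, for example, leading PCA or LDA directions estimated from training patches.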
The robustness of the proposed methods is evaluated on the RWCP (Real World Computing Partnership) database with artificially added noise. RWCP contains 105 different audio events. The experimental settings are similar to the Aurora 2 multi-condition training task. The lowest average error rate, 3.17%, was achieved by the MCE method, compared with 4.13% for conventional MFCCs. These results support the robustness of the proposed feature extraction approaches and their advantage over MFCCs.
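To put the two reported error rates in perspective, the short calculation below derives the absolute and relative error reduction. Only the figures 4.13% and 3.17% come from the abstract; the helper name `relative_error_reduction` is hypothetical.

```python
def relative_error_reduction(baseline_pct, proposed_pct):
    """Fraction of the baseline error removed by the proposed system."""
    return (baseline_pct - proposed_pct) / baseline_pct

MFCC_ERROR_PCT = 4.13  # conventional MFCC features, as reported in the abstract
MCE_ERROR_PCT = 3.17   # MCE-trained features, as reported in the abstract

absolute_points = MFCC_ERROR_PCT - MCE_ERROR_PCT
reduction = relative_error_reduction(MFCC_ERROR_PCT, MCE_ERROR_PCT)
print(f"absolute: {absolute_points:.2f} points, relative: {reduction:.1%}")
```

That is a drop of 0.96 percentage points, or roughly 23% of the baseline error.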

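Table of contents:
Chinese Abstract
Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Research Motivation and Background
1.2 Research Methods
1.2.1 Reference System and Conventional Audio Feature Extraction
1.2.2 New Audio Feature Extraction
1.3 Main Contributions
1.4 Chapter Overview
Chapter 2 Conventional Audio Feature Extraction
2.1 Mel-Frequency Cepstral Coefficients
2.2 Shifted Delta Cepstral Coefficients
2.3 Normalization and ARMA Filtering
Chapter 3 New Audio Feature Extraction
3.1 Gabor Filters
3.1.1 Basic Theory
3.1.2 Gabor Filter Features Applied to Audio Event Classification
3.2 Data-Driven Filters
3.2.1 Implementation of Data-Driven Filters
3.2.2 Principal Component Analysis
3.2.3 Linear Discriminant Analysis
3.2.4 Data-Driven Filters Applied to Audio Event Classification
Chapter 4 New Audio Feature Extraction Based on Minimum Classification Error
4.1 Minimum Classification Error Criterion
4.2 Transformation Matrix Optimization Algorithm
4.3 The Minimum Classification Error Criterion Applied to Audio Event Classification
Chapter 5 Experimental Results and Analysis
5.1 Experimental Corpus and Setup
5.1.1 Corpus Overview
5.1.2 Experimental Setup
5.1.2.1 Hidden Markov Models
5.1.2.2 Experimental Setup for Conventional Audio Features
5.1.2.3 Experimental Setup for New Audio Features
5.1.2.4 Experimental Setup for MCE-Adapted Audio Features
5.2 Experimental Results
5.2.1 Recognition Results for Conventional Audio Features
5.2.2 Recognition Results for New Audio Features
5.2.3 Recognition Results for MCE-Adapted Audio Features
5.3 Comparison and Analysis of Experimental Results
Chapter 6 Conclusions and Future Work
References
Appendix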
