
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

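Author: Wei-Cheng Lin (林威成)
Thesis title: Bird Species Identification Based on Timbre and Pitch Feature of Sound recordings (利用音高與音色特徵進行鳥類識別之研究)
Advisor: Wei-Ho Tsai (蔡偉和)
Oral examination committee: Xin-Cheng You (尤信程), Chih-Hsing Chang (張智星)
Oral defense date: 2012-01-13
Degree: Master's
Institution: National Taipei University of Technology (國立臺北科技大學)
Department: Graduate Institute of Computer and Communication Engineering (電腦與通訊研究所)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2012
Graduation academic year: 100 (ROC calendar)
Language: Chinese
Number of pages: 48
Keywords: Bird sounds; Pitch; Timbre; Bigram Model; Gaussian Mixture Model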
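To let the public quickly tell which bird species produced a given sound, this thesis proposes a system that automatically identifies the species in a bird-sound stream. Such a stream may contain many different bird sounds recorded at different times, so the system must determine which segments contain which species. We use two kinds of features for identification: pitch and timbre.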
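For the pitch-based analysis, we convert the signal of the bird-sound stream into a sequence of notes and use bigram models to capture each species' dynamic pitch changes, from which the species is identified. For the timbre-based analysis, we extract Mel-frequency cepstral coefficients as timbre features and use Gaussian mixture models to represent the timbre characteristics shared within each species, from which the species is identified.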
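The pitch-based idea can be sketched as follows. This is a minimal illustration, not the thesis implementation: the standard A4 = 440 Hz = MIDI 69 mapping, the add-one smoothing, the 128-note vocabulary, and all function names are assumptions introduced here.

```python
import math
from collections import Counter, defaultdict


def hz_to_midi(freq_hz):
    """Map a frequency in Hz to the nearest MIDI note number (A4 = 440 Hz = 69)."""
    return int(round(69 + 12 * math.log2(freq_hz / 440.0)))


def train_bigram(note_sequences):
    """Count note-to-note transitions over the training sequences of one species."""
    counts = defaultdict(Counter)
    for seq in note_sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    return counts


def log_likelihood(seq, counts, vocab_size=128):
    """Log-probability of a note sequence under a bigram model.

    Add-one smoothing (an assumption of this sketch) keeps unseen
    transitions from producing log(0).
    """
    total = 0.0
    for prev, cur in zip(seq, seq[1:]):
        row = counts.get(prev, Counter())
        total += math.log((row[cur] + 1) / (sum(row.values()) + vocab_size))
    return total


def classify(seq, models):
    """Return the species whose bigram model gives the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(seq, models[name]))
```

A test note sequence is scored against every species model, and the species with the highest log-likelihood is chosen.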
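We collected 2815 sound samples from 20 bird species for experimental validation and randomly concatenated all the bird sounds into an audio file 885 seconds long. The identification rates obtained with pitch features, timbre features, and the combination of both are 50.2%, 82.45%, and 85.28%, respectively.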


To help people learn bird species from their sounds, this study proposes an automatic system that identifies bird sounds in a long audio recording. For each instant of the recording, we perform timbre-based and pitch-based analyses.
In the timbre-based analysis, Mel-frequency cepstral coefficients are extracted from every short segment and then scored by a Gaussian mixture model classifier. In the pitch-based analysis, we convert sound signals from their waveform representation into a sequence of MIDI notes. Bigram models are then used to capture how the notes change over time.
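As a rough illustration of the timbre-based scoring, the sketch below evaluates feature frames under per-species Gaussian mixture models and picks the best-scoring species. It assumes diagonal covariances and already-trained parameters; the data layout (a list of `(weight, mean, variance)` tuples) and the function names are hypothetical, and feature extraction and model training are not shown.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of the frames under one GMM.

    gmm is a list of (weight, mean, variance) tuples, one per component.
    The log-sum-exp trick avoids underflow when component densities are tiny.
    """
    total = 0.0
    for x in frames:
        comp = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        peak = max(comp)
        total += peak + math.log(sum(math.exp(c - peak) for c in comp))
    return total / len(frames)


def identify(frames, species_gmms):
    """Return the species whose GMM gives the highest average log-likelihood."""
    return max(species_gmms, key=lambda s: gmm_log_likelihood(frames, species_gmms[s]))
```

Averaging over frames keeps scores comparable across segments of different lengths, which is a design choice of this sketch rather than something stated in the abstract.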
The database used in this thesis consists of 2815 sound recordings from 20 bird species. We further concatenated all the recordings into an 885-second audio stream. Our experiments show that the identification accuracies obtained with pitch-based analysis, timbre-based analysis, and combined pitch- and timbre-based analysis are 50.2%, 82.45%, and 85.28%, respectively.
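The abstract does not say how the two analyses are combined. One common approach, shown here purely as an assumption, is to rescale each analysis's per-species scores to a common range and take a weighted sum. The min-max normalization, the `timbre_weight` value, and the function name are all hypothetical.

```python
def fuse_scores(pitch_scores, timbre_scores, timbre_weight=0.7):
    """Combine per-species scores from the two analyses by weighted sum.

    Both inputs map species name -> score (e.g. a log-likelihood).
    Scores are min-max normalized first so the two scales are comparable.
    """
    def normalize(scores):
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0  # guard against identical scores
        return {k: (v - lo) / span for k, v in scores.items()}

    p, t = normalize(pitch_scores), normalize(timbre_scores)
    return {k: timbre_weight * t[k] + (1 - timbre_weight) * p[k] for k in t}
```

The species with the highest fused score would be the decision for a segment, for example `max(fused, key=fused.get)`.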


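Table of Contents

Chinese Abstract
English Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Research Motivation and Objectives
1.2 Related Literature
1.3 Background on Bird Sounds
1.4 Research Framework
1.5 Thesis Organization
Chapter 2 Research Methods
2.1 Identifying Bird Sounds Using Pitch Features
2.1.1 Preface
2.1.2 Extracting Pitch Features by Sub-Harmonic Summation
2.1.2.1 Framing and the Hamming Window
2.1.2.2 Fast Fourier Transform
2.1.2.3 Scale Conversion
2.1.2.4 Sub-Harmonic Summation and Its Maximum
2.1.3 Bi-gram model
2.2 Identifying Bird Sounds Using Timbre Features
2.2.1 Preface
2.2.2 Extracting Timbre Features
2.2.2.1 Front-End Processing
2.2.2.2 Mel-Scale Triangular Band-Pass Filters
2.2.2.3 Discrete Cosine Transform (DCT)
2.2.3 Adapting Bird-Sound Models from a Universal Gaussian Mixture Model
2.2.3.1 Bayesian Adaptation (MAP)
Chapter 3 Identification System Combining Pitch and Timbre Features
3.1 Preface
3.2 Frame Decision
3.3 Weighting Function
3.4 System Architecture
Chapter 4 Identification Experiments on Bird-Sound Streams
4.1 Tools and Database
4.1.1 Generating the Test Bird-Sound Stream Audio File
4.2 Identification Results for Single Bird-Sound Audio Files
4.2.1 Results Using Pitch Features
4.2.2 Results Using Timbre Features
4.2.3 Results Combining Timbre and Pitch Features
4.3 Identification Results for the Concatenated Bird-Sound Audio File
4.3.1 Results Using Pitch Features
4.3.2 Results Using Timbre Features
4.3.3 Results Combining Pitch and Timbre Features
Chapter 5 Conclusion
References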



