Graduate student: 周歆
Graduate student (romanized): Chou Hsin
Thesis title: 基於雙工音高感知模型之神經網路旋律抽取演算法
Thesis title (English): A neural network based on duplex model of pitch perception for melody extraction
Advisor: 冀泰石
Advisor (romanized): Chi, Tai-Shih
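Committee members: 王逸如, 曹昱, 蘇黎
Committee members (romanized): Wang, Yih-Ru; Tsao, Yu; Su, Li
Oral defense date: 2017-09-13
Degree: Master's
Institution: National Chiao Tung University
Department: Master Program of Sound and Music Innovative Technologies, College of Engineering
Discipline: Engineering
Field: Other Engineering
Thesis type: Academic thesis
Publication year: 2017
Graduation academic year: 106 (ROC calendar)
Language: Chinese
Pages: 54
Keywords (translated from Chinese): melody extraction; convolutional neural network; pitch perception model
Keywords (English): Melody extraction; convolutional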
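This thesis proposes, from an auditory perspective, a neural-network-based method for extracting the melody from polyphonic music. According to classical psychoacoustic theories of pitch analysis, human pitch perception is described by a spectral model and a temporal model. We first examine each model separately, implement it as a neural network, and evaluate its performance, checking whether the training results of each model agree with auditory theory. From these results we construct auditory templates for the spectral model. We then use the temporal model to compensate for the spectral model's inability to resolve high-frequency harmonics, which yields the duplex model. The experimental results show that letting the temporal model cover the frequency bands the spectral model cannot resolve improves both melody extraction and pitch determination. They also demonstrate that neural networks built on psychoacoustic principles can indeed be used in music information retrieval applications.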
In this thesis, we develop a melody extraction algorithm for polyphonic music using neural networks (NNs) that imitate human pitch perception. Pitch perception theory offers two models, the spectral model and the temporal model, which apply according to whether or not the harmonics are resolved by human hearing. We first implement each model as an NN and evaluate its performance on the melody extraction task. We then compare the training results of the implemented NNs with the predictions of pitch perception theory. Finally, we combine the spectral-model and temporal-model NNs into a composite NN for the duplex model, in which the temporal model handles the harmonics that the spectral model cannot resolve. Simulation results show that the proposed composite NN based on the duplex model of pitch perception is more effective in melody extraction than conventional methods.
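The duplex idea described in the abstract can be illustrated with a small signal-processing sketch. This is not the thesis's neural network: it swaps in a classical harmonic-sum salience for the spectral model and an autocorrelation of the rectified high band for the temporal model, then adds the two normalized saliences. All function names, the 1500 Hz resolved/unresolved cutoff, the number of harmonics, and the candidate grid are assumptions chosen for illustration only.

```python
import numpy as np


def harmonic_tone(f0, sr, n_samples, n_harm=20):
    """Synthetic test tone: harmonics of f0 with 1/k amplitudes (illustrative only)."""
    t = np.arange(n_samples) / sr
    return sum(np.sin(2 * np.pi * k * f0 * t) / k for k in range(1, n_harm + 1))


def spectral_salience(x, sr, f0_grid, cutoff_hz=1500.0, n_harm=6):
    """Harmonic-sum salience using only the low band (assumed 'resolved' region)."""
    mag = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    mag[freqs > cutoff_hz] = 0.0  # discard the assumed unresolved band
    sal = np.zeros(len(f0_grid))
    for h in range(1, n_harm + 1):
        # Sample the magnitude spectrum at each candidate's h-th harmonic.
        sal += np.interp(h * f0_grid, freqs, mag) / h
    return sal


def temporal_salience(x, sr, f0_grid, cutoff_hz=1500.0):
    """Autocorrelation salience of the rectified high band (assumed 'unresolved')."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    spec[freqs <= cutoff_hz] = 0.0  # keep only the high band
    high = np.fft.irfft(spec, n=len(x))
    env = np.maximum(high, 0.0)  # half-wave rectification exposes the beating at f0
    env = env - env.mean()
    lags = np.rint(sr / f0_grid).astype(int)  # candidate period in samples
    sal = np.array(
        [np.dot(env[:-lag], env[lag:]) / (len(env) - lag) for lag in lags]
    )
    return np.maximum(sal, 0.0)


def duplex_pitch(x, sr, f0_grid, cutoff_hz=1500.0):
    """Combine both cues by summing their max-normalized saliences."""
    s = spectral_salience(x, sr, f0_grid, cutoff_hz)
    t = temporal_salience(x, sr, f0_grid, cutoff_hz)
    combined = s / (s.max() + 1e-12) + t / (t.max() + 1e-12)
    return float(f0_grid[int(np.argmax(combined))])


if __name__ == "__main__":
    sr = 16000
    tone = harmonic_tone(220.0, sr, 4096)
    grid = np.arange(100.0, 401.0, 5.0)  # candidate fundamentals in Hz
    print("estimated f0 (Hz):", duplex_pitch(tone, sr, grid))
```

The simple additive combination is only one possible design; the abstract describes a composite NN that joins the two model networks, and how that combination is made is not specified in this record.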
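Abstract (Chinese)
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Direction and Objectives
1.3 Chapter Overview
Chapter 2 Auditory Perception
2.1 Physiological Auditory Phenomena and Characteristics
2.2 Pitch, Fundamental Frequency, and Melody
2.2.1 Pitch
2.2.2 Fundamental Frequency
2.2.3 Musical Pitch
2.2.4 Melody
Chapter 3 Related Work on Melody Extraction
3.1 Expert Systems
3.1.1 Tony
3.1.2 Melodia
3.2 Deep Learning
3.2.1 MCDNN
3.2.2 BLSTM
Chapter 4 Proposed System Architecture
4.1 Introduction to Convolutional Neural Networks
4.2 Pitch Perception Models
4.3 Proposed Neural Network Architecture and Workflow
4.3.1 Spectral Model
4.3.2 Temporal Model
4.3.3 Duplex Model
4.3.4 Viterbi Decoding
4.4 Evaluation Methods
Chapter 5 Experimental Design and Results
5.1 Experimental Data
5.2 Experimental Setup
5.3 Experimental Results
Chapter 6 Conclusion
References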
[1] GOTO, Masataka. A real-time music-scene-description system: Predominant-F0 estimation for detecting melody and bass lines in real-world audio signals. Speech Communication, 43.4: 311-329, 2004.
[2] T. S. Chi, class notes of Auditory and Acoustical Information Processing, Department of Communication Engineering, National Chiao-Tung University, Taiwan, 2013.
[3] Chen, Jing, Thomas Baer, and Brian CJ Moore. "Effect of enhancement of spectral changes on speech intelligibility and clarity preferences for the hearing impaired." The Journal of the Acoustical Society of America 131.4: 2987-2998, 2012.
[4] Takashi Yamauchi, Presentation on theme: "Sensation & Perception", http://slideplayer.com/slide/6639448/
[5] PATTERSON, Roy D.; ALLERHAND, Mike H.; GIGUERE, Christian. Time‐domain modeling of peripheral auditory processing: A modular architecture and a software platform. The Journal of the Acoustical Society of America, 98.4: 1890-1894, 1995.
[6] RITSMA, Roelof J. Existence region of the tonal residue. I. The Journal of the Acoustical Society of America, 34.9A: 1224-1229, 1962.
[7] YOST, William A. Pitch strength of iterated rippled noise. The Journal of the Acoustical Society of America, 100.5: 3329-3335, 1996.
[8] PRESSNITZER, Daniel; PATTERSON, Roy D.; KRUMBHOLZ, Katrin. The lower limit of melodic pitch. The Journal of the Acoustical Society of America, 109.5: 2074-2084, 2001.
[9] Wiwi Kuan, presentation note “Why you can hear melody?”, https://www.youtube.com/watch?v=NQkraA1I4VM
[10] POLINER, Graham E., et al. Melody transcription from music audio: Approaches and evaluation. IEEE Transactions on Audio, Speech, and Language Processing, 15.4: 1247-1256, 2007.
[11] DEUTSCH, Diana. Two‐channel listening to musical scales. The Journal of the Acoustical Society of America, 57.5: 1156-1160, 1975.
[12] SALAMON, Justin; GÓMEZ, Emilia. Melody extraction from polyphonic music signals using pitch contour characteristics. IEEE Transactions on Audio, Speech, and Language Processing, 20.6: 1759-1770, 2012.
[13] DURRIEU, Jean-Louis, et al. Source/filter model for unsupervised main melody extraction from polyphonic audio signals. IEEE Transactions on Audio, Speech, and Language Processing, 18.3: 564-575, 2010.
[14] Mauch, Matthias, and Simon Dixon. "pYIN: A fundamental frequency estimator using probabilistic threshold distributions." Acoustics, Speech and Signal Processing (ICASSP), IEEE International Conference on. IEEE, 2014.
[15] BITTNER, Rachel M., et al. Melody Extraction by Contour Classification. In: ISMIR, p. 500-506, 2015.
[16] MAUCH, Matthias, et al. Computer-aided melody note transcription using the Tony software: Accuracy and efficiency. 2015.
[17] De Cheveigné, Alain, and Hideki Kawahara. "YIN, a fundamental frequency estimator for speech and music." The Journal of the Acoustical Society of America 111.4: 1917-1930, 2002.
[18] KLAPURI, Anssi P. Multiple fundamental frequency estimation based on harmonicity and spectral smoothness. IEEE Transactions on Speech and Audio Processing, 11.6: 804-816, 2003.
[19] HAN, Kun; WANG, DeLiang. Neural network based pitch tracking in very noisy speech. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), 22.12: 2158-2168, 2014.
[20] Verma, Prateek, and Ronald W. Schafer. "Frequency Estimation from Waveforms Using Multi-Layered Neural Networks." INTERSPEECH. 2016.
[21] Liu, Yuzhou, and DeLiang Wang. "Time and frequency domain long short-term memory for noise robust pitch tracking." Acoustics, Speech and Signal Processing (ICASSP), IEEE International Conference on. IEEE, 2017.
[22] Han, Kun, and DeLiang Wang. "Neural networks for supervised pitch tracking in noise." Acoustics, Speech and Signal Processing (ICASSP), IEEE International Conference on. IEEE, 2014.
[23] SU, Hong, et al. Convolutional neural network for robust pitch determination. In: Acoustics, Speech and Signal Processing (ICASSP), IEEE International Conference on. IEEE, p. 579-583, 2016.
[24] Rigaud, François, and Mathieu Radenen. "Singing Voice Melody Transcription Using Deep Neural Networks." ISMIR. 2016.
[25] LEGLAIVE, Simon; HENNEQUIN, Romain; BADEAU, Roland. Singing voice detection with deep recurrent neural networks. In: Acoustics, Speech and Signal Processing (ICASSP), IEEE International Conference on. IEEE, p. 121-125, 2015.
[26] KUM, Sangeun; OH, Changheun; NAM, Juhan. Melody Extraction on Vocal Segments Using Multi-Column Deep Neural Networks. In: ISMIR, p. 819-825, 2016.
[27] Ono, Nobutaka, et al. "Separation of a monaural audio signal into harmonic/percussive components by complementary diffusion on spectrogram." Signal Processing Conference, 16th European. IEEE, 2008.
[28] Tachibana, Hideyuki, et al. "Melody line estimation in homophonic music audio signals based on temporal-variability of melodic source." Acoustics, Speech and Signal Processing (ICASSP), IEEE International Conference on. IEEE, 2010.
[29] Wikipedia, Bidirectional recurrent neural networks, https://en.wikipedia.org/wiki/Bidirectional_recurrent_neural_networks
[30] Wikipedia, Long short-term memory, https://en.wikipedia.org/wiki/Long_short-term_memory
[31] YOST, William A. Pitch perception. Attention, Perception, & Psychophysics, 71.8: 1701-1715, 2009.
[32] MEDDIS, Ray; O’MARD, Lowel. A unitary model of pitch perception. The Journal of the Acoustical Society of America, 102.3: 1811-1820, 1997.
[33] Robert P. Carlyon, Comments on “A unitary model of pitch perception” [J. Acoust. Soc. Am. 102, 1811–1820 (1997)], The Journal of the Acoustical Society of America 104, 1118, 1998.
[34] SHAMMA, Shihab; KLEIN, David. The case of the missing pitch templates: How harmonic templates emerge in the early auditory system. The Journal of the Acoustical Society of America, 107.5: 2631-2644, 2000.