
National Digital Library of Theses and Dissertations in Taiwan

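Graduate student: 黃彥學
Graduate student (English): Huang, Ian Shiue
Thesis title: 自動樂器家族分類
Thesis title (English): Music Instrument Family Classification
Advisor: 劉奕汶
Advisor (English): Liu, Yi Wen
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Electrical Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2016
Academic year of graduation: 105
Language: English
Number of pages: 59
Keywords (Chinese): 音樂訊號處理; 機器學習; 音色分類
Keywords (English): music signal processing; machine learning; timbre classification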
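A typical band has five players: a lead vocalist, an electric guitarist, an electric bassist, a drummer, and a keyboardist. A common problem for keyboardists is that keyboard scores are scarce on the market, so they have to consult the other players' scores to follow the progression of a song. These scores usually carry no instrument information, so the user cannot tell which instrument should be emulated on the keyboard at a given moment. To address this problem, we used prerecorded audio files of thirty different instruments to form one-second clips for six instrument families, and mixed these six families in an ordered way to generate data for fifteen duo-instrument and twenty trio-instrument combinations. Together these give forty-one classes of one-second clips. For each clip, the time-domain signal and the frequency-domain signals are stacked into a feature vector, and several machine learning algorithms are applied so that the system can classify instruments automatically. The nearest neighbor method achieves the best accuracy in both validation (71.1%) and testing (65.2%). In addition, we designed a ten-question hearing test consisting of nine questions on two-second clips and one trap question. Each of the nine questions is multiple-response and counts as correct only when every choice is answered correctly. A respondent who answers the trap question correctly counts as a valid sample, otherwise as an invalid one. The test was designed to check whether the algorithms we used exceed human ability. Of the 498 participants, only 301 were valid samples. They were divided into three levels by musical ability. The highest-level group did outperform the system, but on average the machine outperformed humans.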
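The class count stated above can be checked by enumeration. The following is a minimal sketch, assuming the duo and trio classes are unordered combinations of distinct families (which matches the stated counts of fifteen and twenty). The family names `F1` to `F6` and the helper `build_labels` are placeholders, since the abstract does not name the six families.

```python
from itertools import combinations

# Placeholder family names; the abstract does not list the six families.
FAMILIES = ["F1", "F2", "F3", "F4", "F5", "F6"]


def build_labels(families):
    """Return every single, duo, and trio family label as a tuple."""
    labels = []
    for size in (1, 2, 3):
        # combinations() yields each unordered selection exactly once.
        labels.extend(combinations(families, size))
    return labels


labels = build_labels(FAMILIES)

# Count labels by the number of families they contain.
counts = {size: sum(1 for label in labels if len(label) == size)
          for size in (1, 2, 3)}

print(counts)       # {1: 6, 2: 15, 3: 20}
print(len(labels))  # 41
```

Under this assumption, 6 single-family classes plus 15 pairs plus 20 triples give the 41 classes reported in the abstract.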
A typical band consists of a vocalist, an electric guitarist, an electric bassist, a drummer, and a keyboardist. The keyboardist's task is to make appropriate use of the instrument sounds built into the keyboard. However, keyboard scores are hard to obtain, so a beginner usually practices from guitar tabs, which carry no information about which instrument to select. In this thesis, we build a classification system in an attempt to solve this problem. The data for each instrument family consist of 1-second clips at various pitches. In addition, families are mixed to generate duo-timbre and trio-timbre clips, and each mixture serves as a separate label. Each feature vector is composed of a low-pass filtered power spectrogram, a high-pass filtered power spectrogram, a chromagram, and the time-domain waveform. Several machine learning methods were applied, but not all of them perform well. The k-nearest neighbors method gives the most accurate results in both validation (71.1%) and testing (65.2%). We also carried out a hearing test to find out whether human classification ability can compete with that of computers. On average, human accuracy was lower than the computer's.
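To illustrate the general idea of classifying stacked feature vectors with k-nearest neighbors, here is a minimal sketch using only the Python standard library. It is not the thesis implementation: the helpers `stack_features` and `knn_predict`, the labels `family_a` and `family_b`, the value of `k`, and the toy numbers are all assumptions made for illustration. Real feature blocks would be spectrogram, chromagram, and waveform values rather than hand-written numbers.

```python
import math
from collections import Counter


def stack_features(*blocks):
    """Concatenate per-clip feature blocks into one flat feature vector."""
    vector = []
    for block in blocks:
        vector.extend(block)
    return vector


def knn_predict(train_vectors, train_labels, query, k=3):
    """Predict a label by majority vote among the k nearest training vectors."""
    # Euclidean distance from the query to every training vector.
    distances = sorted(
        (math.dist(vec, query), label)
        for vec, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy data: two made-up feature blocks per clip (illustrative values only).
train_vectors = [
    stack_features([0.9, 0.8], [0.1]),
    stack_features([1.0, 0.7], [0.2]),
    stack_features([0.1, 0.2], [0.9]),
    stack_features([0.2, 0.1], [1.0]),
]
train_labels = ["family_a", "family_a", "family_b", "family_b"]

query = stack_features([0.85, 0.75], [0.15])
prediction = knn_predict(train_vectors, train_labels, query, k=3)
print(prediction)  # family_a
```

With 41 classes, as in the abstract, the same voting scheme applies unchanged. Only the label set and the length of the stacked vector differ.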
Abstract (Chinese)
Abstract
1. Introduction
1.1 Music instruments
1.2 Timbres
1.3 Music instrument families
1.4 Literature review
1.5 Motivation
2. Methods
2.1 Training databases
2.2 Feature extraction
2.2.1 Time domain features
2.2.2 Frequency domain features
2.2.3 Pooling
2.2.4 Summary
2.3 Machine learning algorithms
2.3.1 k-nearest neighbors
2.3.2 Support vector machines
2.3.3 Neural networks
2.3.4 Nearest neighbor of sparse coding
2.3.5 Principal components analysis
2.4 Testing database
2.5 Block diagrams
2.6 Hearing test
3. Results
3.1 Cross-validation
3.2 Testing
3.3 Hearing tests and overall accuracy
3.4 Summary
4. Discussion
4.1 Classifying distribution
4.2 kNN versus NNSC
4.3 Hearing tests
5. Conclusion and future works
References
Appendix

