



Graduate student: Huang, Ian Shiue
Thesis title: Music Instrument Family Classification
Advisor: Liu, Yi Wen
Keywords: music signal processing; machine learning; timbre classification
A typical band consists of a vocalist, an electric guitarist, an electric bassist, a drummer, and a keyboardist. The keyboardist's task is to make appropriate use of the instrument sounds built into the keyboard. However, keyboard sheet music is hard to obtain, so beginners often practice from guitar tabs, which carry no information about which instrument sound to choose. In this thesis, we build a classification system in an attempt to solve this problem. The data for each musical instrument family consist of one-second samples at various pitches. In addition, two or three timbres are mixed to generate duo-timbre and trio-timbre mixtures, each of which serves as a separate label. Each feature vector is composed of a low-pass filtered power spectrogram, a high-pass filtered power spectrogram, a chromagram, and the time-domain waveform. Several machine learning methods are applied separately, but not all of them perform well. The k-nearest neighbors method gives the most accurate results in both validation (71.1%) and testing (65.2%). We also carried out a hearing test to find out whether human listeners can compete with computers at this classification task. On average, the human listeners were less accurate than the computer.
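As a rough illustration of the classification step described in the abstract, the following minimal sketch shows a k-nearest neighbors majority vote in which a timbre mixture is treated as its own label. It is not the thesis's implementation: the three-dimensional vectors, the label names ("piano", "strings", "piano+strings"), the Euclidean distance, and k=3 are all assumptions chosen for illustration. The real feature vectors combine spectrograms, a chromagram, and the waveform.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_vectors, train_labels, query, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    nearest = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy feature vectors with hypothetical values; a mixture gets its own label.
train_vectors = [
    [0.9, 0.1, 0.0], [0.8, 0.2, 0.1],   # "piano"
    [0.1, 0.9, 0.0], [0.2, 0.8, 0.1],   # "strings"
    [0.5, 0.5, 0.0], [0.5, 0.6, 0.1],   # "piano+strings" (duo-timbre mixture)
]
train_labels = [
    "piano", "piano",
    "strings", "strings",
    "piano+strings", "piano+strings",
]

prediction = knn_predict(train_vectors, train_labels, [0.85, 0.15, 0.05], k=3)
print(prediction)
```

Treating each mixture as a separate class keeps the classifier a plain single-label method, at the cost of one extra class per combination of families.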
Abstract (Chinese)
Abstract (English)
1. Introduction
1.1 Music instruments
1.2 Timbres
1.3 Music instrument families
1.4 Literature review
1.5 Motivation
2. Methods
2.1 Training databases
2.2 Feature extraction
2.2.1 Time domain features
2.2.2 Frequency domain features
2.2.3 Pooling
2.2.4 Summary
2.3 Machine learning algorithms
2.3.1 k-nearest neighbors
2.3.2 Support vector machines
2.3.3 Neural networks
2.3.4 Nearest neighbor of sparse coding
2.3.5 Principal components analysis
2.4 Testing database
2.5 Block diagrams
2.6 Hearing test
3. Results
3.1 Cross-validation
3.2 Testing
3.3 Hearing tests and overall accuracy
3.4 Summary
4. Discussion
4.1 Classifying distribution
4.2 kNN versus NNSC
4.3 Hearing tests
5. Conclusion and future works
References
Appendix

