Graduate student (English name): Duo-Fu Bao
Thesis title (English): Supervised and Unsupervised Music Genre Classification
Keywords (English): Music genre; Unsupervised classification; Gaussian mixture model; Principal components analysis; Hierarchical agglomerative clustering
Explosive growth of the Internet and digital media has motivated recent research into techniques that help users locate their desired music styles or genres among numerous options. Existing systems for automatic genre classification follow a supervised framework that extracts genre-specific information from manually labeled music data. However, such systems may not be suitable for personal music management, because manually labeling music by genre is labor-intensive and subject to discrepancies between individuals. In this paper, we study an unsupervised paradigm for music genre classification. The aim is to partition a collection of unknown music recordings into clusters such that each cluster contains recordings of only one genre and different clusters represent different genres. This enables users to organize their personal music databases without specific knowledge of genres. To attain such a partitioning, we develop several methods for measuring the similarity between music recordings. They all start by representing each music recording as a Gaussian mixture model (GMM) and computing the likelihood of every recording under every GMM. Three inter-recording similarity measures based on these likelihoods are then derived: cross likelihood ratio, Euclidean distance, and cosine distance. We further propose using principal component analysis to enhance the similarity measurement. By applying hierarchical agglomerative clustering, the music recordings are organized into a tree of clusters. The Rand index is then estimated for each branch of the cluster tree. Motivated by the fact that the minimal value of the Rand index appears only when the number of clusters equals the true number of genres, we propose determining the optimal number of clusters by searching for the branch of the cluster tree that produces the minimal value of the Rand index. 
Our experimental results show the feasibility of clustering music recordings by genre.
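The clustering stage described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes the per-recording GMMs have already been trained and scored, so it starts from a small, made-up matrix `loglik`, where row i holds the log-likelihoods of recording i under each recording's GMM. The use of log-likelihoods, the choice of average linkage, and all function names are assumptions made for illustration. The `pair_disagreements` function is one common pair-counting form in which lower values mean better agreement, matching the "minimal value" wording of the abstract. Whether the thesis uses exactly this form of the Rand index is also an assumption.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two likelihood vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def agglomerative_clusters(vectors, num_clusters):
    """Average-linkage HAC (linkage choice is an assumption).

    Repeatedly merges the two closest clusters until num_clusters remain,
    then returns one integer cluster label per input vector.
    """
    n = len(vectors)
    dist = [[cosine_distance(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > num_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * n
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def pair_disagreements(labels_a, labels_b):
    """Count recording pairs grouped together in one partition but not the other."""
    n = len(labels_a)
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            same_a = labels_a[i] == labels_a[j]
            same_b = labels_b[i] == labels_b[j]
            if same_a != same_b:
                count += 1
    return count


# Hypothetical log-likelihoods: recordings 0 and 1 fit each other's models
# well, as do recordings 2 and 3. These numbers are invented for the example.
loglik = [
    [-10.0, -11.0, -30.0, -29.0],
    [-11.0, -10.0, -31.0, -30.0],
    [-30.0, -29.0, -10.0, -12.0],
    [-29.0, -31.0, -11.0, -10.0],
]
labels = agglomerative_clusters(loglik, 2)
true_genres = [0, 0, 1, 1]  # hypothetical ground truth, used only for evaluation
errors = pair_disagreements(labels, true_genres)
```

In this toy case the two pairs of mutually well-modeled recordings end up in separate clusters, so the disagreement count against the hypothetical ground truth is zero. Swapping `cosine_distance` for a Euclidean distance or a cross likelihood ratio would give the other two measures named in the abstract.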
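Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Research Motivation and Objectives
1.2 Research Methods
1.3 Chapter Overview
Chapter 2 Background
2.1 Preface
2.2 Related Work
2.3 Supervised Genre Classification
2.4 Unsupervised Genre Classification
Chapter 3 Supervised Music Genre Classification
3.1 System Architecture
3.2 Timbral Texture Features
3.2.1 Pre-Processing
3.2.2 Discrete Fourier Transform (DFT)
3.2.3 Triangular Filter Bank
3.2.4 Mel Frequency Cepstral Coefficients
3.2.5 Spectrum Centroid
3.2.6 Renyi Entropy
3.2.7 Spectrum Flux
3.2.8 Spectrum Rolloff
3.2.9 Delta Coefficient
3.3 Gaussian Mixture Model
3.4 Adaptive Gaussian Mixture Model
3.5 Experimental Results for Supervised Training
3.5.1 Preface
3.5.2 Experimental Data Sources
3.5.3 Supervised Genre Classification Results
3.5.4 Comparison of Genre Classification with Adaptive Gaussian Mixture Models
Chapter 4 Unsupervised Music Genre Similarity Measurement and Clustering
4.1 System Architecture
4.2 Clustering Performance Evaluation
4.3 Music Genre Similarity Measurement
4.4 Music Genre Distance Measurement
4.5 Principal Components Analysis (PCA)
4.6 Clustering Music Clips by Genre Similarity
4.7 Automatic Determination of the Number of Clusters
4.8 Experimental Results for Unsupervised Clustering
4.8.1 Experimental Data
4.8.2 Comparison of Similarity Computations in HAC
4.8.3 Effect of PCA on Clustering Results
4.8.4 Clustering Experiment Results
4.8.5 Automatic Clustering Results
Chapter 5 Conclusions and Future Work
References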
