Graduate student (English): CHEN, PO-TING
Thesis title (English): A Study on Classification of Drum Sound Characteristics Based on Convolutional Neural Networks
Advisor (English): HSU, CHIH-MING
Keywords (English): Drum Sound Classification; Characteristics; Deep Learning; CNN
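First, the audio samples are preprocessed. Features are then extracted from the samples and compared through normal distribution and difference analysis to determine how effective and independent each feature set is for classifying the different drum types. Once the features are selected, an Ablation Study evaluates how different feature combinations affect the performance of the classification model: by systematically removing or modifying the model's feature combinations, it assesses the importance of each key feature and its contribution to overall model performance. The convolutional neural network (CNN) architecture is identical across all experiments; the aim is to compare the effect of adding different drum features and thereby achieve higher accuracy than with a single feature alone.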

This study proposes a classification model built on composite drum sound feature combinations, aimed at identifying key drum sound features and improving classification accuracy. Using a publicly available drum sound database, the analysis covers a total of 2,746 drum sound samples, including bass drums, snare drums, cymbals, and claps.
First, the audio samples were preprocessed, and features were then extracted from them for normal distribution and difference analysis, to determine the effectiveness and independence of the different feature sets for the various drum types. Once the features were selected, an Ablation Study was used to assess how different feature combinations affect the performance of the classification model. By systematically removing or modifying feature combinations, this method evaluates the importance of each key feature and its contribution to the overall performance of the model. The same convolutional neural network (CNN) architecture is used in every experiment; the goal is to compare different drum feature combinations and determine whether they achieve higher accuracy than a single feature alone.
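As a rough illustration of how the feature combinations for such an ablation study might be enumerated, the sketch below lists every unordered way to fill three input slots from three feature types. The single-letter labels (M for Mel spectrogram, C for Chroma, R for RMS Energy), the three-slot layout, and the function name `feature_combinations` are assumptions made for this example, not details confirmed by the abstract.

```python
from itertools import combinations_with_replacement

# Assumed single-letter labels (hypothetical):
# M = Mel spectrogram, C = Chroma, R = RMS Energy.
FEATURES = ("M", "C", "R")


def feature_combinations(features=FEATURES, slots=3):
    """Return every unordered way to fill `slots` input slots with features.

    Repetition is allowed, so a single feature may occupy all slots
    (for example "MMM"). Slot order is ignored in this sketch.
    """
    return ["".join(combo) for combo in combinations_with_replacement(features, slots)]


combos = feature_combinations()
print(len(combos), "candidate feature combinations:")
for name in combos:
    print(name)
```

Under these assumptions the enumeration yields ten candidate combinations, each of which would be trained with the same CNN architecture so that only the input features vary between runs.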
Finally, the results were analyzed in detail using confusion matrices to compare the performance of the MRR and MMM models. The MRR model achieved an overall classification accuracy of 95%, compared with 93% for the MMM model, demonstrating the effectiveness of the proposed method.
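The following sketch shows one common way to read overall accuracy and per-class recall from a confusion matrix for the four drum classes named above. The counts in `toy_matrix` are invented purely for illustration and are not results from this study; the function names are likewise hypothetical.

```python
CLASSES = ["bass drum", "snare drum", "cymbal", "clap"]


def overall_accuracy(matrix):
    """Fraction of all samples that lie on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_recall(matrix):
    """For each true class (row), the fraction of its samples predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


# Rows are true classes, columns are predicted classes.
# Toy counts invented for illustration only; NOT results from the thesis.
toy_matrix = [
    [9, 1, 0, 0],
    [1, 8, 1, 0],
    [0, 0, 10, 0],
    [0, 2, 0, 8],
]

accuracy = overall_accuracy(toy_matrix)
recalls = per_class_recall(toy_matrix)
print(f"overall accuracy: {accuracy:.3f}")
for name, recall in zip(CLASSES, recalls):
    print(f"{name}: recall {recall:.2f}")
```

Comparing two models in this way means computing the same quantities from each model's confusion matrix on the same test set, so that per-class differences are visible alongside the overall accuracy.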

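Abstract
Acknowledgements
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Research Background
1.2 Literature Review
1.3 Research Motivation and Objectives
1.4 Thesis Organization
Chapter 2 Feature Processing
2.1 Audio Preprocessing
2.2 Feature Processing
2.2.1 Mel Spectrogram Feature Processing
2.2.2 Chroma Feature Processing
2.2.3 RMS Energy Feature Processing
2.2.4 Spectral Centroids Feature Processing
2.3 Feature Data Collection and Analysis
2.3.1 Analysis of Overall Feature Means
2.3.2 Normal Distribution Plot Analysis
2.3.3 Feasibility of Combined Feature Analysis
Chapter 3 Convolutional Neural Network
3.1 Data Processing
3.2 Model Training
3.3 Ablation Study
Chapter 4 Experimental Results and Analysis
4.1 Model Training Results
4.2 Analysis of Model Training Results
4.2.1 MRR Model Results
4.2.2 MMM Model [14] Results
4.2.3 Comparative Analysis of the Best MRR Model and the MMM Model
4.2.4 MMR Model Results
4.2.5 MCR Model Results
4.2.6 MMC Model Results
4.2.7 MCC Model Results
4.2.8 CCC Model Results
4.2.9 CCR Model Results
4.2.10 CRR Model Results
4.2.11 RRR Model Results
4.2.12 Analysis of the Effect of Combination Order in the MRR Model
4.3 Comparison of Experimental Results
4.3.1 Confusion Matrix Analysis of the MRR Model
4.3.2 Confusion Matrix Analysis of the MMM Model [14]
4.3.3 Classification Performance Comparison of the MRR and MMM Models
4.3.4 Overall Classification Performance Comparison of the MRR and MMM Models
Chapter 5 Conclusions and Future Work
5.1 Research Conclusions
5.2 Future Work
References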
