Graduate student (English name): Jen-Feng Huang
Thesis title (English): A SA-Based Feature Selection and Parameter Optimization Approach for Support Vector Machine
Advisor: 林 詩 偉
Advisor (English name): Shih-Wei Lin
Keywords (English): Support Vector Machine; Grid Search; Simulated Annealing; Feature Selection
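This study tests UCI datasets using simulated annealing and grid search, a traditional parameter search method, and compares the results with methods proposed in other literature. In addition, principal component analysis is used for feature selection and compared with the SA-SVM feature selection proposed in this study. The experimental results show that, without feature selection, the classification accuracy obtained with SA-SVM is better than that of grid search and the related literature. With feature selection, the classification accuracy of the proposed SA-SVM is also better than that of principal component analysis. SA-SVM can therefore help SVM search for the best combination of parameters and feature subset.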
The Support Vector Machine (SVM) is a new technique for pattern classification and can be used in many applications. The kernel parameter settings used in SVM training, along with feature selection, significantly affect classification accuracy. The objective of this research was to optimize the parameters and find a subset of features simultaneously, without degrading SVM classification accuracy. A simulated annealing (SA) based approach, denoted SA-SVM, is developed for parameter optimization and feature selection of SVM.
Several UCI datasets are tested using the SA-based approach and grid search, a traditional method of parameter setting. The SA-based approach is compared with grid search and with other approaches from the literature. In addition, the SA-based approach to feature selection is compared with principal component analysis. Without feature selection, the experimental results show that the classification accuracy rates of SA-SVM are better than those of grid search and the other approaches. With feature selection, the classification accuracy rates of SA-SVM are also better than those of principal component analysis. Therefore, the proposed SA-SVM approach is useful for parameter optimization and feature selection for SVM.
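The idea of searching kernel parameters and a feature subset together can be illustrated with a minimal sketch. It is not the thesis's implementation: the solution encoding (C, gamma, binary feature mask), the search ranges, the cooling schedule, and the names `neighbor`, `simulated_annealing`, and `toy_score` are all assumptions made for illustration. In particular, `toy_score` is a hypothetical stand-in for the cross-validated SVM accuracy that a real run would compute.

```python
import math
import random

# Assumed search ranges for the RBF-kernel SVM parameters (illustrative only).
C_MIN, C_MAX = 2.0 ** -5, 2.0 ** 15
G_MIN, G_MAX = 2.0 ** -15, 2.0 ** 3


def neighbor(solution, rng):
    """Return a solution that differs from `solution` in one component."""
    c, gamma, mask = solution
    move = rng.choice(["c", "gamma", "mask"])
    if move == "c":
        # Multiplicative step, clamped to the assumed range.
        c = min(max(c * math.exp(rng.uniform(-1.0, 1.0)), C_MIN), C_MAX)
    elif move == "gamma":
        gamma = min(max(gamma * math.exp(rng.uniform(-1.0, 1.0)), G_MIN), G_MAX)
    else:
        # Flip one feature bit; reject the flip if no feature would remain.
        i = rng.randrange(len(mask))
        flipped = mask[:i] + (1 - mask[i],) + mask[i + 1:]
        if any(flipped):
            mask = flipped
    return (c, gamma, mask)


def simulated_annealing(score, n_features, t0=1.0, cooling=0.95,
                        steps_per_temp=20, t_min=1e-3, seed=0):
    """Maximize score(c, gamma, mask) with a basic SA loop."""
    rng = random.Random(seed)
    current = (1.0, 1.0, tuple([1] * n_features))  # start with all features
    current_score = score(*current)
    best, best_score = current, current_score
    t = t0
    while t > t_min:
        for _ in range(steps_per_temp):
            cand = neighbor(current, rng)
            cand_score = score(*cand)
            delta = cand_score - current_score
            # Always accept improvements; accept worse moves with
            # probability exp(delta / t), which shrinks as t cools.
            if delta >= 0 or rng.random() < math.exp(delta / t):
                current, current_score = cand, cand_score
                if current_score > best_score:
                    best, best_score = current, current_score
        t *= cooling  # geometric cooling schedule (an assumption)
    return best, best_score


def toy_score(c, gamma, mask):
    """Hypothetical stand-in for cross-validated SVM accuracy."""
    fit = -abs(math.log2(c) - 3.0) - abs(math.log2(gamma) + 1.0)
    relevant = mask[0] + mask[2]   # pretend features 0 and 2 are informative
    penalty = 0.1 * sum(mask)      # mild preference for fewer features
    return fit + relevant - penalty


if __name__ == "__main__":
    best, best_score = simulated_annealing(toy_score, n_features=4)
    print(best, best_score)
```

In a real experiment, `toy_score` would be replaced by a function that trains an SVM on the features selected by `mask` with the given `C` and `gamma` and returns its cross-validated accuracy. Grid search, by contrast, would evaluate a fixed lattice of (C, gamma) pairs and would not touch the feature mask.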
Acknowledgments
Abstract
Table of Contents
List of Tables
List of Figures
1. Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Research Limitations
1.4 Research Process
2. Literature Review
2.1 Support Vector Machine
2.1.1 Linear Support Vector Machine
2.1.2 Nonlinear Support Vector Machine
2.1.3 Nonlinear Support Vector Machine with Imperfect Separation
2.2 Simulated Annealing
2.3 Hide-and-seek SA
2.4 Methods for Finding an Initial Solution
2.5 Feature Selection
2.5.1 Wrapper model
2.5.2 Filter model
2.6 Related Parameter Search Methods
2.7 Basic Concepts of Principal Component Analysis
2.8 Review of Related Literature
3. Research Method
3.1 Encoding Scheme
3.2 Experimental Procedure for Finding the Best SVM Classification Accuracy
3.3 Data Processing and Environment
3.3.1 Development Environment
3.3.2 Data Preprocessing and Normalization
3.3.3 Data Classification Method
3.4 Parameter Selection
4. Experimental Results
4.1 Experimental Results
4.1.1 Comparison with the Literature
4.1.2 Statistical Tests of the Experimental Results
5. Conclusions and Future Research
References
