Graduate student (English name): Chen, I-Ling
Thesis title (English): Combining Ensemble Technique of Support Vector Machines with the Optimal Kernel Method for High Dimensional Data Classification
Advisor (English name): Kuo, Bor-Chen
Oral defense committee (English names): Chang, Jyh-Yeong; Taur, Jin-Shiuh; Yang, Jinn-Min
Keywords (English): pattern recognition; dynamic subspace method; optimal kernel; SVM
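In recent years, support vector machines (SVMs) and classifier ensembles have been widely and successfully applied to improve classification performance on high-dimensional data and to address the problems caused by the Hughes phenomenon. Many studies have confirmed that multiple classifier systems, such as the random subspace method and the dynamic subspace method, build a group of diverse base classifiers by generating different feature subspaces; this eases the difficulties of small-sample, high-dimensional settings and yields better classification results than a single classifier. In addition, many studies have shown that the SVM is a well-founded and effective classifier, and that it also achieves good classification accuracy when used as the base classifier in these two multiple classifier systems. However, the main factor governing SVM classification performance is the kernel function. Selecting a suitable kernel function, or suitable parameters for that kernel function, is therefore very important for the SVM.
This study integrates the strengths of the above methods. For the SVM classifier, it uses an optimal kernel method to develop a multiple classifier system suited to high-dimensional data analysis, and it proposes a multiple-SVM ensemble that incorporates the optimal kernel method to select the subspace dimensionality and the feature subspace automatically. A method that automatically selects the best parameter of the RBF kernel function is used to find a kernelized space suited to classifying data of each dimensionality. In the subspace selection step, the concept of the dynamic subspace method is introduced by adding two importance density distribution functions: one automatically selects the dimensionality of the subspace, and the other selects the features of that subspace. The aim is to strengthen the classification performance of the previously developed dynamic subspace method. The experimental results show that the proposed approach performs better at selecting a more suitable kernel function and that, compared with DSM, it achieves clear improvements in both reducing computation time and increasing classification accuracy.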

In recent years, support vector machines (SVMs) and classifier ensembles have been widely and successfully used to improve the classification of high-dimensional data and to mitigate the Hughes phenomenon. Many studies have demonstrated that multiple classifier systems, also called ensembles, can alleviate the problems that arise from small sample sizes and high dimensionality, and can obtain better and more robust results than single models. Examples are the random subspace method (RSM) and the dynamic subspace method (DSM), both of which are effective approaches for generating an ensemble of diverse base classifiers from different feature subsets. In addition, the SVM, which is regarded as a useful and effective classifier, can serve as the base classifier in both methods to achieve higher classification accuracy.
However, the performance of an SVM depends heavily on choosing a proper kernel function or proper parameters for that kernel function. Therefore, the objectives of this research are to develop an SVM-based ensemble technique using the optimal kernel method and to propose a novel subspace selection mechanism, named the kernel-based dynamic subspace method (KDSM). KDSM combines the optimal kernel method with the advantages of DSM to improve SVM-based classification results. The experimental results show that the proposed method performs better than the other conventional methods; moreover, compared with DSM, it both improves classification accuracy and reduces computation time.
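
The subspace-ensemble idea summarized in the abstract can be illustrated with a minimal sketch. This is not the thesis's KDSM implementation: the function names (`rbf_kernel`, `sample_subspace`, `majority_vote`), the importance weights, and the kernel width `sigma` are all hypothetical, and the thesis's procedure for choosing the RBF parameter automatically is not reproduced here. The sketch only shows the general pattern of drawing a subspace size and a feature subset from two weight distributions, evaluating an RBF kernel (in one common parameterization) on the selected features, and combining base-classifier outputs by majority vote.

```python
import math
import random
from collections import Counter


def rbf_kernel(x, z, sigma):
    """One common RBF form: exp(-||x - z||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def sample_subspace(feature_weights, dim_weights, rng):
    """Draw one feature subset from two (hypothetical) importance distributions.

    dim_weights[k] weights the choice of a subspace with k + 1 features;
    feature_weights[j] weights the choice of feature j.
    All weights are assumed to be strictly positive.
    """
    n_features = len(feature_weights)
    # Step 1: draw the subspace dimensionality.
    sizes = range(1, len(dim_weights) + 1)
    dim = min(rng.choices(sizes, weights=dim_weights, k=1)[0], n_features)
    # Step 2: draw `dim` distinct features, weighted by importance.
    remaining = list(range(n_features))
    weights = list(feature_weights)
    chosen = []
    for _ in range(dim):
        idx = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        chosen.append(remaining.pop(idx))
        weights.pop(idx)
    return sorted(chosen)


def project(x, subset):
    """Keep only the features of x listed in subset."""
    return [x[j] for j in subset]


def majority_vote(labels):
    """Combine base-classifier predictions by taking the most frequent label."""
    return Counter(labels).most_common(1)[0][0]


if __name__ == "__main__":
    rng = random.Random(0)
    subset = sample_subspace([0.1, 0.4, 0.3, 0.2], [0.2, 0.5, 0.3], rng)
    x, z = [1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.0, 4.5]
    print(subset, rbf_kernel(project(x, subset), project(z, subset), sigma=1.0))
```

In a full ensemble, each sampled subset would be used to train one base SVM on the projected training data, and `majority_vote` would combine the per-classifier predictions for each test sample.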

1.1 Statement of Research
1.2 Organization of Thesis
1.3 Major Notation and Acronyms
2.1 Ensemble Method
2.1.1 Random subspace method
2.1.2 Dynamic subspace method
2.2 Support Vector Machine
2.2.1 Kernel Method
2.2.2 SVM Algorithm
2.3 An Optimal Kernel Method for selecting RBF Kernel Parameter
3.1 Importance Distribution of Band Membership
3.2 Importance Distribution of Dimensionality Weight
3.3 Optimal Kernel-based Dynamic Subspace Ensemble
4.1 Experimental Design
4.1.1 Datasets of experiment
4.2 Experimental Results

