National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)


Detailed Record

Author: 顏伃伶
Author (English): Yu-ling Yen
Title: 資料擴充、後處理運算、及模糊集合對支持向量機預測次序類別資料之影響
Title (English): The Impacts of Dimension Extension, Postprocess, and Fuzzy Sets to Support Vector Machines on Predicting Ordinal Classes
Advisor: 蕭文峰
Advisor (English): Wen-Feng Hsiao
Degree: Master's
Institution: 國立屏東商業技術學院 (National Pingtung Institute of Commerce)
Department: Information Management
Discipline: Computer Science
Academic Field: Computer Science, General
Thesis Type: Academic thesis
Publication Year: 2009
Graduation Academic Year: 97 (2008-2009)
Language: Chinese
Number of Pages: 61
Keywords (Chinese): 模糊集合 (fuzzy sets), 次序類別 (ordinal classes), 支持向量機 (support vector machines)
Keywords (English): Ordinal Class, Fuzzy Sets Theory, Support Vector Machine
Times Cited: 0
Views: 263
Downloads: 0
Bookmarked: 0
Abstract:
The prediction of ordinal-scale data has long been a difficult, unresolved problem in machine learning and data mining. Because Support Vector Machines (SVMs) have proven to be robust and computationally efficient, performing well on both multi-class prediction and regression, many researchers have recently tried to apply them to the prediction of ordinal classes. Some treat ordinal classes as continuous values and approach the task as ordinal regression; others treat ordinal values as discrete categories and approach it as ordinal classification. We believe each view has its advantages and drawbacks: the discrete view is more suitable when the number of classes is small, while the continuous view is more reasonable when the number of classes is large. Owing to time constraints, however, the present study examines only the discrete view.
Among the existing discrete-view methods, Cardoso and da Costa (2007) proposed the ordinal Support Vector Machine (oSVM), which replicates the data into a higher-dimensional space (by adding dummy variables) and uses a class-ladder scheme to redefine the positive and negative classes of the binary problems, so that the SVM can separate the margins of adjacent classes simultaneously. Their experiments showed that the method's prediction accuracy exceeds that of existing ordinal classification methods. Because the idea is intuitive and convincing, this thesis builds on their method and proposes possible improvements.
Testing oSVM on different datasets, we found that when the input attributes have only a partial functional relationship, or none at all, with the target attribute (as in real-world datasets), its performance is rather poor and its classification errors tend to be large, for example predicting class x as class y with |y-x|>1. When the input attributes have a clear functional relationship with the target attribute (as in synthetic datasets), it usually performs better and its errors are small, for example predicting class x as class x-1 or x+1 (one rank up or down). We believe the first kind of error arises because the dataset contains too much noise, so the separating hyperplane found by oSVM is poor; past research has shown that SVMs are quite sensitive to noise, and fuzzy sets can overcome precisely this weakness. The second kind of error, we believe, is caused by the vagueness of class boundaries and can be improved by defining the degree to which each instance belongs to each class.
To address both kinds of error, we designed three fuzzy membership functions. First, to combat noise, we represent each class's core concept by its center (computed as the attribute-wise mean or median over all instances of the class). The closer an instance is to the center (by Euclidean distance), the more it embodies the class concept and the closer its membership degree should be to 1; the farther away it is, the more likely it is noise, and the closer its membership degree should be to 0. The second membership function additionally considers each instance's distance to the separating hyperplane, defining the membership degree as that distance divided by the center's distance to the hyperplane; degrees above 1 are set to 1 and degrees below 0 are set to 0. This makes the center the dividing point for a class: instances farther from the hyperplane than the center all receive membership 1, while instances on the wrong side receive membership 0 (they are treated as noise, with importance 0). The underlying idea is that points close to the hyperplane are the ones likely to be noise. Second, to address the vagueness of class boundaries (the second kind of error), we define a third membership function that determines each instance's degree of membership in each class. Concretely, a fuzzy width reshapes the membership function: the class center serves as the center of a trapezoidal membership function and is extended left and right by an appropriate fuzzy width (0.5 to 1.5 standard deviations, where the standard deviation is that of the distances of all the class's points to the hyperplane) to form the upper base, while the lower base always extends to the position reached by the left-side expansion of the next class. An instance's membership degree is then determined by where its distance to the hyperplane falls on this trapezoidal function.
The experimental results show that the proposed methods outperform the traditional OrdinalClassifier and SVR. Among the three membership functions, the third performs best, because its fuzzy width can be tuned to fine-adjust the classifier. We also found, however, that all three membership functions bring only limited improvement on the second kind of error; future work will therefore build a boosting model to raise prediction accuracy.
Abstract (English):
The prediction of ordinal-scale data is a long-standing, difficult, and unsolved problem in machine learning and data mining. The Support Vector Machine (SVM) has proven to be a robust, well-performing algorithm and has been applied successfully to multi-class and regression problems, so several researchers have tried to apply it to the prediction of ordinal classes. Among them, some propose to treat ordinal values as a continuous metric and to solve the problem from the view of ordinal regression; others propose to treat ordinal values as discrete categories and to solve it from the view of ordinal classification. We believe both views have their pros and cons, but we tend toward an adaptive view: ordinal classification should be adopted when the number of class values is small, and ordinal regression when the number of class values is large. Because of time limitations, this research focuses only on the ordinal classification problem.
Among ordinal classification studies, Cardoso and da Costa (2007) proposed a method called the ordinal support vector machine (oSVM), in which the data are replicated into higher dimensions by inserting dummy variables, and the outputs are redefined as +1 or -1 for binary classification according to their level on the ladder of classes. In this way, the support vector machine can simultaneously learn the margins between neighboring classes. They demonstrated experimentally that their method's prediction accuracy outperforms other methods. Because their concept is straightforward and convincing, this research builds on their method and proposes some improvements.
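As a rough illustration of the replication step, the following Matlab sketch reflects our reading of the description above; the function name replicateOrdinal, the constant h, and the exact shape of the dummy block are illustrative assumptions, not Cardoso and da Costa's actual formulation.

    % Sketch of the data replication idea: one binary subproblem per
    % boundary between adjacent classes, marked by a block of dummy
    % variables. (Our reading of the description above; h is assumed.)
    function [Xr, yr] = replicateOrdinal(X, y, K, h)
    % X: n-by-d inputs; y: n-by-1 ordinal labels in 1..K.
        n  = size(X, 1);
        Xr = [];
        yr = [];
        for q = 1:K-1                     % boundary between classes q and q+1
            E = zeros(n, K-1);
            E(:, q) = h;                  % dummy variables marking boundary q
            Xr = [Xr; [X, E]];            % replica lives in d+(K-1) dimensions
            yr = [yr; 2*(y > q) - 1];     % +1 if the class lies above q, else -1
        end
    end

Training one binary SVM on the replicated pairs (Xr, yr) then yields, in effect, one boundary per adjacent class pair in the original space, which is how we understand the simultaneous learning of neighboring margins.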
By empirically testing oSVM on several datasets, we found that when the input attributes have only a partial functional relationship, or none at all, with the output attribute (as in real datasets), oSVM does not perform well: it tends to misclassify class x as class y, where y is larger or smaller than x by two or more ranks. But when the inputs and the output have an obvious functional relationship (as in synthetic datasets), oSVM usually performs better, and its errors are mostly near-class misclassifications, such as classifying class x as class x-1 or class x+1 (one rank up or down). We believe the first kind of error occurs because there is too much noise in the dataset for oSVM to learn a good hyperplane; past research has pointed out that SVMs deteriorate easily on noisy data, and fuzzy set theory is well suited to resolving the problem of noise. We believe the second kind of error is caused by the vagueness between class boundaries and can therefore be relieved by defining a degree of class membership for each instance.
To resolve the two problems above, we designed three different fuzzy membership functions. First, to combat noise, we define a centroid for each class (whose attributes are the mean or median of each attribute over all instances of the class) to represent the class's core concept. The closer a sample is to the centroid (by Euclidean distance), the more representative it is, and the closer its membership degree is to 1; the farther away it is, the more likely it is noise, and the closer its membership degree is to 0. The second membership function further considers the distance of a sample to the discriminant hyperplane of the SVM: we define the membership degree as the distance of the sample to the hyperplane divided by the distance of the centroid to the hyperplane, set to 1 if the value exceeds 1 and to 0 if it falls below 0. This effectively makes the centroid the splitting point: any sample farther from the hyperplane than the centroid receives membership degree 1, and any sample on the opposite side from the centroid receives membership degree 0 (such samples are deemed noise, so their importance is set to 0). The reasoning is that only points near the hyperplane are likely to be noise. Second, to resolve the vagueness between class boundaries (the second kind of error), we define a third membership function that decides each instance's degree of membership in each class. Concretely, a fuzzy width changes the span of the membership function: the centroid is the center of a trapezoidal membership function, extended left and right by an appropriate fuzzy width (between 0.5 and 1.5 standard deviations of the distances of all the class's points to the hyperplane) to form the upper base, while the lower base extends further to the centroids of the nearest classes (one on each side). The membership degree is decided by where the instance falls on this trapezoidal function.
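The thesis appendix lists the authors' actual Matlab code for these functions; the sketches below are only our reading of the description above (each function would live in its own .m file), with assumed names, and with details the abstract leaves open, such as the normalization in the first function and the feet of the trapezoid in the third, filled in as labeled assumptions.

    % Membership function 1 (sketch): distance to the class centroid,
    % with an assumed linear normalization so mu is 1 at the centroid
    % and approaches 0 for the farthest, most noise-like sample.
    function mu = membership1(X, c)
    % X: n-by-d samples of one class; c: 1-by-d centroid (attribute-wise
    % mean or median over the class).
        d  = sqrt(sum((X - c).^2, 2));    % Euclidean distance to the centroid
        mu = 1 - d ./ (max(d) + eps);     % near the centroid -> mu close to 1
    end

    % Membership function 2 (sketch): distance to the hyperplane,
    % normalized by the centroid's distance and clipped to [0, 1].
    function mu = membership2(dX, dC)
    % dX: signed distances of the samples to the hyperplane (positive on
    % the class's own side); dC: the centroid's distance to the hyperplane.
        mu = min(max(dX ./ dC, 0), 1);    % wrong side -> 0, i.e. treated as noise
    end

    % Membership function 3 (sketch): trapezoid on the distance-to-
    % hyperplane axis; footL and footR, where membership reaches 0, are
    % placed at the neighbouring classes' positions (an assumption).
    function mu = membership3(dX, c, w, s, footL, footR)
    % dX: sample distances to the hyperplane; c: class center; s: std.
    % dev. of the class's distances; w: fuzzy width in [0.5, 1.5].
        a = c - w*s;
        b = c + w*s;                                  % upper base [a, b]
        mu = zeros(size(dX));
        mu(dX >= a & dX <= b) = 1;                    % flat top
        up   = dX > footL & dX < a;
        down = dX > b & dX < footR;
        mu(up)   = (dX(up) - footL) ./ (a - footL);   % rising edge
        mu(down) = (footR - dX(down)) ./ (footR - b); % falling edge
    end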
The experimental results show that the proposed method, the Fuzzy Ordinal Support Vector Machine (FoSVM), effectively reduces the error rate and the mean absolute error of oSVM and performs better than the traditional OrdinalClassifier and Support Vector Regression. Among the three membership functions, the third, whose fuzzy-width parameter allows the classifier to be fine-tuned, performs best, and the second is better than the first in most cases. However, we also find that the three membership functions have no salient effect on reducing the second kind of error. Future work can therefore try to build a boosting-like model to improve prediction accuracy.
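For concreteness, the two measures named above can be computed as in the toy Matlab lines below (the labels are ours): the zero-one error rate counts every miss equally, while the mean absolute error penalizes a miss by how many ranks it is off, so it distinguishes near-class errors (one rank) from the serious ones.

    % Toy illustration of the two evaluation measures (values are ours).
    yTrue = [1 2 2 3 4 5];                % true ordinal labels
    yPred = [1 2 3 3 2 5];                % predicted labels
    errRate = mean(yPred ~= yTrue);       % zero-one error rate: 2/6
    mae     = mean(abs(yPred - yTrue));   % MAE: (0+0+1+0+2+0)/6 = 0.5
    fprintf('error rate = %.3f, MAE = %.3f\n', errRate, mae);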
Table of Contents:
1. Introduction 1
1.1. Research Background and Motivation 1
1.2. Research Objectives 3
2. Literature Review 4
2.1. Classification of Ordinal Classes 4
2.1.1. Regression with Postprocessing 4
2.1.2. Ordinal Decision Trees 5
2.1.3. Other Related Work on Ordinal Classification 7
2.2. Support Vector Machine (SVM) 10
2.2.1. Support Vector Machine for Classification (SVC) 11
2.2.2. Support Vector Machine for Regression (SVR) 15
2.2.3. Support Vector Machine for Ordinal Data (oSVM) 18
2.3. Fuzzy Set Theory 25
2.3.1. Definition of Fuzzy Sets 25
2.3.2. Types of Membership Functions 27
2.3.3. Introduction to Different Scale Levels 30
2.4. Fuzzy Support Vector Machines 32
3. Research Method 35
3.1. Preprocessing and Postprocessing 35
3.2. Membership Functions Based on Fuzzy Set Theory 38
3.2.1. Membership Function 1 39
3.2.2. Membership Function 2 40
3.2.3. Membership Function 3 42
4. Experimental Results and Discussion 44
4.1. Experimental Comparison 45
4.2. Parameter Tuning 46
4.3. Experiments 48
4.3.1. Experiment 1 48
4.3.2. Experiment 2 51
5. Conclusions and Future Research Directions 54
6. References 55
7. Appendix 58
7.1. Source Code of the Membership Functions 58
7.1.1. Matlab Source Code of Membership Function 1 58
7.1.2. Matlab Source Code of Membership Function 2 59
7.1.3. Matlab Source Code of Membership Function 3 60
7.2. C-Value Multiplier Test of FoSVM on the Auto MPG Dataset 61
References:
1.趙俊喨 (2005), "Applying Fuzzy Logistic Model Trees to Ordinal-Scale Classification" (in Chinese), Master's thesis, Institute of Information Management, National Pingtung Institute of Commerce, Pingtung, Taiwan.
2.Bennett, K. P., and Mangasarian, O. L.(1992), "Robust linear programming discrimination of two linearly inseparable sets," Optimization Methods and Software 1, pp. 23–34 (Gordon & Breach Science Publishers).
3.Cardoso, J. S., and da Costa, J. F. P. (2007), "Learning to classify ordinal data: The data replication method," Journal of Machine Learning Research, Vol. 8, pp. 1393–1429.
4.Cardoso, J. S., da Costa, J. F. P., and Cardoso, M. J. (2005), "Modelling ordinal relations with SVMs: An application to objective aesthetic evaluation of breast cancer conservative treatment," ScienceDirect, Neural Networks, Vol. 18, Issues 5-6, pp. 808-817.
5.Chang, C.-C., and Lin, C.-J. (2001), LIBSVM: a library for support vector machines, http://www.csie.ntu.edu.tw/~cjlin/libsvm.
6.Chu, W. and Keerthi, S. S. (2007), "Support vector ordinal regression," Neural Computation, Vol. 19, No. 3, pp. 792–815.
7.Cortes, C., and Vapnik, V. (1995), "Support-Vector Networks," Machine Learning, Vol. 20, pp. 1-25.
8.Foody, G. M., and Mathur, A. (2004), "Toward intelligent training of supervised image classifications: directing training data acquisition for SVM classification," Remote Sensing of Environment, Vol. 93, Issues 1-2, pp. 107-117.
9.Frank, E., and Hall, M. (2001), "A simple approach to ordinal classification," Proceedings of the 12th European Conference on Machine Learning, pp. 145-156. Berlin: Springer-Verlag.
10.Freund, Y., Iyer, R., Schapire, R. E., and Singer, Y. (2003), "An efficient boosting algorithm for combining preferences," Journal of Machine Learning Research, Vol. 4, pp. 933–969.
11.Herbrich, R., Graepel, T., and Obermayer, K. (1999), "Regression models for ordinal data: a machine learning approach," Technical report, TU Berlin, Germany.
12.Huang, K.Y. (2002), "The use of a newly developed algorithm of divisive hierarchical clustering for remote sensing image analysis," International Journal of Remote Sensing, Vol. 23, No. 16, pp. 3149–3168.
13.Joachims, T. (2004), SVMlight V6.01, http://svmlight.joachims.org/.
14.Klir, G. J., and Yuan, B. (1995), Fuzzy Sets and Fuzzy Logic: Theory and Applications, Prentice Hall PTR, Upper Saddle River, NJ, pp. 11-19.
15.Kramer, S., Widmer, G., Pfahringer, B., and de Groeve, M. D. (2000), "Prediction of ordinal classes using regression trees," in Z. W. Ras and S. Ohsuga (Eds.), Proceedings of the 12th International Symposium, ISMIS 2000, Charlotte, NC, USA, pp. 665-674. Berlin: Springer.
16.Landwehr, N., Hall, M., and Frank, E. (2003), "Logistic Model Trees," http://www.cs.waikato.ac.nz/~ml/publications/2003/landwehr-etal.pdf
17.Lee, J., Yeung, D., and Wang, X. (2003), "Monotonic decision tree for ordinal classification," IEEE International Conference on Systems, Man and Cybernetics, Vol. 3, pp. 2623–2628.
18.Li, L., and Lin, H.-T. (2007), "Ordinal regression by extended binary classification," in B. Schölkopf, J. C. Platt, and T. Hofmann (Eds.), Advances in Neural Information Processing Systems: Proceedings of the 2006 Conference, Vol. 19, pp. 865–872. MIT Press.
19.Lin, C.-F., and Wang, S.-D. (2002), "Fuzzy Support Vector Machines," IEEE Transactions on Neural Networks, Vol. 13, No. 2, pp. 464–471.
20.Lin, H.-T., and Li, L. (2006), "Large-margin thresholded ensembles for ordinal regression: Theory and practice," in J. L. Balcázar, P. M. Long, and F. Stephan (Eds.), Algorithmic Learning Theory, Vol. 4264 of Lecture Notes in Artificial Intelligence, pp. 319–333. Springer-Verlag.
21.McCullagh, P., and Nelder, J. A. (1989), Generalized linear models, New York: Chapman and Hall.
22.Parrella, F. (2007), "Online Support Vector Regression," thesis, Department of Information Science, University of Genoa, Italy.
23.Rajaram, S., Garg, A., Zhou, X. S., and Huang, T. S. (2003), "Classification approach towards ranking and sorting problems," in N. Lavrac, D. Gamberger, L. Todorovski, and H. Blockeel (Eds.), Machine Learning: Proceedings of the 14th European Conference on Machine Learning, Vol. 2837 of Lecture Notes in Computer Science, pp. 301–312. Springer-Verlag.
24.Schölkopf, B., and Smola, A. J. (2002), Learning with Kernels, MIT Press, Cambridge, MA.
25.Schölkopf, B., Bartlett, P., Smola, A., Williamson, R. (1998), "Support Vector Regression with Automatic Accuracy Control," Proceedings of ICANN'98, Perspectives in Neural Computing, pp. 111-116.
26.Shafri, H.Z.M., and Ramle, F.S.H. (2009), "A Comparison of Support Vector Machine and Decision Tree Classifications Using Satellite Data of Langkawi Island," Information Technology Journal, Vol. 8, Issue 1, pp. 64-70.
27.Shashua, A., and Levin, A. (2003), "Ranking with large margin principle: Two approaches," in Advances in Neural Information Processing Systems (NIPS), pp. 937–944.
28.Shilton, A., and Lai, D. T. H. (2007), "Iterative Fuzzy Support Vector Machine Classification," FUZZ-IEEE 2007, IEEE International Fuzzy Systems Conference, pp. 1–6.
29.Smola, A. J., and Schölkopf, B. (2003), "A tutorial on support vector regression," Statistics and Computing, Vol. 14, No. 3, pp. 199–222.
30.Tan, C.P., Lani, N.F.M., and Lai, W.K. (2008), "Application of Support Vector Machine Classifier for Security Surveillance System," Advances in Computer Science and Technology, Langkawi, Malaysia.
31.Vapnik, V. (1999), The Nature of Statistical Learning Theory, 2nd edition, Springer-Verlag, New York, pp. 132-183.
32.Wikipedia (2009), "Support vector machine," http://en.wikipedia.org/wiki/Support_vector_machine.
33.Xia, F., Tao, Q., Wang, J., and Zhang, W. (2007a), "Recursive feature extraction for ordinal regression," Neural Networks, IJCNN 2007, pp. 78–83.
34.Xia, F., Zhou, L., Yang, Y., and Zhang, W. (2007b), "Ordinal regression as multiclass classification," International Journal of Intelligent Control and Systems, Vol. 12, No. 3, pp. 230–236.
35.Zadeh, L. A. (1965), "Fuzzy Sets", Information and Control, Vol. 8, No. 3, pp. 338-353.