National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

Author: 施婉婷 (Wan-Ting Shih)
Thesis title: 普適提之隨機性弱分類器探討 (Random Base Learners for Boosting)
Advisor: 曹振海 (Chen-Hai Tsao)
Degree: Master's
Institution: National Dong Hwa University (國立東華大學)
Department: Department of Applied Mathematics
Discipline: Mathematics and Statistics
Field: Mathematics
Document type: Academic thesis
Publication year: 2010
Graduation academic year: 98 (ROC calendar; 2009-2010)
Language: English
Pages: 44
Chinese keywords: 決策樹 (decision tree), random coordinate descent, 普適提 (boosting), 感知機 (perceptron)
English keywords: perceptron, decision tree, boosting, random base learner
Usage statistics:
  • Cited: 1
  • Viewed: 330
  • Downloaded: 52
  • Bookmarked: 0
Chinese abstract (translated): Boosting is one of the machine learning methods that has gained popularity in recent years. Its idea is to combine many base classifiers into a single, stronger classifier, so the base learner used within a boosting algorithm plays a crucial role, and how to choose base learners has become one of the important questions in boosting research. The base learner most commonly used in current work is the decision tree; indeed, Breiman (1996) called boosting with trees the "best off-the-shelf classifier in the world". This raises the question of whether other base learners could serve as alternatives. The random coordinate descent (RCD) algorithm proposed by Li (2005) is a classifier that starts from the perceptron and injects randomness into its training. Unlike methods that work with a smooth approximation of the loss function, RCD solves directly for the weights that minimize the training error. Li's (2005) experiments also show that RCD paired with boosting performs well; unfortunately, those experiments do not include a comparison with boosting using decision trees. In this thesis we supply that missing comparison, use simulated data to further examine how decision trees and RCD each behave as base learners for boosting, and, based on our experimental results, propose some modifications of RCD.
English abstract: A boosting algorithm combines (weak) base learners to produce a stronger and more effective learner, so the choice of base learners plays an important role in boosting. Breiman (NIPS workshop, 1996) calls AdaBoost with trees the "best off-the-shelf classifier in the world". While the practice of using tree-based learners such as the decision stump as the de facto base learner is overwhelming, the search for good alternative base learners remains an important research topic from both theoretical and practical perspectives. Li (2005) proposes a random coordinate descent (RCD) base learner by extending the idea of the perceptron and combining it with dimension reduction. In contrast to many current base learners, which usually minimize some smooth approximation of the expected loss, RCD base learners directly minimize the (nonsmooth) training error. The experiments in Li (2005) indicate that RCD base learners are comparable to or better than several other base learners in terms of computational efficiency and testing error. Unfortunately, the most widely used tree-based base learners are not compared therein. In our study, we supplement those comparisons with experiments on tree-based learners in settings similar to those of Li (2005). In addition, our experiments and benchmark data analyses of RCD base learners with AdaBoost suggest that caution should be taken in practical implementation. Modifications of RCD base learners and their comparison with some random base learners are investigated.
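Since the abstract pairs AdaBoost with an RCD perceptron base learner, a compact illustration may help fix ideas. The following is a minimal Python/NumPy sketch, not the thesis's code: rcd_perceptron (a name introduced here for illustration) repeatedly picks one random coordinate of the perceptron's weight vector, or the bias, and sets it by an exact line search over the piecewise-constant weighted 0-1 error, which is the "direct minimization" idea behind Li's (2005) RCD; adaboost is the standard discrete AdaBoost loop of Freund and Schapire (1997). The sketch omits Li's refinements such as random restarts and more careful candidate generation.

```python
import numpy as np

def weighted_error(w, b, X, y, sw):
    """Weighted 0-1 error of the perceptron sign(X @ w + b), labels in {-1, +1}."""
    pred = np.where(X @ w + b >= 0, 1, -1)
    return sw[pred != y].sum()

def rcd_perceptron(X, y, sw, n_updates=50, rng=None):
    """Simplified RCD-style base learner (an illustrative sketch, not Li's exact
    algorithm): pick a random coordinate of (w, b) and set it to a value that
    exactly minimizes the weighted training error.  Along one coordinate the
    error is piecewise constant, so probing both sides of every breakpoint
    (where one sample's decision value crosses zero) finds an exact minimizer."""
    rng = np.random.default_rng() if rng is None else rng
    d = X.shape[1]
    w, b = rng.standard_normal(d), 0.0
    for _ in range(n_updates):
        j = rng.integers(d + 1)                 # index d plays the role of the bias
        if j == d:
            breakpoints = -(X @ w)              # sample i flips sign at b = -(x_i . w)
        else:
            rest = X @ w + b - X[:, j] * w[j]   # decision values without coordinate j
            with np.errstate(divide="ignore", invalid="ignore"):
                breakpoints = -rest / X[:, j]
            breakpoints = breakpoints[np.isfinite(breakpoints)]
        best_val = b if j == d else w[j]
        best_err = weighted_error(w, b, X, y, sw)
        for c in np.unique(breakpoints):
            for v in (c - 1e-8, c + 1e-8):      # one probe in each adjacent interval
                if j == d:
                    err = weighted_error(w, v, X, y, sw)
                else:
                    w_try = w.copy()
                    w_try[j] = v
                    err = weighted_error(w_try, b, X, y, sw)
                if err < best_err:
                    best_err, best_val = err, v
        if j == d:
            b = best_val
        else:
            w[j] = best_val
    return w, b

def adaboost(X, y, n_rounds=50, seed=0):
    """Discrete AdaBoost (Freund and Schapire, 1997) with the RCD-style
    perceptron above as the base learner."""
    rng = np.random.default_rng(seed)
    sw = np.full(len(y), 1.0 / len(y))          # uniform initial sample weights
    ensemble = []
    for _ in range(n_rounds):
        w, b = rcd_perceptron(X, y, sw, rng=rng)
        pred = np.where(X @ w + b >= 0, 1, -1)
        err = sw[pred != y].sum()
        if err >= 0.5:                          # base learner no better than chance
            break
        alpha = 0.5 * np.log((1.0 - err) / max(err, 1e-12))
        sw = sw * np.exp(-alpha * y * pred)     # up-weight misclassified samples
        sw /= sw.sum()
        ensemble.append((alpha, w, b))
    return ensemble

def predict(ensemble, X):
    """Sign of the weighted vote of all base learners."""
    score = sum(a * np.where(X @ w + b >= 0, 1, -1) for a, w, b in ensemble)
    return np.where(score >= 0, 1, -1)
```

With labels y in {-1, +1}, predict(adaboost(X, y), X) returns the ensemble's weighted-majority vote; swapping rcd_perceptron for a decision stump would give the tree-based baseline that the thesis adds to Li's comparison.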
Table of Contents

1 Introduction
1.1 Boosting
1.2 Motivation

2 Base Learners
2.1 Classification Problems
2.2 Review of Base Learners
2.2.1 Decision Tree
2.2.2 Naive Regression
2.2.3 Logistic Regression
2.2.4 Support Vector Machine (SVM)
2.2.5 Perceptron
2.3 Random Coordinate Descent Algorithm (Li, 2005)

3 RCD and Its Variants
3.1 Randomness and Variations in RCD
3.2 Randomization Constrained by Principal Components
3.3 Variable Selection

4 Numerical Results
4.1 UCI Machine Learning Data
4.2 Simulation Data
4.3 Bayes Procedure

5 Conclusion

References

Agresti, A. (1990). Categorical Data Analysis, 2nd Edition. New York: John Wiley.

Breiman, L. (1998). Arcing Classifiers (with Discussion). Annals of Statistics, 26, 801-849.

Breiman, L. (2004). Population Theory for Boosting Ensembles. Annals of Statistics, 32, 1-11.

Bühlmann, P. and Yu, B. (2008). Response to Mease and Wyner: Evidence Contrary to the Statistical View of Boosting. Journal of Machine Learning Research, 9, 187-194.

Freund, Y. and Schapire, R. E. (1997). A Decision-Theoretic Generalization of On-Line Learning and An Application to Boosting. Journal of Computer and System Sciences, 55, 119-139.

Friedman, J., Hastie, T., and Tibshirani, R. (2000). Additive Logistic Regression: A Statistical View of Boosting. Annals of Statistics, 28, 337-407.

Hastie, T., Tibshirani, R. and Friedman, J. (2001). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 1st Edition. New York: Springer.

Jiang, W. (2004). Process Consistency for AdaBoost. Annals of Statistics, 32, 30-55.

Lee, J.C. (2005). Some Statistical Aspects of Credit Scoring. International Association for Statistical Computing 3rd World Conference on Computational Statistics & Data Analysis.

Li, L. (2005). Perceptron Learning with Random Coordinate Descent. Computer Science Technical Report CaltechCSTR:2005.006, California Institute of Technology, Pasadena, CA.

Mease, D. and Wyner, A. (2008). Evidence Contrary to the Statistical View of Boosting. Journal of Machine Learning Research, 9, 131-156.

Meir, R. and Rätsch, G. (2003). An Introduction to Boosting and Leveraging. Advanced Lectures on Machine Learning, LNCS, Springer, 119-184.

Rosenblatt, F. (1958). The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain. Psychological Review, 65, 386-408.