臺灣博碩士論文加值系統 (Taiwan National Digital Library of Theses and Dissertations)

Detailed Record

Author: Tian-Liang Huang (黃天亮)
Title (Chinese): 二階正規化多標籤線性分類器比較
Title (English): Comparison of L2-Regularized Multi-Class Linear Classifiers
Advisor: Chih-Jen Lin (林智仁)
Degree: Master's
Institution: National Taiwan University
Department: Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2010
Graduation academic year: 98 (2009–2010)
Language: English
Pages: 30
Keywords (Chinese): 線性分類模型、線性支持向量機、多標籤分類、最大熵方法、座標下降法
Keywords (English): linear classification, linear support vector machines, multi-class classification, maximum entropy, coordinate descent
The classification problem appears in many applications such as document classification and web page search. The support vector machine (SVM) is one of the most popular tools for classification tasks. One component of the SVM is the kernel trick: kernels map data into a higher-dimensional space, and this technique underlies non-linear SVMs. For large-scale sparse data, the linear kernel is used instead; such an SVM is called a linear SVM. SVMs also differ in the loss function applied: L1-SVM and L2-SVM use the L1-loss and L2-loss functions, respectively. SVMs can further handle multi-class classification via one-against-one or one-against-all approaches. In this thesis, several models, including logistic regression, L1-SVM, L2-SVM, the Crammer and Singer formulation, and maximum entropy, are compared on multi-class classification tasks.
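
The thesis's experiments are built on LIBLINEAR (Fan et al., 2008, cited in the bibliography below). Purely as an illustration of the kind of comparison the abstract describes, here is a minimal sketch using scikit-learn, whose LinearSVC and liblinear-based LogisticRegression wrap LIBLINEAR. The data set (20 Newsgroups), the regularization parameter C=1.0, and the model list are illustrative assumptions, not the thesis's actual experimental setup.

# Minimal sketch: comparing L2-regularized multi-class linear classifiers
# on sparse text data. Assumes scikit-learn; the data set and C value are
# illustrative, not the thesis's setup.
from sklearn.datasets import fetch_20newsgroups_vectorized
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

# Sparse TF-IDF features: the large-scale sparse setting where the
# linear kernel is preferred over non-linear kernels.
train = fetch_20newsgroups_vectorized(subset="train")
test = fetch_20newsgroups_vectorized(subset="test")

models = {
    # One-against-all L2-regularized logistic regression (liblinear solver).
    "LR (one-vs-rest)": LogisticRegression(solver="liblinear", C=1.0),
    # One-against-all SVM with L1-loss (hinge).
    "L1-SVM (one-vs-rest)": LinearSVC(loss="hinge", C=1.0),
    # One-against-all SVM with L2-loss (squared hinge).
    "L2-SVM (one-vs-rest)": LinearSVC(loss="squared_hinge", C=1.0),
    # Crammer and Singer's single multi-class formulation.
    "Crammer-Singer": LinearSVC(multi_class="crammer_singer", C=1.0),
}

for name, clf in models.items():
    clf.fit(train.data, train.target)
    acc = accuracy_score(test.target, clf.predict(test.data))
    print(f"{name:22s} test accuracy: {acc:.4f}")

Maximum entropy corresponds to multinomial logistic regression; in scikit-learn that variant would be trained with a multinomial-capable solver such as lbfgs, since the liblinear solver only supports one-vs-rest.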

Thesis committee approval form....................................i
Chinese abstract.................................................ii
Abstract........................................................iii
List of tables...................................................iv

CHAPTER

I.   Introduction.................................................1
II.  Models.......................................................4
     2.1 Support Vector Machine...................................4
     2.2 Crammer and Singer.......................................6
     2.3 Maximum Entropy (ME).....................................7
III. Methods......................................................9
     3.1 Trust Region Newton Method (TRON)........................9
         3.1.1 Logistic Regression (LR)..........................10
         3.1.2 L2-loss Support Vector Machine (L2-SVM)...........11
     3.2 Coordinate Descent......................................11
         3.2.1 Support Vector Machine Dual with L1-loss and
               L2-loss...........................................12
         3.2.2 Crammer and Singer................................13
         3.2.3 Maximum Entropy (ME)..............................14
IV.  Features of Different Schemes...............................17
V.   Experiment..................................................20
     5.1 Data Sets...............................................20
     5.2 Setting.................................................21
     5.3 Comparison..............................................22
VI.  Discussion and Conclusion...................................27
BIBLIOGRAPHY.....................................................29


A. L. Berger, V. J. Della Pietra, and S. A. Della Pietra. A maximum entropy approach to natural language processing. Computational Linguistics, 22(1):39–71, 1996.
B. E. Boser, I. Guyon, and V. Vapnik. A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pages 144–152. ACM Press, 1992.
L. Bottou, C. Cortes, J. Denker, H. Drucker, I. Guyon, L. Jackel, Y. LeCun, U. Muller, E. Sackinger, P. Simard, and V. Vapnik. Comparison of classifier methods: a case study in handwriting digit recognition. In International Conference on Pattern Recognition, pages 77–87. IEEE Computer Society Press, 1994.
K.-W. Chang, C.-J. Hsieh, and C.-J. Lin. Coordinate descent method for large-scale L2-loss linear SVM. Journal of Machine Learning Research, 9:1369–1398, 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/cdl2.pdf.
C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20:273–297, 1995.
K. Crammer and Y. Singer. On the learnability and design of output codes for multiclass problems. In Computational Learning Theory, pages 35–46, 2000.
R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9:1871–1874, 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/liblinear.pdf.
C.-J. Hsieh, K.-W. Chang, C.-J. Lin, S. S. Keerthi, and S. Sundararajan. A dual coordinate descent method for large-scale linear SVM. In Proceedings of the Twenty Fifth International Conference on Machine Learning (ICML), 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/cddual.pdf.
C.-W. Hsu and C.-J. Lin. A comparison of methods for multi-class support vector machines. IEEE Transactions on Neural Networks, 13(2):415–425, 2002.
F.-L. Huang, C.-J. Hsieh, K.-W. Chang, and C.-J. Lin. Iterative scaling and coordinate descent methods for maximum entropy. In Proceedings of the 47th Annual Meeting of the Association of Computational Linguistics (ACL), 2009. Short paper.
T. Joachims. Training linear SVMs in linear time. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2006.
D. Jurafsky and J. H. Martin. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition. Prentice Hall, second edition, 2008.
S. S. Keerthi, S. Sundararajan, K.-W. Chang, C.-J. Hsieh, and C.-J. Lin. A sequential dual method for large scale multi-class linear SVMs. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/sdm_kdd.pdf.
S. Knerr, L. Personnaz, and G. Dreyfus. Single-layer learning revisited: a stepwise procedure for building and training a neural network. In J. Fogelman, editor, Neurocomputing: Algorithms, Architectures and Applications. Springer-Verlag, 1990.
C.-J. Lin, R. C. Weng, and S. S. Keerthi. Trust region Newton method for large-scale logistic regression. Journal of Machine Learning Research, 9:627–650, 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/logistic.pdf.
E. Mayoraz and E. Alpaydin. Support vector machines for multi-class classification. In IWANN (2), pages 833–842, 1999. URL http://citeseer.nj.nec.com/mayoraz98support.html.
R. Memisevic. Dual optimization of conditional probability models. Technical report, Department of Computer Science, University of Toronto, 2006.
R. Rifkin and A. Klautau. In defense of one-vs-all classification. Journal of Machine Learning Research, 5:101–141, 2004. ISSN 1533-7928.
S. Shalev-Shwartz, Y. Singer, and N. Srebro. Pegasos: primal estimated sub-gradient solver for SVM. In Proceedings of the Twenty Fourth International Conference on Machine Learning (ICML), 2007.
J. Weston and C. Watkins. Multi-class support vector machines. Technical Report CSD-TR-98-04, Royal Holloway, 1998.
H.-F. Yu, F.-L. Huang, and C.-J. Lin. Dual coordinate descent methods for logistic regression and maximum entropy models. Technical report, Department of Computer Science, National Taiwan University, March 2010. URL http://www.csie.ntu.edu.tw/~cjlin/papers/maxent_dual.pdf.
