|
A. L. Berger, V. J. Della Pietra, and S. A. Della Pietra. A maximum entropy approach to natural language processing. Computational Linguistics, 22(1):39–71, 1996.

B. E. Boser, I. Guyon, and V. Vapnik. A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pages 144–152. ACM Press, 1992.

L. Bottou, C. Cortes, J. Denker, H. Drucker, I. Guyon, L. Jackel, Y. LeCun, U. Muller, E. Sackinger, P. Simard, and V. Vapnik. Comparison of classifier methods: a case study in handwriting digit recognition. In International Conference on Pattern Recognition, pages 77–87. IEEE Computer Society Press, 1994.

K.-W. Chang, C.-J. Hsieh, and C.-J. Lin. Coordinate descent method for large-scale L2-loss linear SVM. Journal of Machine Learning Research, 9:1369–1398, 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/cdl2.pdf.

C. Cortes and V. Vapnik. Support-vector network. Machine Learning, 20:273–297, 1995.

K. Crammer and Y. Singer. On the learnability and design of output codes for multiclass problems. In Computational Learning Theory, pages 35–46, 2000.

R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9:1871–1874, 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/liblinear.pdf.

C.-J. Hsieh, K.-W. Chang, C.-J. Lin, S. S. Keerthi, and S. Sundararajan. A dual coordinate descent method for large-scale linear SVM. In Proceedings of the Twenty Fifth International Conference on Machine Learning (ICML), 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/cddual.pdf.

C.-W. Hsu and C.-J. Lin. A comparison of methods for multi-class support vector machines. IEEE Transactions on Neural Networks, 13(2):415–425, 2002.

F.-L. Huang, C.-J. Hsieh, K.-W. Chang, and C.-J. Lin. Iterative scaling and coordinate descent methods for maximum entropy. In Proceedings of the 47th Annual Meeting of the Association of Computational Linguistics (ACL), 2009. Short paper.

T. Joachims. Training linear SVMs in linear time. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2006.

D. Jurafsky and J. H. Martin. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition. Prentice Hall, second edition, 2008.

S. S. Keerthi, S. Sundararajan, K.-W. Chang, C.-J. Hsieh, and C.-J. Lin. A sequential dual method for large scale multi-class linear SVMs. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/sdm_kdd.pdf.

S. Knerr, L. Personnaz, and G. Dreyfus. Single-layer learning revisited: a stepwise procedure for building and training a neural network. In J. Fogelman, editor, Neurocomputing: Algorithms, Architectures and Applications. Springer-Verlag, 1990.

C.-J. Lin, R. C. Weng, and S. S. Keerthi. Trust region Newton method for large-scale logistic regression. Journal of Machine Learning Research, 9:627–650, 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/logistic.pdf.

E. Mayoraz and E. Alpaydin. Support vector machines for multi-class classification. In IWANN (2), pages 833–842, 1999. URL http://citeseer.nj.nec.com/mayoraz98support.html.

R. Memisevic. Dual optimization of conditional probability models. Technical report, Department of Computer Science, University of Toronto, 2006.

R. Rifkin and A. Klautau. In defense of one-vs-all classification. Journal of Machine Learning Research, 5:101–141, 2004. ISSN 1533-7928.

S. Shalev-Shwartz, Y. Singer, and N. Srebro. Pegasos: primal estimated sub-gradient solver for SVM. In Proceedings of the Twenty Fourth International Conference on Machine Learning (ICML), 2007.

J. Weston and C. Watkins. Multi-class support vector machines. Technical Report CSD-TR-98-04, Royal Holloway, 1998.

H.-F. Yu, F.-L. Huang, and C.-J. Lin. Dual coordinate descent methods for logistic regression and maximum entropy models. Technical report, Department of Computer Science, National Taiwan University, March 2010. URL http://www.csie.ntu.edu.tw/~cjlin/papers/maxent_dual.pdf.
|