6.1 Chinese References
[1]林延璉,“使用句構分析模型與向量支持機的自動文件分類架構”, Master's thesis, 國立高雄第一科技大學資訊管理研究所, 2011.
[2]曾元顯,“文件主題自動分類成效因素探討”, 中國圖書館學會會報, no. 68, 2002, pp. 62-83.
[3]蔡純純,“中文新聞文件空間資訊擷取之研究─以火災、搶劫、車禍事件為例”, Master's thesis, 台灣大學地理環境資源學系, 2002.
[4]鄭為倫, 王台平,“運用特徵詞權重改善文件自動分類之成效-以貝氏分類器為例”, 第一屆資訊管理學術暨專案管理實務研討會, paper no. IMPM-E18, 2005.
6.2 English References
[5]Al-Kofahi, K., Tyrrell, A., Travers, A. V. T., and Jackson, P.,“Combining Multiple Classifiers for Text Categorization”, Proceedings of the Tenth International Conference on Information and Knowledge Management, 2001, pp. 97-104.
[6]Azzopardi, L., Porter Stemming with C#.Net, Retrieved June 18, 2012, from University of Paisley, Scotland Web Site: http://tartarus.org/~martin/PorterStemmer/
[7]Bekkerman, R., El-Yaniv, R., Winter, Y., and Tishby, N.,“On Feature Distributional Clustering for Text Categorization”, Proceedings of the Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, 2001, pp. 146-153.
[8]Bergsma, S., Pitler, E., and Lin, D.,“Creating Robust Supervised Classifiers via Web-Scale N-gram Data”, Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL '10), 2010, pp. 865-874.
[9]Bergsma, S., Lin, D., and Goebel, R.,“Web-Scale N-gram Models for Lexical Disambiguation”, Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI '09), 2009, pp. 1507-1512.
[10]Brants, T., and Franz, A., Google N-gram Corpus, Retrieved June 18, 2012, from Web Site: http://books.google.com/ngrams/datasets
[11]Chen, C. L., Tseng, S. C., and Liang, T.,“Mining fuzzy frequent itemsets for hierarchical document clustering”, Information Processing and Management, 2010, vol. 46, pp. 193-211.
[12]Church, K. W. and Hanks, P.,“Word association norms, mutual information and lexicography”, Computational Linguistics, 1990, 16(1), pp. 22-29.
[13]Debole, F., and Sebastiani, F., “Supervised term weighting for automated text categorization”, Proceedings of the 2003 ACM symposium on applied computing, 2003, pp. 784-788.
[14]Elisseeff, A. and Weston, J., “A kernel method for multi-labelled classification”, Advances in Neural Information Processing Systems, 2002, pp. 681–687.
[15]Ferreira, A. and Figueiredo, M.,“An unsupervised approach to feature discretization and selection”, Pattern Recognition, 2012, 45(9), pp. 3048-3060.
[16]Ferreira, A. and Figueiredo, M.,“Efficient Unsupervised Feature Selection for Sparse Data”, Proceedings of the IEEE International Conference on Computer as a Tool (EUROCON), 2011.
[17]Fung, B. C. M., Wang, K. and Ester, M.,“Hierarchical document clustering using frequent itemsets”, Proceedings of the 3rd SIAM International Conference on Data Mining, 2003, pp. 59-70.
[18]Guan, H., Zhou, J., and Guo, M.,“A class-feature-centroid classifier for text categorization”, Proceedings of the 18th international conference on World wide web, 2009, pp. 201-210.
[19]Hughes, T., and Ramage, D.,“Lexical semantic relatedness with random graph walks”, Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2007.
[20]Jiang, M., Jensen, E., Beitzel, S., and Argamon, S.,“Choosing the right bigrams for information retrieval”, Proceedings of the 2004 Meeting of the International Federation of Classification Societies, 2004, Chicago, IL.
[21]Joachims, T.,“Text Categorization with Support Vector Machines: Learning with Many Relevant Features”, Proceedings of the European Conference on Machine Learning, 1998, pp. 137-142.
[22]Leopold, E., and Kindermann, J.,“Text categorization with support vector machines – How to represent texts in input space”, Machine Learning, 2002, vol. 46, pp. 423–444.
[23]Liu, Y., Loh, H. T., and Sun, A.,“Imbalanced text classification: A term weighting approach”, Expert Systems with Applications, 2009, vol. 36, pp. 690-701.
[24]Peat, H. J., and Willett, P.,“The limitations of term co-occurrence data for query expansion in document retrieval systems”, Journal of the American Society for Information Science, 1991, 42(5), pp. 379-380.
[25]Salton, G. and Buckley, C., Stop Word List 2, Retrieved June 18, 2012, from Cornell University, Experimental SMART Information Retrieval System Web Site: http://www.lextek.com/manuals/onix/stopwords2.html
[26]Saracoğlu, R., Tutuncu, K. and Allahverdi, N.,“A new approach on search for similar documents with multiple categories using fuzzy clustering”, Expert Systems with Applications, 2008, 34(4), pp. 2545–2554.
[27]Steinbach, M., Karypis, G. and Kumar, V.,“A comparison of document clustering techniques”, Workshop on Text Mining, 2000, pp. 109-111.
[28]Tandon, N., and Melo, G. D.,“Information Extraction from Web-Scale N-Gram Data”, SIGIR 2010 Web N-gram Workshop, 2010.
[29]Wang, K., Thrasher, C., Viegas, E., Li, X., and Hsu, P.,“An Overview of Microsoft Web N-gram Corpus and Applications”, Proceedings of the NAACL HLT 2010: Demonstration Session, 2010, pp. 45-48.
[30]Yang, Y., and Pedersen, J. O.,“A Comparative Study on Feature Selection in Text Categorization”, Proceedings of the International Conference on Machine Learning, 1997, pp. 412-420.
[31]Yu, L. C., Wu, C. H., Philpot, A. and Hovy, E.,“OntoNotes: Sense Pool Verification Using Google N-gram and Statistical Tests”, Linguistic Data Consortium (LDC), 2007.
[32]Zhang, M. L., “ML-RBF: RBF neural networks for multi-label learning,” Neural Processing Letters, 2009, 29(2), pp. 61–74.