National Digital Library of Theses and Dissertations in Taiwan

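Graduate student: 郭斯彥
Graduate student (English): Sy-Yen Kuo
Thesis title: 基於學習分類結果的多階層樹建構法
Thesis title (English): Hierarchical Tree Construction based on Learning from Classification Results
Advisor: 李新林
Advisor (English): Sing-ling Lee
Degree: Master's
Institution: National Chung Cheng University
Department: Graduate Institute of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2009
Academic year of graduation: 97 (ROC calendar)
Language: English
Pages: 46
Keywords (Chinese): 建構, 學習, 多階層, 分類
Keywords (English): Classification, Learning, Hierarchical, Construction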
Record statistics:
  • Cited by: 0
  • Views: 336
  • Downloads: 2
  • Saved to bibliography lists: 0
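Abstract (translated from the Chinese): For data classification, a hierarchy is an intuitive and effective structure. We design a system that automatically classifies each text document into a single category based on hierarchical models. We propose a method that learns from classification results to construct the hierarchical taxonomy automatically. Classes that are easily confused with one another are merged, so that documents that would otherwise be misclassified are first routed to the correct branch at the upper levels of the hierarchy, and the differences between these classes are then clearly distinguished at the lower levels. Experimental results on real-world data show that adjusting the hierarchical taxonomy with our method improves overall accuracy by 9% in the best case.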
Abstract (English): Hierarchies are an intuitive and efficient framework for data classification. We design a system that automatically classifies text documents each into a single category based on hierarchical models. We propose an automatic hierarchical taxonomy construction method based on learning from classification results. Classes that tend to be misclassified as each other are merged, so that the misclassified documents are routed to the right branch at a high level of the hierarchy and the differences between these classes can be distinguished clearly at a low level of the hierarchy. Experimental results on real-world data sets show that the best total accuracy improves by about 9% when the hierarchical taxonomy is adjusted with our proposed method.
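The merging idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: the record does not give the actual merge criterion, so the scoring rule below (mutual errors divided by the combined size of the two classes), the function names, and the dictionary representation of the hierarchy are all assumptions made for illustration. The sketch takes the true and predicted labels from any flat classifier, finds the pair of classes most often confused with each other, and groups that pair under a new internal node.

```python
from collections import Counter
from itertools import combinations


def most_confused_pair(true_labels, predicted_labels):
    """Return the pair of classes with the highest mutual misclassification score.

    The score for a pair (a, b) is the number of a->b and b->a errors divided
    by the number of documents whose true class is a or b. This scoring rule
    is an illustrative assumption, not the measure used in the thesis.
    """
    counts = Counter(zip(true_labels, predicted_labels))
    sizes = Counter(true_labels)
    best_pair, best_score = None, 0.0
    for a, b in combinations(sorted(sizes), 2):
        errors = counts[(a, b)] + counts[(b, a)]
        score = errors / (sizes[a] + sizes[b])
        if score > best_score:
            best_pair, best_score = (a, b), score
    return best_pair, best_score


def merge_once(hierarchy, true_labels, predicted_labels):
    """Group the most confused pair of sibling classes under a new node.

    `hierarchy` maps a parent node name to a list of child names
    (a hypothetical representation). A new dict is returned and the
    input is left unchanged.
    """
    new_hierarchy = {parent: list(children) for parent, children in hierarchy.items()}
    pair, _ = most_confused_pair(true_labels, predicted_labels)
    if pair is None:
        return new_hierarchy  # no misclassifications, nothing to merge
    a, b = pair
    merged = f"{a}+{b}"
    for parent, children in hierarchy.items():
        if a in children and b in children:
            # Replace the two siblings with one internal node that holds both.
            new_hierarchy[parent] = [c for c in children if c not in (a, b)] + [merged]
            new_hierarchy[merged] = [a, b]
            break
    return new_hierarchy
```

Calling `merge_once({"root": ["corn", "oil", "wheat"]}, true, pred)` on labels where corn and wheat are confused with each other moves both under a new `corn+wheat` node. A classifier at the root then only has to separate `oil` from `corn+wheat`, and a second classifier below separates corn from wheat, which matches the two-level behavior the abstract describes.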
1 Introduction
2 Related Works
2.1 Vector Space Model (VSM)
2.2 Support Vector Machine (SVM)
2.3 Hierarchical Classification
2.4 Hierarchical Taxonomy Construction
2.4.1 Clustering Approach
2.4.2 Automatic Construction
3 Hierarchy Construction based on Learning from Classification Results
3.1 SVM Classifiers Training
3.2 Hierarchy Adjusting
3.2.1 Biases between Classes
3.2.2 Feature Distribution
3.2.3 Training Data Size
3.2.4 Algorithm
4 Experiments
4.1 Performance Testing
4.2 Data Set
4.3 Experimental Results
5 Conclusions