National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)


Detailed Record

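Author: 林卓彥 (Cuo-Yen Lin)
Thesis title: Comparison of Automatic Classification Methods (自動分類方法之比較)
Advisor: 吳昇 (Sun Wu)
Degree: Master's
Institution: National Chung Cheng University (國立中正大學)
Department: Graduate Institute of Computer Science and Information Engineering (資訊工程所)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2006
Academic year of graduation: 94 (ROC calendar)
Language: Chinese
Number of pages: 36
Keywords (Chinese): 自動分類 (automatic classification)
Keywords (English): KNN; SVM; Classification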
With the rapid growth of the Internet and the increasing availability of broadband access, the information circulating online has become ever more diverse. Most of it, however, is not automatically classified, so documents often have to be sorted and selected by hand. This thesis therefore implements current mainstream classification methods and examines their performance and results.

We use two automatic classification methods, KNN and SVM. We first collect a large number of news web pages for analysis and, with the Vector Space Model as the main framework, parse the pages to generate input files. These input files are then passed to KNN and SVM for testing, and precision is used to discuss the strengths, weaknesses, and differences of the two methods.
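
The pipeline described in the abstract (documents mapped into a Vector Space Model, classified with KNN, and scored by precision) can be illustrated with a minimal sketch. It is not the thesis's implementation: the TF-IDF weighting, cosine similarity, whitespace tokenizer, the value k = 3, and the toy English headlines are all assumptions made for illustration, and the SVM side of the comparison is omitted because it would normally rely on an external library.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: whitespace tokenization; real news pages (especially
    # Chinese text) would need proper parsing and word segmentation.
    return text.lower().split()

def build_idf(token_lists):
    # Smoothed inverse document frequency over the training documents.
    n = len(token_lists)
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def vectorize(tokens, idf):
    # Sparse TF-IDF vector; terms unseen in training are dropped.
    tf = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_predict(query, train_vecs, train_labels, k=3):
    # Rank training documents by similarity and take a majority vote
    # among the k most similar ones.
    ranked = sorted(
        ((cosine(query, v), y) for v, y in zip(train_vecs, train_labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(y for _, y in ranked[:k])
    return votes.most_common(1)[0][0]

def precision(predicted, actual, label):
    # Fraction of documents predicted as `label` that truly are `label`.
    true_pos = sum(1 for p, a in zip(predicted, actual) if p == label and a == label)
    pred_pos = sum(1 for p in predicted if p == label)
    return true_pos / pred_pos if pred_pos else 0.0

# Hypothetical toy corpus standing in for the collected news pages.
train = [
    ("stocks market shares fall", "finance"),
    ("bank raises interest rates", "finance"),
    ("central bank market outlook", "finance"),
    ("team wins championship game", "sports"),
    ("player scores in final game", "sports"),
    ("coach praises team defense", "sports"),
]
test = [
    ("market shares rise as bank cuts rates", "finance"),
    ("team scores late to win game", "sports"),
]

train_tokens = [tokenize(text) for text, _ in train]
train_labels = [label for _, label in train]
idf = build_idf(train_tokens)
train_vecs = [vectorize(tokens, idf) for tokens in train_tokens]

actual = [label for _, label in test]
predictions = [
    knn_predict(vectorize(tokenize(text), idf), train_vecs, train_labels, k=3)
    for text, _ in test
]
for label in sorted(set(train_labels)):
    print(label, precision(predictions, actual, label))
```

A weighted KNN variant, as listed in the thesis outline, would replace the plain majority vote with votes weighted by similarity; the sketch keeps the unweighted vote for brevity.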
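Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures and Tables
1 Introduction
1.1 Overview
1.2 Research Motivation
1.3 Thesis Organization
2 Literature Review
2.1 Automatic Document Classification
2.2 Feature Selection
2.3 Classification Rules
2.4 Choice of Classification Methods
3 System Overview
3.1 System Architecture
3.2 Document Pre-Processing
4 Classification Methods
4.1 Feature Selection
4.2 K-Nearest Neighbor
4.3 Weighted K-Nearest Neighbors
4.4 Support Vector Machine
5 Experimental Results and Analysis
5.1 Experimental Data
5.2 Experimental Results for KNN and WKNN
5.3 Experimental Results for SVM
5.4 Analysis of the Experiments
6 Conclusion
6.1 Future Work
6.1.1 Other Languages
6.1.2 Overlapping Classification (重覆分類)
6.1.3 Hierarchical Classification
6.1.4 Over-Fitting Problem
6.2 Conclusion
7 References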