National Digital Library of Theses and Dissertations in Taiwan

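Graduate student: Chun-Hsien Tung (董純賢)
Title: 應用多層次架構之類別優先度與多重分類器改善文件分類準確率
Title (English): Adopting the framework of Multi-level Class Priority with Multiple Classifiers to improve the Accuracy of Text Classification
Advisor: Ding-An Chiang (蔣定安)
Degree: Master's
Institution: Tamkang University (淡江大學)
Department: Department of Computer Science and Information Engineering (In-service Master's Program)
Discipline: Engineering
Sub-discipline: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2010
Graduation academic year:
Language: Chinese
Number of pages:
Keywords (translated from the Chinese): Associative Classification, Rule Ranking, Rule Dependency, Multi-level Class Priority
Keywords (English): Associative Classification, Ranking, Rule Dependency, Multi-level Class Priority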

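Abstract (translated from the Chinese):
Associative classification (AC) methods normally rank rules according to fixed criteria. Rules, however, are subject to rule dependency: even when rules have the same confidence, support, and length, the order in which they are executed still affects the classification result.
This thesis focuses on the rule ranking problem. It adopts the Lazy method as the general ranking principle and classifies documents using the rules at the 100% confidence level. It then removes the documents that have already been classified, recalculates the confidence ranking, and applies the concept of multi-level class priority in order to examine the effect on classification performance. The lowest per-class accuracy obtained from an initial classification with TFIDF weights and a Naïve Bayes classifier is set as a single static threshold, and documents that AC cannot classify are classified by the Naïve Bayes classifier. This addresses the problem that the default class of an associative classifier lowers classification accuracy.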
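The rule dependency problem described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the `Rule` class, the `rank` and `classify` helpers, and the example rules are hypothetical, and the tie-break on rule length (longer rules first) is an assumption about the Lazy ordering that can be switched off with `prefer_longer=False`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    terms: frozenset   # antecedent: terms that must all occur in the document
    label: str         # consequent: the predicted class
    confidence: float
    support: float


def rank(rules, prefer_longer=True):
    """Sort by confidence (desc), support (desc), then rule length.

    Python's sort is stable, so rules that tie on all three criteria
    keep the order in which they were passed in.
    """
    sign = -1 if prefer_longer else 1
    return sorted(
        rules,
        key=lambda r: (-r.confidence, -r.support, sign * len(r.terms)),
    )


def classify(doc_terms, ranked_rules):
    """Return the label of the first rule whose antecedent matches."""
    for rule in ranked_rules:
        if rule.terms <= doc_terms:
            return rule.label
    return None  # no rule fired


# Two hypothetical rules with identical confidence, support, and length.
r1 = Rule(frozenset({"oil", "price"}), "crude", 1.0, 0.02)
r2 = Rule(frozenset({"price", "trade"}), "trade", 1.0, 0.02)
doc = {"oil", "price", "trade", "export"}

label_a = classify(doc, rank([r1, r2]))
label_b = classify(doc, rank([r2, r1]))
print(label_a, label_b)
```

Both rules match the document and tie on every ranking criterion, so the predicted class depends only on which rule happens to come first. This is the situation the thesis addresses by adding class priority as a further ordering criterion.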

Abstract (author's English):
Associative classification (AC) [1][2] normally ranks rules according to prescribed criteria. Because of rule dependency, however, the order in which rules are executed can still affect the classification result even when the rules have identical confidence, support, and length.
This thesis focuses on the rule ranking problem. It adopts the Lazy [3] method as the general ranking principle and classifies documents using rules at the 100% confidence level. It then removes the documents already classified, recalculates the confidence ranking, and applies a multi-level class priority concept in order to examine the effect on classification performance. The lowest per-class accuracy obtained from a preliminary classification using TFIDF [4] weighting and the Naïve Bayes [5] classifier is used as a single static threshold, and documents that the associative classification method cannot classify are classified by the Naïve Bayes classifier. This addresses the loss of classification accuracy caused by the default class in associative classifiers.
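The multiple-classifier idea in the abstract can be sketched as follows. The abstract does not say exactly how the static threshold is applied, so this sketch assumes it acts as a minimum rule confidence. The function names, the accuracy figures, the rules, and the `toy_naive_bayes` stand-in are all hypothetical.

```python
def static_threshold(per_class_accuracy):
    """Single static threshold: the lowest per-class accuracy
    observed in the first-pass classification."""
    return min(per_class_accuracy.values())


def classify_with_fallback(doc_terms, ranked_rules, threshold, fallback):
    """Try the ranked rules first; hand the document to the fallback
    classifier when no sufficiently confident rule matches.

    ranked_rules: iterable of (terms, label, confidence), already ranked.
    Returns (label, source), where source is "AC" or "fallback".
    """
    for terms, label, confidence in ranked_rules:
        # Assumption: rules below the threshold are not trusted.
        if confidence >= threshold and terms <= doc_terms:
            return label, "AC"
    # No default class: a second classifier decides instead.
    return fallback(doc_terms), "fallback"


# Made-up first-pass accuracies, for illustration only.
nb_accuracy = {"earn": 0.95, "acq": 0.90, "crude": 0.80}
threshold = static_threshold(nb_accuracy)

rules = [
    (frozenset({"dividend"}), "earn", 1.00),
    (frozenset({"merger"}), "acq", 0.85),
    (frozenset({"barrel"}), "crude", 0.60),  # below the threshold
]


def toy_naive_bayes(doc_terms):
    """Stand-in for a trained Naive Bayes classifier (hypothetical)."""
    return "crude"


result_ac = classify_with_fallback({"merger", "stake"}, rules, threshold, toy_naive_bayes)
result_fb = classify_with_fallback({"barrel", "opec"}, rules, threshold, toy_naive_bayes)
print(result_ac, result_fb)
```

The second document matches only a rule whose confidence is below the threshold, so it goes to the fallback classifier and is not assigned a default class.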

Table of Contents
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Preface
1.2 Research Motivation and Objectives
1.3 Thesis Organization
Chapter 2 Related Work
2.1 Associative Classification
2.1.1 Pre-processing
2.1.2 Rule Generation
2.1.3 Ranking
2.1.4 Pruning
2.1.5 Association Rule Classifier
2.1.6 Multiple Classifiers
2.2 TFIDF (Term Frequency Inverse Document Frequency)
2.3 Naïve Bayes
2.4 Evaluation Measures
Chapter 3 Research Method
3.1 Problem Discussion
3.2 Threshold Setting and Multiple Classifiers
3.3 Classification Procedure
Chapter 4 Experimental Results
4.1 Data Source
4.2 Experimental Results
4.3 Analysis of Experimental Results
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References
Appendix 1 English Paper

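List of Figures
Figure 2-1 Classification flow of an associative classifier
Figure 2-2 CBA ranking method
Figure 2-3 Lazy ranking method
Figure 2-4 Database coverage algorithm
Figure 2-5 Lazy algorithm
Figure 3-1 Multi-level class priority flowchart
Figure 3-2 Test classification flowchart
Figure 4-1 Example of a Reuters document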

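List of Tables
Table 2-1 Experimental results of a multiple classifier combining AC with KNN
Table 2-2 Distribution of document counts
Table 3-1 Initial classification results of the Naïve Bayes classifier
Table 4-1 Number of documents per class in Reuters 21578
Table 4-2 Number of training and test documents in Reuters 21578
Table 4-3 Classification results of Lazy on Reuters 21578
Table 4-4 Classification results of the Naïve Bayes classifier on Reuters 21578
Table 4-5 Classification results on Reuters 21578 using multiple classifiers and a single static threshold
Table 4-6 Best experimental results on Reuters 21578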

References
[1] F. Thabtah, “A review of associative classification mining,” Knowl. Eng. Rev., vol. 22, 2007, pp. 37-65.
[2] Hsin Yuan Chiou, “Improving the performance of Associative Classification by using the Multi-level Class Priority of Rule Ranking,” Master's thesis, Tamkang University, Jun. 2010, pp. 1-52.
[3] Mao-Sheng Hung, “Improve Document Classify Accuracy by Rule – Static threshold and Dynamic threshold Research,” Master's thesis, Tamkang University, Jun. 2009, pp. 1-49.
[4] T.M. Mitchell, Machine Learning, McGraw-Hill Science/Engineering/Math, 1997.
[5] Y.M. Chen, “Using Association Rule to Improve The Accuracy of Text Categorization - The Combination with other Classifiers,” Master's thesis, Tamkang University, Jun. 2009, pp. 1-57.
[6] G. Salton and C. Buckley, Term Weighting Approaches in Automatic Text Retrieval, Cornell University, 1987.
[7] B. Liu, W. Hsu, and Y. Ma, “Integrating Classification and Association Rule Mining,” Knowledge Discovery and Data Mining, 1998, pp. 80-86.
[8] U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, eds., Advances in Knowledge Discovery and Data Mining, American Association for Artificial Intelligence, 1996.
[9] K. Wang, S. Zhou, and Y. He, “Growing decision trees on support-less association rules,” Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, Boston, Massachusetts, United States: ACM, 2000, pp. 265-269.
[10] K. Wang, Y. He, and D.W. Cheung, “Mining confident rules without support requirement,” Proceedings of the tenth international conference on Information and knowledge management, Atlanta, Georgia, USA: ACM, 2001, pp. 89-96.
[11] E. Baralis and P. Garza, “A Lazy Approach to Pruning Classification Rules,” Dec. 2002.
[12] W. Li, J. Han, and J. Pei, “CMAR: accurate and efficient classification based on multiple class-association rules,” Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on, 2001, pp. 369-376.
[13] Yongwook Yoon, Gary G. Lee, Tseng, “Text Categorization Based on Boosting Association Rules,” Semantic Computing 2008 IEEE International Conference on, 2008, pp. 136-143.
[14] M.F. Porter, “An algorithm for suffix stripping,” Readings in information retrieval, Morgan Kaufmann Publishers Inc., 1997, pp. 313-316.
[15] Jing Chen, Zhigang Zhang, Qing Li, and Xiaoming Li, “A Pattern-Based Voting Approach for Concept Discovery on the Web,” Web Technologies Research and Development - APWeb 2005, vol. 3399, 2005.
[16] http://rocling.iis.sinica.edu.tw/CKIP/
[17] D.A. Karras, “An Improved Text Categorization Methodology Based on Second and Third Order Probabilistic Feature Extraction and Neural Network Classifiers,” Lecture Notes in Computer Science, 2006, pp. 9-20.
[18] J.R. Quinlan and R.M. Cameron-Jones, “FOIL: A Midterm Report,” in Proceedings of the European Conference on Machine Learning, vol. 667, 1993, pp. 3-20.
[19] E. Baralis, S. Chiusano, and P. Garza, “On support thresholds in associative classification,” Proceedings of the 2004 ACM symposium on Applied computing, Nicosia, Cyprus: ACM, 2004, pp. 553-558.
[20] R. Agrawal and R. Srikant, “Fast Algorithms for Mining Association Rules,” Proc. 20th Int. Conf. Very Large Data Bases, VLDB, J.B. Bocca, M. Jarke, and C. Zaniolo, eds., Morgan Kaufmann, 1994, pp. 487-499.
[21] P. Soucy and G. Mineau, “A simple KNN algorithm for text categorization,” Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on, 2001, pp. 647-648.
[22] Y. Yang and X. Liu, “A re-examination of text categorization methods,” Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, Berkeley, California, United States: ACM, 1999, pp. 42-49.
[23] T. Joachims, “A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization,” Proceedings of the Fourteenth International Conference on Machine Learning, Morgan Kaufmann Publishers Inc., 1997, pp. 143-151.
[24] P. Bickel and E. Levina, “Some theory for Fisher's linear discriminant function, 'naive Bayes', and some alternatives when there are many more variables than observations,” Bernoulli, vol. 10, 2004, pp. 989-1010.
[25] Yuen-Hsien Tseng, “Effectiveness Issues in Automatic Text Categorization,” Bulletin of the Library Association of China, vol. 68, Jun. 2002, pp. 62-83.
[26] Cho-Ming Lee, “Classifying Chinese Text Documents by Association Rule,” Master's thesis, Tamkang University, Jun. 2006, pp. 1-66.

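Related theses
1.	Using Association Rules to Improve Document Classification Accuracy: A Study of the Class Priority Problem
2.	Combining Associative Classification Algorithms with Rule Priority to Improve Classification Accuracy
3.	The Effect of Multi-level Rule Priority Ranking on Associative Classification Performance
4.	Improving Associative Classification Performance through Rule Ranking with Multi-level Class Priority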
