National Digital Library of Theses and Dissertations in Taiwan

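Graduate student: Yu-De Chen (陳育德)
Thesis title: A Novel Associative Classification Algorithm: A Combination of LAC and CMAR with a New Measure of the Weighted Effect of Each Rule Group
Thesis title (Chinese): 新的關聯分類演算法-結合LAC與CMAR及改良權重計算方式
Advisor: Pei-Yi Hao (郝沛毅)
Degree: Master's
Institution: National Kaohsiung University of Applied Sciences
Department: Department of Information Management
Discipline: Computing
Field: General Computing
Thesis type: Academic thesis
Year of publication: 2010
Graduation academic year: 98
Language: Chinese
Number of pages: 68
Keywords: Associative Classification; Class-Association Rule; CMAR; LAC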
Associative classification is widely used in data mining and performs well. Conventional association rule generation sets a support threshold to reduce the number of rules that must be evaluated and to filter out likely noise rules. However, the threshold can also discard important rules that have low support, while simply lowering it leads to a very large number of rules to compute, many of them harmful. The choice of support threshold therefore has a strong effect on both classification accuracy and execution efficiency.

For rule selection, Wenmin Li et al. proposed Classification based on Multiple Class-Association Rules (CMAR). In CMAR, the way rule weights are computed is the central issue: if the weight calculation is biased in certain situations, classification accuracy drops.

This study builds on the CMAR algorithm, uses the Lazy Associative Classifier (LAC) algorithm proposed by Adriano Veloso et al. to mine rare rules (small disjuncts), and applies a new rule-group weighting method proposed here to reduce the weighting bias in CMAR. The method is evaluated on 26 UCI datasets. The experimental results show that the proposed algorithm works fairly well and improves classification performance.
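
The lazy strategy mentioned in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the toy data, the function name `lazy_rules`, and the threshold values are all assumptions made for illustration. The sketch follows the general idea of lazy associative classification: the training set is first projected onto the items of the test instance, and support is then counted on that projection, so a rule that is rare in the full dataset can still pass the threshold without lowering it globally.

```python
from collections import Counter
from itertools import combinations


def lazy_rules(train, test_items, min_support=0.3, min_confidence=0.6, max_len=2):
    """Mine class-association rules on demand for one test instance.

    train: list of (set_of_items, class_label) pairs (hypothetical toy format).
    Returns a list of (antecedent, label, support, confidence) tuples.
    """
    test_items = frozenset(test_items)
    # Projection: keep only the items shared with the test instance and
    # drop training rows that share nothing with it.
    projected = [(frozenset(items) & test_items, label) for items, label in train]
    projected = [(items, label) for items, label in projected if items]
    n = len(projected)
    rules = []
    if n == 0:
        return rules
    for size in range(1, max_len + 1):
        for antecedent in combinations(sorted(test_items), size):
            antecedent = frozenset(antecedent)
            # Class counts among the projected rows that contain the antecedent.
            class_counts = Counter(
                label for items, label in projected if antecedent <= items
            )
            covered = sum(class_counts.values())
            for label, count in class_counts.items():
                support = count / n  # relative to the projected data
                confidence = count / covered
                if support >= min_support and confidence >= min_confidence:
                    rules.append((tuple(sorted(antecedent)), label, support, confidence))
    return rules


# Toy data (assumed): the rule {a, b} -> yes covers 1 of 6 rows overall,
# but 1 of 3 rows once the data is projected onto the test instance {a, b}.
train = [
    ({"a", "b"}, "yes"),
    ({"a", "c"}, "yes"),
    ({"b", "c"}, "no"),
    ({"d", "e"}, "no"),
    ({"d", "f"}, "no"),
    ({"e", "f"}, "no"),
]
rules = lazy_rules(train, {"a", "b"})
for antecedent, label, support, confidence in rules:
    print(antecedent, "->", label, round(support, 3), round(confidence, 3))
```

With a global support threshold of 0.3 on all six rows, the rule `{a, b} -> yes` would be discarded (support 1/6). On the projected data it has support 1/3 and is kept. How the resulting rules are weighted and combined into a prediction is the part the thesis changes, and it is not shown here.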
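Chinese Abstract
English Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.2.1 The Problem of Lost Rare Rules
1.2.2 Bias in Weight Calculation
1.3 Research Objectives
Chapter 2 Literature Review
2.1 Research Objectives
2.2 The CBA Algorithm
2.2.1 Rule Generator
2.2.2 Classifier Construction
2.3 The CMAR Algorithm
2.3.1 Rule Generation
2.3.1.1 Generating Rules with FP-Growth
2.3.1.1.1 Constructing the FP-Tree
2.3.1.1.2 Generating Rules from the FP-Tree
2.3.1.2 Storing Rules in a CR-Tree
2.3.1.3 Rule Pruning
2.3.1.3.1 Keeping Only High-Confidence General Rules
2.3.1.3.2 Keeping Only Positively Correlated Rules
2.3.1.3.3 Selecting Rules by Dataset Coverage
2.3.2 Classification Based on Multiple Rules
2.4 The LAC Algorithm
2.4.1 Dataset Filtering
2.4.2 Caching Mechanism
Chapter 3 Research Method
3.1 Research Framework
3.2 Dataset Preprocessing
3.3 Rule Generation Phase
3.4 Classification Phase
3.5 Experimental Environment
Chapter 4 Results
4.1 Experiments with the Rule-Generation Limit Parameters of the CBA Algorithm
4.2 Experiments with the Rule-Generation Limit Parameters of the CMAR Algorithm
4.3 Classification Accuracy Analysis
4.4 Weight Evaluation
Chapter 5 Conclusions and Recommendations
5.1 Conclusions
5.2 Future Research and Recommendations
References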
1. Bing Liu, Wynne Hsu, and Yiming Ma, 1998, “Integrating Classification and Association Rule Mining”, Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD-98), New York, USA, pp. 80-86. (The CBA system can be downloaded from http://www.comp.nus.edu.sg/~dm2).
2. Adriano Veloso, Wagner Meira Jr., and Mohammed J. Zaki, 2006, “Lazy Associative Classification”, Proceedings of the Sixth International Conference on Data Mining, Washington, USA, pp. 645-654.
3. Baralis, E., Chiusano, S., and Garza, P., 2008, “A Lazy Approach to Associative Classification”, IEEE Transactions on Knowledge and Data Engineering, Vol. 20, Issue 2, pp. 156-171, Feb.
4. W. Li, J. Han, and J. Pei, 2001, “CMAR: Accurate and Efficient Classification Based on Multiple Class-Association Rules”, Proc. IEEE Int’l Conf. Data Mining (ICDM ’01), California, USA, pp. 369, Nov.
5. Quinlan, J. R., 1992, C4.5: Programs for Machine Learning, Morgan Kaufmann.
6. Rakesh Agrawal, Tomasz Imieliński, and Arun Swami, 1993, “Mining Association Rules Between Sets of Items in Large Databases”, ACM SIGMOD Conference, New York, USA, pp. 207-216.
7. Rakesh Agrawal and Ramakrishnan Srikant, 1994, “Fast Algorithms for Mining Association Rules in Large Databases”, Proceedings of the 20th International Conference on Very Large Data Bases (VLDB), Santiago, Chile, pp. 487-499, September.
8. Gary M. Weiss and Haym Hirsh, 1998, “The Problem with Noise and Small Disjuncts”, In Proceedings of the Fifteenth International Conference on Machine Learning, CA, USA, pp. 574-578.
9. P. Domingos, 1997, “Context-sensitive feature selection for lazy learners”, Artificial Intelligence Review, vol. 11, pp. 227-253.
10. R. Holte, L. Acker, and B. Porter, 1989, “Concept learning and the problem of small disjuncts”, In Proc. of the Intl. Joint Conf. on Artificial Intelligence, pp. 813-818.
11. H. Ishibuchi, T. Nakashima, and T. Yamamoto, 2001, “Fuzzy association rules for handling continuous attributes”, Proceedings. IEEE International Symposium on, Pusan, South Korea, pp. 118-121.
12. W. Li, 2001, Classification based on multiple association rules, Simon Fraser University, M.Sc. Thesis.
13. J. Han, J. Pei, and Y. Yin, 2000, “Mining frequent patterns without candidate generation”, In SIGMOD'00, Dallas, TX, pp. 1-12, May.
14. U.M. Fayyad and K.B. Irani, 1993, “Multi interval discretization of continuous-valued attributes for classification learning”, In Proceedings of the 13th International Joint Conference on Artificial Intelligence, pp. 1022-1027.
15. C.L. Blake and C.J. Merz, 1998, UCI Repository of machine learning databases, http://www.cs.uci.edu/~mlearn/MLRepository.html.
16. Adriano Veloso et al., 2007, “Multi-label Lazy Associative Classification”, Proceedings of the 11th European Conference on Principles and Practice of Knowledge Discovery in Databases, Warsaw, Poland, pp. 605-612.
17. Elena Baralis, Silvia Chiusano, and Paolo Garza, 2004, “On Support Thresholds in Associative Classification”, Proceedings of the 2004 ACM Symposium on Applied Computing, Nicosia, Cyprus, pp. 553-558.
18. S. Brin, R. Motwani, and C. Silverstein, 1997, “Beyond market baskets: Generalizing association rules to correlations”, In ACM SIGMOD International Conference on Management of Data 1997, Arizona, USA, pp. 265-276, May.