(44.192.10.166) 您好!臺灣時間:2021/03/06 04:29
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果

詳目顯示:::

我願授權國圖
: 
twitterline
研究生:李志賢
研究生(外文):Chih-Hsien Lee
論文名稱:不需封閉檢查的有效探勘高頻封閉式項目集的方法
論文名稱(外文):An Efficient Approach for Mining Frequent Closed Itemsets without Closure Checking
指導教授:楊東麟楊東麟引用關係
指導教授(外文):Don-Lin Yang
學位類別:碩士
校院名稱:逢甲大學
系所名稱:資訊工程所
學門:工程學門
學類:電資工程學類
論文種類:學術論文
論文出版年:2007
畢業學年度:95
語文別:英文
論文頁數:49
中文關鍵詞:關聯法則資料探勘封閉式項目集封閉檢查
外文關鍵詞:closed itemsetclosure checkingassociation rulesData mining
相關次數:
  • 被引用被引用:0
  • 點閱點閱:91
  • 評分評分:系統版面圖檔系統版面圖檔系統版面圖檔系統版面圖檔系統版面圖檔
  • 下載下載:0
  • 收藏至我的研究室書目清單書目收藏:0
許多已知的高頻封閉式項目集的演算法需要使用子集合測試技術來檢查一個新產生的項目集是否為高頻封閉式項目集,子集合的檢查需要一個儲存的架構存放已知的高頻封閉式項目集或是候選封閉項目集,所以封閉檢查是很花費執行時間而且佔空間的,因此我們提出一個有有效的演算法,稱為不需封閉檢查的封閉式項目集探勘演算法,這個方法記錄FP-tree中的資訊,根據這些資訊可以知道哪些項目一定會組合出不是封閉式項目集,避免使用這個項目來產生項目集,就可以直接產生高頻封閉式項目集,這是一個不需要在項目集產生時檢查是否為封閉的演算法,而且可以直接輸出高頻封閉式項目集,所以我們的方法可以節省檢查的時間又可以節省記憶體空間。此外我們的方法也適用於平行的封閉式項目集探勘,在平行的環境底下,資料探勘的速度將會更加快速。
Most existing algorithms for mining frequent closed itemsets have to check whether a new generated itemset is a frequent closed itemset by using the subset checking technique. If we want to do the subset checking, a storing structure is necessary to be built in order to keep all known frequent itemsets or candidates. It will waste processing time and memory space for closure checking. For this reason, we propose an efficient approach called closed itemset mining without closure-checking (CIMC) algorithm. This method records some information in FP-tree. Then we know which items will constitute unclosed itemsets according to that information. By avoiding using these items to produce the itemsets, it can generate frequent closed itemsets directly. It is not necessary to check whether an itemset is closed or not when it is generated. And we can output frequent closed itemsets directly. Our approach can save checking time and memory space. Moreover, this approach is also suitable for parallel mining of frequent closed itemsets. The process time of data mining will decrease further in the parallel environment.
誌謝 i
中文摘要 ii
Abstract iii
Table of Contents iv
List of Figures v
List of Tables vi
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Problem statement 1
1.3 Research goal 2
1.4 Thesis Organization 3
Chapter 2 Background 4
2.1 Data Mining 4
2.2 Association Rule Mining 5
2.3 Itemset Types 7
2.4 Data Representation 8
Chapter 3 Related Works 9
3.1 FP-growth Algorithm 9
3.2 CLOSET+ Algorithm 10
3.2.1 Algorithm of CLOSET+ 10
3.2.2 An Example for CLOSET+ 12
3.3 FPclose 17
Chapter 4 Proposed Method 19
4.1 Overview 19
4.2 CIMC Algorithm 20
4.2.1 Algorithm of CIMC 20
4.2.2 Main Concept 22
4.2.3 Our Method 22
4.2.4 Parallel Method 26
4.3 A Simple Example 27
Chapter 5 Experimental Results 33
5.1 Environment and Data 33
5.2 Experiment Result and Discussion 34
Chapter 6 Conclusions and Future Work 38
References 39
[1]R. Agrawal, and R. Srikant, "Fast Algorithms for Mining Association Rules," Proceedings of the 20th International Conference on Very Large Data Bases, (VLDB), pp. 487-499, 1994.
[2]J. Han, J. Pei, and Y. Yin, "Mining frequent patterns without candidate generation," Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 1-12, 2000.
[3]D. Lin and Z. M. Kedem, "Pincer-search: an efficient algorithm for discovering the maximum frequent set," IEEE Transactions on Knowledge and Data Engineering, Vol. 14, No. 3, pp. 553-566, 2002.
[4]G. Grahne, and J. Zhu, "Fast Algorithms for Frequent Itemset Mining Using FP-Trees," IEEE Transactions on Knowledge and Data Engineering, Vol. 17, No. 10, pp. 1347-1362, 2005.
[5]J. Pei, J. Han, and R. Mao, "Closet: An Efficient Algorithm for Mining Frequent Closed Itemsets," ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, 2000.
[6]J. Wang, J. Han, and J. Pei, "CLOSET+: Searching for the Best Strategies for Mining Frequent Closed Itemsets," Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2003, Washington, D.C.
[7]Mohanmmed J. Zaki, and C.-J. Hsiao, "Efficient Algorithms for Mining Closed Itemsets and Their Lattice Structure," IEEE Transactions on Knowledge and Data Engineering, Vol. 17, No. 4, pp. 462-478, 2005.
[8]C. Lucchese, S. Orlando, and R. Perego, "Fast and Memory Efficient Mining of Frequent Closed Itemsets," IEEE Transactions on Knowledge and Data Engineering, Vol.18, No.1, 2006.
[9]C. Liu, H. Lu, X. Yu, W. Wang, and X. Xiao, "AFOPT: An Efficient Implementation of Pattern Growth Approach," Proc. Of IEEE ICDM''03 Workshop FIMI''03, 2003.
[10]L. Ning, N. Wu, and J. Zhang, “A New Technique for Fast Frequent Closed Itemsets Mining,” IEEE International Conference, Volume 4, Page(s):3640 – 3647, Vol. 4, 2005
[11]D. Burdick, M. Calimlim, J. Flannick, J. Gehrke, and T. Yiu, "MAFIA: A Maximal Frequent Itemset Algorithm," IEEE Transactions on Knowledge and Data Engineering, Vol. 17, No. 11, pp. 1490-1504, 2005.
[12]P.-N. Tan, M. Steinbach and V. Kumar, Introduction to Data Mining. Addison Wesley, 2006.
[13]U. Fayyad, G. PiatetskyShapiro and P. Smyth, "The KDD process for extracting useful knowledge from volumes of data," Communications of the ACM, Vol. 39, No.11, pp. 27-34, 1996.
[14]D.-Y. Chiu, Y.-H. Wu, and A.L.P. Chen , “An Efficient Algorithm for Mining Frequent Sequences by a New Strategy without Support Counting,” Proceedings of IEEE Conference on Data Engineering (ICDE''04), pp. 375-386, 2004.
[15]M. Song, and S. Rajasekaran, "A Transaction Mapping Algorithm for Frequent Itemsets Mining," IEEE Trans. on Knowledge and Data Engineering, Vol. 18, No. 4, pp. 472-481, 2006.
[16]M. Seno, and G.. Karypis, "LPMiner: An Algorithm for Finding Frequent Itemsets Using Length-Decreasing Support Constraint," Proceedings of the 2001 IEEE International Conference on Data Mining ICDM ''01, 2001.
[17]IBM Almaden Research Center, "Synthetic Data Generation Code for Associations and Sequential Patterns," URL:http://www.almaden.ibm.com/ software/quest/, 2006.
QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top
無相關期刊
 
系統版面圖檔 系統版面圖檔