臺灣博碩士論文加值系統

English |FB 專頁 |Mobile

免費會員登入| 註冊

功能切換導覽列

(216.73.216.73) 您好！臺灣時間：2026/07/23 00:55

字體大小：

:::

詳目顯示

第 1 筆 / 共 1 筆

/1頁

論文基本資料
摘要
外文摘要
目次
參考文獻
電子全文
紙本論文
QR Code

本論文永久網址:

研究生:

林琬純

研究生(外文):

Lin Wan-chuen

論文名稱:

利用多維度限制式探勘關聯性規則

論文名稱(外文):

Mining Association Rules with Multi-dimensional Constraints

指導教授:

李瑞庭

指導教授(外文):

Anthony J. T. Lee

學位類別:

碩士

校院名稱:

國立臺灣大學

系所名稱:

資訊管理研究所

學門:

電算機學門

學類:

電算機一般學類

論文種類:

學術論文

論文出版年:

2003

畢業學年度:

語文別:

英文

論文頁數:

中文關鍵詞:

關聯性規則、多維度限制式

外文關鍵詞:

Association rule、Multi-dimensional constraint

相關次數:

被引用:0
點閱:261
評分:
下載:16
書目收藏:2

在資料探勘的領域中，關聯性規則探勘是一個很重要的議題，而影響關聯性規則探勘績效的關鍵在於如何有效率地找出頻繁項目集(Frequent Itemset Discovery)。頻繁項目集的挖掘通常會產生大量的頻繁項目集和關聯性規則，但由於使用者有興趣的通常只是其中的一小部份集合而已，就必須找出所有的頻繁項目集之後再做事後的過濾篩選。這樣不但會降低資料探勘的效能，在效率上也會大大損失。以限制式為基礎的資料探勘(Constraint-based Mining)解決了這個問題，在資料探勘過程中加入使用者所指定的限制式，直接找出使用者所需要的部份，以提高資料探勘的效能和效率。
在過去的研究中，通常只用單一屬性值來描述交易中所記錄的項目(Item)。在現實生活裡，使用者通常會想要同時記錄多於一個的項目屬性值，並對多個屬性值下限制式。有鑑於此，在這篇論文裡，我們記錄項目使之具有多個屬性值，並定義了數個多維度限制式。而且，我們也針對多維度限制式的特性做出討論，提出演算法E-CFG和GE-CFG來利用多維度限制式探勘關聯性規則。由實驗結果得知，我們所提出的方法在所有的實驗中皆優於FP-growth+演算法。

Association rule mining is an important issue in the area of data mining. Frequent itemset discovery is the key factor in performance of association rule mining. Frequent itemset mining algorithms often generate a very large number of frequent itemsets and rules, which reduce not only the efficiency but also the effectiveness of the mining algorithms since only the subset of the complete frequent itemsets and association rules is of interest to users, and users need additional post-processing to filter through a large number of mined rules to find the useful ones. Constraint-based mining enables users to concentrate on mining itemsets that are interesting to themselves, which improves the efficiency of mining tasks.
Previously proposed methods consider that items in transactions are characterized only by single attribute value. In the real world, users may want to keep records of items with respect to more than one attribute and impose constraints on multiple dimensional attributes. In this thesis, we enhance the item representation by associating items with a number of attributes, so-called multi-dimensional items. We have defined and characterized some multi-dimensional constraints. Moreover, we have discussed the properties of those constraints and developed algorithms E-CFG and its generic form, GE-CFG, for mining frequent itemsets with multi-dimensional constraints. The experimental results show that both E-CFG and GE-CFG algorithms outperform the FP-growth+ algorithm for all the cases.

Table of Contents............................................i
List of Figures............................................iii
List of Tables..............................................iv
Chapter 1 Introduction......................................1
Chapter 2 Review of Literature..............................4
2.1 Frequent Itemset Discovery............................4
2.1.1 Apriori Algorithm.................................5
2.1.2 FP-growth Algorithm...............................6
2.2 Constraint-based Frequent Itemset Discovery..........10
2.2.1 Constraint Category Definition...................11
2.2.2 Constrained Apriori (CAP)........................12
2.2.3 Constrained FP-growth (CFG)......................14
2.3 Discussion...........................................18
Chapter 3 Research Methodology.............................20
3.1 Data Structure of Multi-dimensional Items............20
3.2 Basic Definition.....................................20
3.3 Multi-dimensional Constraints........................21
3.4 Problem Definition...................................26
3.5 Algorithm............................................27
3.5.1 Overview.........................................27
3.5.1.1 Phase 1: Frequency Check.....................30
3.5.1.2 Phase 2: Constraint Check....................30
3.5.1.3 Phase 3: Conditional FP-tree Construction....30
3.5.2 Algorithms for Case 1............................31
3.5.2.1 Anti-monotone constraint Cam.................31
3.5.2.2 Monotone constraint Cm.......................37
3.5.3 Algorithms for Case 2............................41
3.5.3.1 Anti-monotone constraint Cam.................41
3.5.3.2 Monotone constraint Cm.......................47
3.5.4 Discussion.......................................51
Chapter 4 Experiments and Performance Evaluation...........52
4.1 Synthetic Data Generation and Experiment Parameters..52
4.2 Experiments of E-CFGA and GE-CFGA....................54
4.2.1 Effect of the Constraint Selectivity.............54
4.2.2 Effect of the Number of Transactions.............57
4.2.3 Effect of the Support Threshold..................58
4.3 Experiments of E-CFGM and GE-CFGM....................59
4.3.1 Effect of the Constraint Selectivity.............60
4.3.2 Effect of the Support Threshold..................61
Chapter 5 Conclusions and Future Work......................64
References..................................................66

[1]G. Grahne, L.V.S. Lakshmanan, and X. Wang, Efficient Mining of Constrained Correlated Sets, In Proc. Int. Conf. Data Engineering (ICDE’00), pages 512-521, Feb. 2000.
[2]J-F. Boulicaut and B. Jeudy, Using Constraint for Set Mining: Should We Prune or not? In Proceedings Bases de Données Avançées (BDA’00), pages 221-237, Oct. 2000.
[3]J. Han, G. Dong, and Y. Yin, Efficient Mining of Partial Periodic Patterns in Time Series Database, In Proc. IEEE Int. Conf. Data Engineering (ICDE’99), pages 106-115, March 1999.
[4]J. Han, J. Pei and Y. Yin, Mining Frequent Patterns without Candidate Generation, In Proc. ACM-SIGMOD Int. Conf. Management of Data (SIGMOD’00), pages 1-12, May 2000.
[5]J. Pei and J. Han, Can We Push More Constraints into Frequent Pattern Mining? In Proc. Int. Conf. Knowledge Discovery and Data Mining (KDD’00), pages 350-354, Aug. 2000.
[6]J. Pei, J. Han, and L.V.S. Lakshmanan, Mining Frequent Itemsets with Convertible Constraints, In Proc. IEEE Int. Conf. Data Engineering (ICDE’01), pages 433-442, Feb. 2001.
[7]R. Agrawal and R. Srikant, Fast Algorithms for Mining Association Rules, In Proc. Int. Conf. Very Large Data Bases (VLDB’94), pages 487-499, Sept. 1994.
[8]R. Agrawal and R. Srikant, Mining Sequential Patterns, In Proc. IEEE Int. Conf. Data Engineering (ICDE’95), pages 3-14, March 1995.
[9]R. Agrawal, T. Imielinski. A. Swami, Mining Association Rules between Sets of Items in Large Databases, In Proc. ACM-SIGMOD Int. Conf. Management of Data (SIGMOD’93), pages 207-216, May 1993.
[10]R. J. Bayardo, Efficiently Mining Long Patterns from Databases, In Proc. ACM-SIGMOD Int. Conf. Management of Data (SIGMOD’98), pages 85-93, June 1998.
[11]R. Srikant, Q. Vu, R. Agrawal, Mining Association Rules with Item Constraints, In Proc. Int. Conf. Knowledge Discovery and Data (KDD’97), pages 67-73, 1997.
[12]Raymond T. Ng, L.V.S. Lakshmanan. J. Han, and A. Pang, Exploratory Mining and Pruning Optimizations of Constrained Association Rules, In Proc. ACM-SIGMOD Int. Conf. Management of Data (SIGMOD’98), pages 13-24, June 1998.

電子全文

國圖紙本論文

推文
網路書籤
推薦
評分
引用網址
轉寄

top

相關論文
相關期刊
熱門點閱論文

無相關論文

無相關期刊

1.	多維度區間資料之跨交易性關聯規則
2.	行動電話通話路徑之樣式探勘
3.	形狀比對：以三角形描繪形狀的輪廓
4.	跨序列關聯性規則資料探勘
5.	在行動環境下對多筆資料查詢之有效率的索引及配置方法
6.	利用時空關係進行視訊檢索之研究
7.	時間序列資料之關聯性規則探勘
8.	以網路服務為基礎之分散式自動協商系統架構
9.	網路廣告理解研究
10.	台灣紡織業設計研發策略之研究─設計研發組織的觀點
11.	尺度轉換技術應用於遙測影像分析之研究
12.	逕流單位歷線參數特性之評估
13.	一個學術文獻瀏覽與搜尋的整合性工具
14.	以使用者偏好分類為基礎之網際資源推薦系統
15.	網路服務提供者提供具端對端頻寬限制與延遲限制之多傳輸率群播服務之路由及頻寬指定演算法

簡易查詢 | 進階查詢 | 熱門排行 | 我的研究室