National Digital Library of Theses and Dissertations in Taiwan

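Graduate student: 王星凱
Graduate student (English name): Hsing-Kai Wang
Thesis title: 有效率的分散式關聯規則探勘系統
Thesis title (English): An Efficient Distributed Association Rules Mining System
Advisor: 張昭憲
Advisor (English name): Jau-Shien Chang
Degree: Master's
Institution: Tamkang University (淡江大學)
Department: Department of Information Management
Discipline: Computing
Sub-discipline: General Computing
Thesis type: Academic thesis
Year of publication: 2003
Academic year of graduation: 91 (ROC calendar, 2002-2003)
Language: Chinese
Number of pages: 35
Keywords (translated from Chinese): association rule mining; distributed systems; data mining; databases
Keywords (English): Association Rules Mining; Distributed System; Data Mining; Database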
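Association rule mining extracts concise rules of the form "A -> B" (customers who buy A also buy B) from a transaction database. With this technique, enterprises can summarize customers' purchasing habits and develop suitable marketing strategies. As transaction databases keep growing, however, the question of how to speed up mining by distributing the work across multiple computers has attracted wide attention from researchers.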
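As a small illustration of what an "A -> B" rule measures, the sketch below computes the two standard quantities used to rank such rules, support and confidence. The function names and the toy transactions are hypothetical and are not taken from the thesis.

```python
from typing import Iterable, List, Set


def support(transactions: List[Set[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[Set[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0.0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base


# Hypothetical toy data (assumption, for illustration only).
TOY_TRANSACTIONS = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
]
```

On this toy data, {A, B} appears in two of four transactions and A appears in three, so the rule A -> B has support 0.5 and confidence 2/3.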
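This study targets association rule mining on large transaction databases and develops an efficient distributed association rule mining system, EDAMS (an Efficient Distributed Association rules Mining System). Because the performance bottleneck of distributed mining usually lies in integrating mining results across nodes, we abandon the traditional point-to-point data exchange [3][7][9] and turn a designated node into a data server that only integrates and distributes data and does no mining itself, which reduces the number of messages exchanged from O(n^2) to O(n). In addition, we adopt DHP [2] as the base algorithm and exploit its strong pruning of candidate 2-itemsets to further reduce the total volume of data transmitted. To validate the system, we used eight computers to mine transaction data sets ranging from 100,000 to 700,000 records. The experimental data show that the overall speedup ratio rises steadily as the number of records grows. Moreover, for the same number of records and the same support threshold, the speedup ratio of EDAMS is better than that of earlier related work [7][9], which confirms that sacrificing one node as a data server to reduce the number of messages is feasible.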
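The gap between O(n^2) and O(n) can be made concrete with a rough per-round message count. The sketch below assumes that, in the point-to-point scheme, every node sends its local results to every other node once per round, and that, in the data-server scheme, each mining node sends one upload and receives one merged reply. These exact counts are an assumption for illustration; the abstract states only the asymptotic orders.

```python
def messages_point_to_point(n: int) -> int:
    """Messages per round when each of n nodes sends to the other n - 1 nodes."""
    return n * (n - 1)


def messages_with_data_server(n: int) -> int:
    """Messages per round when one of n nodes is a dedicated data server.

    Assumption: each of the n - 1 mining nodes sends one upload to the
    server and receives one merged reply from it.
    """
    miners = n - 1
    return 2 * miners


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(n, messages_point_to_point(n), messages_with_data_server(n))
```

Under these assumptions, eight nodes exchange 56 messages per round point to point but only 14 through a data server, at the cost of one node that does no mining.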
Association rule mining can help enterprises capture consumer behavior and develop effective marketing strategies. However, transaction databases grow every day, and obtaining timely mining results has become a serious problem. In this paper, we propose an Efficient Distributed Association rules Mining System, EDAMS, to cope with this problem. Unlike other distributed mining systems, EDAMS uses a dedicated node as a data server that collects the data exchanged among nodes. Point-to-point broadcasts are thus avoided, and the number of messages exchanged is reduced from O(n^2) to O(n). In addition, to reduce the total volume of messages, the DHP algorithm [2] is used as the base algorithm to reduce the number of candidate 2-itemsets. According to our experimental results, EDAMS achieves a steadily increasing speedup ratio as the data grows from 100,000 to 700,000 transactions. The speedup ratio is also superior to those reported in previous work [7][9], which demonstrates the effectiveness of our system.
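The role of DHP in shrinking the set of candidate 2-itemsets can be sketched as follows. This is a simplified, single-machine illustration of the hash-bucket filter from the DHP idea, not the EDAMS implementation: during the first pass, every item pair in a transaction is hashed into a bucket, and a pair of frequent items is kept as a candidate only if its bucket count reaches the minimum count. The function names, the CRC32-based hash, and the bucket count of 101 are assumptions chosen for the example.

```python
import zlib
from collections import Counter
from itertools import combinations
from typing import Iterable, List, Set, Tuple


def bucket_of(pair: Tuple[str, str], num_buckets: int) -> int:
    """Deterministic hash of a sorted item pair into a bucket index."""
    key = "\x00".join(pair).encode("utf-8")
    return zlib.crc32(key) % num_buckets


def dhp_candidate_pairs(
    transactions: Iterable[Set[str]],
    min_count: int,
    num_buckets: int = 101,
) -> List[Tuple[str, str]]:
    """One pass over the data: count items and hash all pairs into buckets.

    A pair survives only if both items are frequent and its bucket count
    is at least min_count. Bucket counts never undercount a pair, so no
    truly frequent pair is dropped; collisions may let extra pairs through.
    """
    item_counts: Counter = Counter()
    bucket_counts = [0] * num_buckets
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        for pair in combinations(items, 2):
            bucket_counts[bucket_of(pair, num_buckets)] += 1
    frequent_items = sorted(i for i, c in item_counts.items() if c >= min_count)
    return [
        pair
        for pair in combinations(frequent_items, 2)
        if bucket_counts[bucket_of(pair, num_buckets)] >= min_count
    ]
```

The filter is conservative: a bucket count is an upper bound on the count of any pair hashed into it, so the candidates still need an exact counting pass, but fewer candidates means less data to count and, in a distributed setting, less data to exchange.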
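Chapter 1 Introduction
Section 1 Research Background and Motivation
Section 2 Thesis Organization
Chapter 2 Related Work and Review
Section 1 Single-Machine Data Mining Algorithms
Section 2 Distributed Data Mining Algorithms
Section 3 Evaluation of the Algorithms
Chapter 3 The EDAMS System
Section 1 System Modules
Section 2 System Operation Model
Section 3 Number of Messages and Data Volume Between Nodes
Section 4 The EDAMS Algorithm
Chapter 4 Experimental Results
Section 1 Setting Up the Distributed Experimental Environment
Section 2 Experimental Data
Chapter 5 Conclusion
References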
[1] R. Agrawal and R. Srikant, “Fast algorithms for mining association rules,” Proceedings of the 20th International Conference on Very Large Data Bases, 1994.
[2] J. S. Park, M.-S. Chen, and P. S. Yu, “An effective hash-based algorithm for mining association rules,” Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, May 1995, pp. 175-186.
[3] D. W. Cheung, V. T. Ng, A. W. Fu, and Y. Fu, “Efficient Mining of Association Rules in Distributed Databases,” IEEE Transactions on Knowledge and Data Engineering, Vol. 8, No. 6, Dec. 1996.
[4] G. Adomavicius and A. Tuzhilin, “Using data mining methods to build customer profiles,” IEEE Computer, Vol. 34, No. 2, Feb. 2001, pp. 74-82.
[5] C. C. Aggarwal and P. S. Yu, “A new approach to online generation of association rules,” IEEE Transactions on Knowledge and Data Engineering, Vol. 13, No. 4, Jul./Aug. 2001, pp. 527-540.
[6] M. J. Zaki, “Parallel and distributed association mining: a survey,” IEEE Concurrency, Vol. 7, No. 4, Oct.-Dec. 1999, pp. 14-25.
[7] R. Agrawal and J. C. Shafer, “Parallel Mining of Association Rules: Design, Implementation, and Experience,” IBM Research Report RJ1004, 1996.
[8] A. Das, W.-K. Ng, and Y.-K. Woon, “Rapid association rule mining,” Proceedings of the Tenth International Conference on Information and Knowledge Management, Oct. 2001, pp. 474-481.
[9] D. W. Cheung, J. Han, V. T. Ng, A. W. Fu, and Y. Fu, “A fast distributed algorithm for mining association rules,” Proceedings of the IEEE 4th International Conference on Parallel and Distributed Information Systems, Dec. 1996, pp. 31-42.
[10] Z. Chen, Data Mining and Uncertain Reasoning, John Wiley & Sons, Inc., 2001.
[11] R. Srikant and R. Agrawal, “Mining Quantitative Association Rules in Large Relational Tables,” Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, 1996, pp. 1-12.
[12] S. Mitra et al., “Data Mining in Soft Computing Framework: A Survey,” IEEE Transactions on Neural Networks, Vol. 13, No. 1, Jan. 2002, pp. 3-14.
[13] S. Pal et al., “Web Mining in Soft Computing Framework: Relevance, State of the Art and Future Directions,” IEEE Transactions on Neural Networks, Vol. 13, No. 5, Sep. 2002, pp. 1163-1177.
[14] D. W. Cheung, S. D. Lee, and Y. Xiao, “Effect of Data Skewness and Workload Balance in Parallel Data Mining,” IEEE Transactions on Knowledge and Data Engineering, Vol. 14, No. 3, May/June 2002.
[15] http://www.remoteanything.com