
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

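Author: 林宜霆 (LIN, YI-TING)
Thesis title: 高可擴性頻繁樣式探勘演算法之研究 (A Study of Scalable Frequent Pattern Mining Algorithm)
Advisor: 林威成 (LIN, WEI-CHENG)
Oral defense committee: 陳朝鈞 (CHEN, CHAO-CHUN), 楊孟翰 (YANG, MENG-HAN), 王鼎超 (WANG, DING-CHAU), 陳俊豪 (CHEN, CHUN-HAO)
Oral defense date: 2020-07-28
Degree: Master's
Institution: National Kaohsiung University of Science and Technology (國立高雄科技大學)
Department: Department of Computer Science and Information Engineering (資訊工程系)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Publication year: 2020
Graduation academic year: 108 (ROC calendar)
Language: Chinese
Pages: 51
Keywords (Chinese): 資料探勘, 巨量資料, 關聯規則挖掘, FP-growth
Keywords (English): data mining, big data, association rule, FP-growth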
Data mining is a long-established field. Its original aim is to extract information from data that cannot be observed directly, turn it into an understandable structure, and then analyze and apply it [15]. Many data mining methods have since emerged, each developed for a different application, and with today's rapid growth of information technology, data mining has become closely tied to big data.

Past statistics indicate that data volume grows by a multiplicative factor every quarter. Data from everyday life is continuously collected and used, so approaches that only had to handle small amounts of data may no longer be adequate, which is what gave rise to big data mining. This study focuses on association rule mining within data mining. It analyzes how the FP-growth algorithm behaves when processing big data and proposes a new method that retains a reasonable level of performance on big data while using memory more effectively.

Chapter 1 Introduction
1.1 Research Motivation
1.2 Research Contributions
1.3 Thesis Structure

Chapter 2 Related Work
2.1 FP-Growth
2.2 Related Algorithms for Mining Association Rules in Big Data
2.2.1 CFP-Tree and CFP-Array
2.2.2 Database Projection

Chapter 3 Research Method
3.1 Problem Definition
3.2 Algorithm Overview
3.2.1 Algorithm Flow
3.2.2 Grouping Mechanism

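Chapter 4 Experimental Data and Results
4.1 Experimental Design
4.2 Preliminary Experiments
4.2.1 Varying Parameter N
4.2.2 Varying Parameter I
4.2.3 Varying Parameter T
4.2.4 Varying Parameter D
4.3 Experiments on Method Variations
4.3.1 Different Numbers of Items per Group
4.3.2 Different Decreasing Progressions
4.4 In-Depth Experiments
4.4.1 In-Depth Experiment Varying Parameter T
4.4.2 In-Depth Experiment Varying Parameter D
4.5 Experiments on Databases with Specific Associations
4.5.1 Varying Parameter T with Three Combined Databases
4.5.2 Combining Several Databases
4.5.3 Varying Parameter I with Three Combined Databases
4.5.4 Varying Parameter N with Three Combined Databases
4.5.5 Varying Parameter T with Different Numbers of Combined Databases
4.6 Summary of Experiments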

Chapter 5 Conclusion
Chapter 6 References


[1] Agrawal, R., Imieliński, T., & Swami, A. (1993). Mining association rules between sets of items in large databases. In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data (pp. 207-216).
[2] Agrawal, R., & Srikant, R. (1994). Fast algorithms for mining association rules. In Proceedings of the 20th International Conference on Very Large Data Bases, VLDB (Vol. 1215, pp. 487-499).
[3] Han, J., Pei, J., & Yin, Y. (2000). Mining frequent patterns without candidate generation. ACM SIGMOD Record, 29(2), 1-12.
[4] Han, J., Pei, J., Yin, Y., & Mao, R. (2004). Mining frequent patterns without candidate generation: A frequent-pattern tree approach. Data Mining and Knowledge Discovery, 8(1), 53-87.
[5] Schlegel, B., Gemulla, R., & Lehner, W. (2011). Memory-efficient frequent-itemset mining. In EDBT/ICDT '11: Proceedings of the 14th International Conference on Extending Database Technology (pp. 461-472). ACM, New York, NY, USA.
[6] Fayyad, U., Piatetsky-Shapiro, G., & Smyth, P. (1996). From data mining to knowledge discovery in databases. AI Magazine, 17(3), 37-54.
[7] Javed, A., & Khokhar, A. (2004). Frequent pattern mining on message passing multiprocessor systems. Distributed and Parallel Databases, 16(3), 321-334.
[8] Vu, L., & Alaghband, G. (2013). Novel parallel method for mining frequent patterns. In Data-Intensive Scalable Computing Systems (pp. 49-54). ACM, New York, NY, USA.
[9] Zhou, J., & Yu, K.-M. (2008). Balanced tidset-based parallel FP-tree algorithm for the frequent pattern mining on grid system. In Fourth International Conference on Semantics, Knowledge and Grid.
[10] Lin, K. W., & Lo, Y.-C. (2013). Efficient algorithms for frequent pattern mining in many-task computing environments. Knowledge-Based Systems, 49, 10-21.
[11] Cameron, J. J., Cuzzocrea, A., & Leung, C. K. (2013). Stream mining of frequent sets with limited memory. In SAC '13: Proceedings of the 28th Annual ACM Symposium on Applied Computing (pp. 173-175). ACM, New York, NY, USA.
[12] Adnan, M., & Alhajj, R. (2007). DRFP-tree: Disk-resident frequent pattern tree. Springer Science+Business Media, LLC.
[13] Agrawal, R., & Srikant, R. Quest Synthetic Data Generator. IBM Almaden Research Center, San Jose, California.
[14] Madden, S. (2012). From databases to big data. IEEE Internet Computing, 16(3), 4-6.
[15] Chakrabarti, S., Ester, M., Fayyad, U., Gehrke, J., Han, J., Morishita, S., ... & Wang, W. (2006). Data mining curriculum: A proposal (Version 1.0). Intensive Working Group of ACM SIGKDD Curriculum Committee, 140, 1-10.
[16] Piatetsky-Shapiro, G., & Frawley, W. (1991). Knowledge discovery in databases. MIT Press.
[17] Bramer, M. (2007). Principles of data mining (Vol. 180). London: Springer.
[18] Malviya, J., Singh, A., & Singh, D. (2015). An FP tree based approach for extracting frequent pattern from large database by applying parallel and partition projection. International Journal of Computer Applications, 114(18).
