National Digital Library of Theses and Dissertations in Taiwan

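Graduate student: Xiu-Hui Wei (魏秀蕙)
Thesis title: Performance Comparison of Sequential Pattern Mining Algorithms Based on MapReduce Framework (MapReduce架構下循序樣式探勘演算法之效能分析)
Advisors: Shih-Ying Chen (陳世穎), Hung-Ming Chen (陳弘明)
Degree: Master's
Institution: National Taichung University of Science and Technology (國立臺中科技大學)
Department: Master's Program, Department of Computer Science and Information Engineering (資訊工程系碩士班)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2014
Graduation academic year: 102
Language: Chinese
Number of pages: 81
Keywords: data mining, sequential pattern, association rules, parallel computing, MapReduce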
With the spread of cloud technology and the accumulation of big data, processing and analyzing large volumes of data in less time has become an important research direction. Many data mining techniques are used to analyze large data sets, among them association rule mining algorithms and sequential pattern mining algorithms. This study parallelizes two sequential pattern mining algorithms, AprioriAll and GSP, on the MapReduce framework so that large databases can be processed in parallel, and compares the performance of the two. The experimental results show that the parallelized GSP algorithm performs better than the parallelized AprioriAll algorithm.
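
The abstract gives no implementation details, so the following is only a minimal sketch under stated assumptions, not the thesis's method. AprioriAll and GSP both proceed level by level and repeatedly count the support of candidate sequences, and that counting step maps naturally onto MapReduce: each mapper checks one customer sequence against the candidates, and each reducer sums the matches for one candidate. All names here (`seq`, `contains`, `map_phase`, `shuffle`, `reduce_phase`, `count_support`) are hypothetical, items are assumed to be single characters, and the map, shuffle, and reduce phases are simulated in one Python process rather than run on a cluster.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, Iterator, List, Tuple

# A sequence is an ordered tuple of itemsets, e.g. <(a)(b c)(d)>.
Sequence = Tuple[FrozenSet[str], ...]


def seq(*itemsets: str) -> Sequence:
    """Build a sequence from strings of single-character items: seq("a", "bc")."""
    return tuple(frozenset(s) for s in itemsets)


def contains(data_seq: Sequence, candidate: Sequence) -> bool:
    """True if the candidate's itemsets match, in order, subsets of data_seq's itemsets."""
    pos = 0
    for element in data_seq:
        if pos < len(candidate) and candidate[pos] <= element:
            pos += 1  # greedy earliest match is sufficient without time constraints
    return pos == len(candidate)


def map_phase(data_seq: Sequence, candidates: Iterable[Sequence]) -> Iterator[Tuple[Sequence, int]]:
    """Mapper: emit (candidate, 1) for every candidate contained in one customer sequence."""
    for cand in candidates:
        if contains(data_seq, cand):
            yield cand, 1


def shuffle(pairs: Iterable[Tuple[Sequence, int]]) -> Dict[Sequence, List[int]]:
    """Group mapper output by key, as the framework would do between map and reduce."""
    grouped: Dict[Sequence, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(cand: Sequence, counts: List[int], min_support: int) -> Iterator[Tuple[Sequence, int]]:
    """Reducer: sum the counts for one candidate and keep it only if it is frequent."""
    total = sum(counts)
    if total >= min_support:
        yield cand, total


def count_support(database: List[Sequence], candidates: List[Sequence],
                  min_support: int) -> Dict[Sequence, int]:
    """Run one simulated MapReduce pass and return the frequent candidates with supports."""
    pairs = (pair for s in database for pair in map_phase(s, candidates))
    frequent: Dict[Sequence, int] = {}
    for cand, counts in shuffle(pairs).items():
        for key, total in reduce_phase(cand, counts, min_support):
            frequent[key] = total
    return frequent


if __name__ == "__main__":
    # Hypothetical toy database of four customer sequences.
    db = [seq("a", "bc", "d"), seq("a", "b"), seq("bc", "d"), seq("a", "d")]
    cands = [seq("a", "b"), seq("a", "d"), seq("bc", "d"), seq("d", "a")]
    for cand, support in count_support(db, cands, min_support=2).items():
        print([sorted(itemset) for itemset in cand], support)
```

In a level-wise miner, a driver would call a pass like `count_support` once per candidate length, generating the next candidates from the frequent sequences of the previous pass. How candidates are generated and pruned is where AprioriAll and GSP differ, and that part is not sketched here.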

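Chinese Abstract
English Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures
1. Introduction
1.1 Research Background
1.2 Research Objectives
1.3 Thesis Organization
2. Related Work
2.1 Data Mining
2.2 Association Rules
2.2.1 Apriori Algorithm
2.3 Sequential Pattern Mining
2.3.1 AprioriAll Algorithm
2.3.2 GSP Algorithm
2.4 MapReduce
3. Method Design and Worked Examples
3.1 Method Design
3.2 Parallelized AprioriAll
3.3 Worked Example of Parallelized AprioriAll
3.4 Parallelized GSP
3.5 Worked Example of Parallelized GSP
4. Experimental Results and Analysis
4.1 Experimental Environment
4.2 Experimental Design and Results
4.2.1 Minimum Support Performance Test
4.2.2 Database Transaction Volume Performance Test
4.2.3 Transaction Item Performance Test
4.2.4 GSP Parameter Performance Test
5. Conclusions and Future Work
References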

