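Author: Hsiang-Yi Tseng (曾祥益)
Thesis title (Chinese): 即時探勘頻繁項目集之階段樹刪減與分歧處理候選項目集研究與應用
Thesis title (English): A Gradational Tree Pruning and Candidate Itemset Differentiating Algorithm for Real-Time Frequent Pattern Mining and Applications
Advisor: Jen-Peng Huang (黃仁鵬)
Degree: Master's
Institution: Southern Taiwan University of Technology (南台科技大學)
Department: Department of Information Management
Discipline: Computer Science
Field: General Computer Science
Thesis type: Academic thesis
Year of publication: 2007
Graduation academic year: 95 (ROC calendar)
Language: Chinese
Number of pages: 103
Keywords: real-time frequent pattern mining; step decomposition; data mining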
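  As transactions, documents, and routine operational data are digitized, data of all kinds accumulates in large volumes. With advances in information technology, data mining has become increasingly important and is now widely applied to business forecasting and decision support. Frequent itemset mining plays an important role within data mining.
  This study proposes GDP (Gradationally Differentiating Process), an algorithm for real-time frequent pattern mining. GDP uses two mechanisms newly proposed in this study, gradational tree pruning and gradational differentiating, and draws on the strengths of the FP-Growth and GRA algorithms to mine real-time frequent itemsets efficiently. Three main mechanisms in GDP improve mining performance.
  The first mechanism is a new decomposition method that avoids repeatedly decomposing similar transaction records. The second is a new tree-pruning process that, at the start of each stage, uses the frequent itemsets produced in the previous stage to prune nodes of the GDP-Tree. The third is a new differentiating process: during stage-wise mining it outputs itemsets that have already reached the minimum support before the stage ends, and it ends the stage early once it determines that no further frequent itemsets can appear. This further speeds up the mining of each stage.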
  With the growth of electronic information such as transaction records and documents, data of many kinds has accumulated rapidly and in large volumes. As information technology has developed, data mining has become increasingly important for prediction and decision making in commercial settings, and association rule mining is one of its most important techniques.
  In this thesis, we propose a real-time frequent pattern mining algorithm called GDP (Gradationally Differentiating Process). The algorithm combines the advantages of FP-Growth and GRA so that it can mine frequent patterns efficiently in real time. It uses three mechanisms to improve performance.
  The first mechanism is a new decomposition method that avoids decomposing similar records repeatedly. The second is a new pruning method that prunes nodes of the GDP-Tree according to the results of the previous mining stage. The third is a differentiating process. It outputs frequent itemsets as soon as they are found, before the mining stage ends, and it detects when no further frequent itemset can appear and terminates the stage early, which speeds up every mining step.
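  The full thesis text is not reproduced in this record, so the following is only an illustrative sketch of the ideas named in the abstract (pruning with the previous stage's frequent itemsets, outputting itemsets before a stage ends, and ending a stage early). It is not the GDP algorithm itself: it uses plain dictionary counting instead of a GDP-Tree, and the names `mine_stage`, `mine_all`, and `min_count` (an absolute support count) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_stage(transactions, k, min_count, prev_frequent=None):
    """Find frequent k-itemsets in one pass (hypothetical helper, not GDP).

    An itemset is appended to the output the moment its count reaches
    min_count, and the pass stops once no pending candidate can still
    reach min_count with the transactions that remain.
    """
    counts = Counter()
    emitted = []
    total = len(transactions)

    # Pruning assumption: only items that appear in some frequent itemset
    # of the previous stage can appear in a frequent k-itemset.
    keep = None
    if prev_frequent is not None:
        keep = {item for itemset in prev_frequent for item in itemset}

    for seen, transaction in enumerate(transactions, start=1):
        items = sorted(set(transaction))
        if keep is not None:
            items = [i for i in items if i in keep]
        for cand in combinations(items, k):
            # Skip candidates with an infrequent (k-1)-subset.
            if prev_frequent is not None and any(
                sub not in prev_frequent for sub in combinations(cand, k - 1)
            ):
                continue
            counts[cand] += 1
            if counts[cand] == min_count:
                emitted.append(cand)  # output before the stage ends

        # Early termination: the best not-yet-frequent candidate (or an
        # unseen one, with count 0) cannot reach min_count any more.
        remaining = total - seen
        best_pending = max(
            (c for c in counts.values() if c < min_count), default=0
        )
        if best_pending + remaining < min_count:
            break
    return emitted


def mine_all(transactions, min_count):
    """Run stages k = 1, 2, ... until a stage yields no frequent itemset."""
    result = {}
    prev = None
    k = 1
    while True:
        found = mine_stage(transactions, k, min_count, prev)
        if not found:
            break
        result[k] = found
        prev = set(found)
        k += 1
    return result
```

  Because a stage may stop before scanning every transaction, this sketch reports which itemsets are frequent but not their final support counts. A caller that needs exact supports would have to finish the scan.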
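Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
 1.1 Research Background
 1.2 Research Motivation and Objectives
 1.3 Research Process
 1.4 Thesis Organization
Chapter 2 Literature Review
 2.1 Data Mining
  2.1.1 Definition of Data Mining
  2.1.2 Knowledge Discovery and Data Mining
 2.2 Association Rules
  2.2.1 Apriori Algorithm
  2.2.2 DHP Algorithm
  2.2.3 Partition Algorithm
  2.2.4 Sampling Algorithm
  2.2.5 DIC Algorithm
  2.2.6 FP-Growth Algorithm
  2.2.7 ICI Algorithm
  2.2.8 QSD Algorithm
  2.2.9 IDA Algorithm
  2.2.10 GDA Algorithm
  2.2.11 EFI Algorithm
  2.2.12 GRA Algorithm
 2.3 Real-Time Frequent Itemsets
  2.3.1 BDFS(b) Algorithm
Chapter 3 Research Method
 3.1 Overview of the GDP Algorithm
 3.2 Gradational Tree Pruning
  3.2.1 GDP-Tree
  3.2.2 Definitions
  3.2.3 Tree Node Deletion
  3.2.4 Filtering Stage Frequent Itemsets
  3.2.5 Stage Path Pruning
  3.2.6 Gradational Tree Pruning Procedure
 3.3 Gradational Differentiating Process
  3.3.1 Definitions
  3.3.2 Handling Candidate Itemsets
  3.3.3 Data Traversal and Processing
  3.3.4 Steps of the Gradational Differentiating Process
 3.4 GDP Algorithm Flow
 3.5 Worked Example of the GDP Algorithm
Chapter 4 Experiments and Result Analysis
 4.1 Experimental Environment
  4.1.1 Experimental Platform
  4.1.2 Experimental Data Sources
 4.2 Performance
  4.2.1 Experimental Design
  4.2.2 Experimental Analysis
 4.3 Experimental Conclusions
Chapter 5 Conclusions and Future Work
 5.1 Conclusions
 5.2 Future Work
References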