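Graduate student: 張衡閣
Graduate student (English name): Heng-Ke Chang
Thesis title (Chinese): 一個資料庫多維度序列法則探勘方法
Thesis title (English): A New Method of Multi-Dimensional Sequential Rules Mining from Databases
Advisor: 張簡尚偉
Advisor (English name): Shang-Wei Changchien
Degree: Master's
Institution: Chaoyang University of Technology (朝陽科技大學)
Department: Master's Program, Department of Information Management
Discipline: Computing
Field: General computing
Thesis type: Academic thesis
Year of publication: 2002
Academic year of graduation: 90 (ROC calendar)
Language: Chinese
Number of pages: 82
Keywords (Chinese): 資料探勘; 多維度序列法則; 約略集合
Keywords (English): Multi-Dimensional Sequential Rule; Rough Set; Data Mining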
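In recent years, data mining research has grown increasingly active. Beyond mining association rules, researchers have also studied association rules that involve a time factor, which fall broadly into consumer purchasing behavior analysis, web browsing analysis, and time trend analysis. Studies of the order of consumer purchases mostly generate and verify candidate sequences, producing frequent sequential patterns step by step. The verification, however, requires scanning the database repeatedly, which places a heavy load on the system and results in poor efficiency. This study proposes a "Frequent Pattern Adjacent Matrix" to record the customer groups of sequences. Given the frequent patterns, only one more database scan is needed and no candidate sequences have to be generated, which improves the efficiency of mining frequent sequential patterns.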
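However, the existence of a pattern is usually highly correlated with the environment in which it occurs, so the factors of that environment must be examined from many different angles. For example, when considering the order in which customers purchase products, objective factors such as location, time, weather, and customer category may also need to be taken into account, so that the mined sequential patterns better match the real situation.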
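In this thesis, we use the event-recording method of the Frequent Pattern Adjacent Matrix as the sequence mining algorithm and Rough Set Theory as the basis for multi-dimensional analysis. The database needs to be scanned only once to build an index over the rough sets. Sequential patterns and multi-dimensional patterns are then combined to obtain multi-dimensional sequential patterns.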
Data mining has become one of the fastest-growing areas of research in recent years. Besides association rule mining, researchers have worked to develop mining methods that take the time factor into account. Popular research topics include customer buying pattern analysis, Internet surfing sequence analysis, and trend analysis. When probing customers' sequential buying patterns, most existing mining methods generate candidate patterns and then check them against the database to find the frequent sequential patterns. The checking requires repeated database scans, which degrades the performance of these methods. This paper presents a Frequent Pattern Adjacent Matrix (FPAM) that records intermediate length-2 patterns. Once the frequent patterns are known, FPAM needs only one more database scan to find all the sequential patterns. Because it does not generate unnecessary candidate patterns, the proposed method mines frequent sequential patterns from databases efficiently.
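The abstract does not spell out how the matrix is built, so the following Python sketch is only an illustration of the idea of recording length-2 patterns in one additional scan. It assumes that each customer sequence is an ordered list of single items and that each matrix cell stores the set of customers who bought one item before another. The function names (`frequent_items`, `build_fpam`, `frequent_pairs`) and the sample data are hypothetical and not taken from the thesis.

```python
from collections import defaultdict
from itertools import combinations


def frequent_items(sequences, min_support):
    """First scan: keep items bought by at least min_support customers."""
    counts = defaultdict(int)
    for seq in sequences.values():
        for item in set(seq):  # count each customer once per item
            counts[item] += 1
    return {item for item, n in counts.items() if n >= min_support}


def build_fpam(sequences, items):
    """One more scan: cell (a, b) holds the customers who bought a before b."""
    fpam = defaultdict(set)
    for customer, seq in sequences.items():
        kept = [x for x in seq if x in items]  # drop infrequent items
        for i, j in combinations(range(len(kept)), 2):  # all positions i < j
            fpam[(kept[i], kept[j])].add(customer)
    return fpam


def frequent_pairs(fpam, min_support):
    """Length-2 patterns whose customer set reaches the support threshold."""
    return {pair: len(c) for pair, c in fpam.items() if len(c) >= min_support}


# Hypothetical data: customer id -> items in purchase order.
sequences = {1: ["a", "b", "c"], 2: ["a", "c"], 3: ["b", "a", "c"]}
items = frequent_items(sequences, min_support=2)
fpam = build_fpam(sequences, items)
pairs = frequent_pairs(fpam, min_support=2)
print(pairs)  # prints {('a', 'c'): 3, ('b', 'c'): 2}
```

Storing customer sets rather than bare counts is a design assumption of this sketch: it lets a cell be intersected later with other groupings of customers. How the thesis extends length-2 patterns to longer ones is not described in the abstract, so that step is left out here.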
However, the existence of a pattern is often tied to the circumstances or conditions in which it occurs, and those circumstances have to be considered from different viewpoints. For example, when a customer buys products, not only the order of the purchases but also variables such as region, time, climate, and customer category should be taken into account. Sequential patterns mined this way apply better to the real situation.
In this paper, we use FPAM as the algorithm for mining sequential patterns and apply Rough Set Theory for multi-dimensional analysis. After a Rough Set index structure is constructed, sequential patterns and multi-dimensional patterns are combined to obtain multi-dimensional sequential patterns. With this approach, given the frequent patterns, the database needs to be rescanned only once, which improves the efficiency of data mining.
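The combination step can be pictured with standard rough-set notions. The sketch below is an assumption-laden illustration, not the thesis's procedure: it treats the "index" as the indiscernibility classes of customers over chosen dimension attributes, built in one pass, and combines them with the set of customers supporting a sequential pattern by set intersection. The attribute names, the data, and all function names are hypothetical.

```python
from collections import defaultdict


def indiscernibility_classes(records, attributes):
    """One scan: group customers that share the same values on `attributes`."""
    classes = defaultdict(set)
    for customer, row in records.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(customer)
    return dict(classes)


def approximations(classes, target):
    """Standard rough-set lower and upper approximations of `target`."""
    lower, upper = set(), set()
    for members in classes.values():
        if members <= target:  # class lies entirely inside the target
            lower |= members
        if members & target:   # class overlaps the target
            upper |= members
    return lower, upper


def multi_dim_support(classes, pattern_customers):
    """Support of a sequential pattern within each dimension-value class."""
    return {key: len(members & pattern_customers)
            for key, members in classes.items()
            if members & pattern_customers}


# Hypothetical dimension table: customer id -> attribute values.
records = {
    1: {"region": "north", "segment": "student"},
    2: {"region": "north", "segment": "student"},
    3: {"region": "south", "segment": "retired"},
    4: {"region": "north", "segment": "retired"},
}
# Hypothetical set of customers whose sequences contain some pattern.
pattern_customers = {1, 2, 3}

by_region = indiscernibility_classes(records, ["region"])
lower, upper = approximations(by_region, pattern_customers)
support = multi_dim_support(by_region, pattern_customers)
print(lower, upper, support)
```

In this toy data, grouping by region alone cannot isolate the pattern's customers (customer 4 shares a class with customers 1 and 2), while grouping by both region and segment describes them exactly. That is the kind of question a rough-set index over the dimensions can answer without rescanning the database.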
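Chinese Abstract
Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Introduction to Data Mining
1.2 Problem Description
1.3 Research Motivation
1.4 Research Objectives
1.5 Thesis Organization
Chapter 2 Related Work
2.1 Association Rule Mining Methods
2.1.1 Apriori Algorithm
2.1.2 DHP Algorithm
2.1.3 Partition Algorithm
2.1.4 FP-tree Algorithm
2.1.5 FPL Algorithm
2.2 Mining Methods for Sequential Data
2.2.1 Sequential Pattern Mining
2.2.1.1 Apriori-based Algorithms
2.2.1.2 PSP Algorithm
2.2.1.3 FreeSpan Algorithm
2.2.1.4 PrefixSpan Algorithm
2.2.1.5 SPADE Algorithm
2.2.2 Time-Series Database Mining
2.2.3 Web Access Log Mining
2.3 Multi-Dimensional Sequential Pattern Mining Methods
Chapter 3 Mining Sequential Patterns from Databases Using the Frequent Pattern Adjacent Matrix
3.1 Main Mining Steps
3.2 Algorithm and Worked Example
3.2.1 Constructing the Frequent Pattern Adjacent Matrix
3.2.2 Mining Process with the Frequent Pattern Adjacent Matrix (FPAM)
3.3 Experimental Results and Discussion
Chapter 4 Multi-Dimensional Sequential Pattern Mining
4.1 Main Steps and Architecture
4.2 Detailed Steps and Example
4.3 Experimental Results and Discussion
Chapter 5 Conclusion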