National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

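Author: 左聰文
Author (English): Chung-Wen Cho
Title: 從大型資料庫更新序列型樣的有效率之演算法
Title (English): An Incremental Updating Technique for Discovering Sequential Patterns in Large Databases
Advisor: 顏秀珍
Advisor (English): Show-Jane Yen
Degree: Master's
Institution: Fu Jen Catholic University (輔仁大學)
Department: Department of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2000
Academic year of graduation: 88
Language: English
Number of pages: 85
Keywords (Chinese): 資料探勘 (data mining); 資料挖掘 (data mining); 序列型樣 (sequential patterns)
Keywords (English): data mining; sequential patterns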
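The task of mining sequential patterns is to analyze a merchant's transaction database in order to find the order in which most customers purchase items, so that the merchant can use the discovered behavior to make decisions that improve profit. Traditional algorithms must scan and analyze the huge database over and over while searching for the purchase sequences of interest, and these scans are very time consuming. Reducing the number of database scans is therefore the most important problem in mining sequential patterns.
As new transactions are added to the database, some previously discovered purchase sequences may change, so all purchase sequences of interest must be found again from the updated database. An incremental updating technique for mining sequential patterns uses the previously discovered information to find the purchase sequences of interest in the updated database in less time, which avoids rerunning the time-consuming mining algorithm from scratch.
In the first half of this thesis, we propose two algorithms for mining sequential patterns, SLP and SSLP, and compare them with other sequential pattern mining algorithms. Both SLP and SSLP reduce the number of database scans, and SSLP can additionally eliminate, at an early stage, some purchase sequences that will turn out not to be of interest, or identify sequences of interest early. Experimental results show that our algorithms are more efficient than other algorithms that solve the same problem. In the second half of the thesis, we propose USSLP, an incremental updating algorithm for mining sequential patterns based on SSLP. USSLP uses the existing information to efficiently find all purchase sequences in the updated database. Experimental results show that, on an updated database, USSLP is more efficient than rerunning a sequential pattern mining algorithm to find the purchase sequences of interest.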

Mining sequential patterns discovers the sequential purchasing behaviors of most customers from a large number of customer transactions. The mining strategy focuses on discovering frequent sequences. A frequent sequence is a sequence of itemsets that is purchased by a sufficient number of customers.
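The definition of a frequent sequence can be illustrated with a minimal Python sketch. It is not SLP or SSLP, whose details this record does not give. It is the plain "one full pass per pattern" support count that such algorithms aim to improve on. The names `contains`, `support`, `is_frequent`, the sample `database`, and the interpretation of "sufficient number" as a fraction `min_support` of all customers are assumptions made for illustration.

```python
from typing import FrozenSet, List, Sequence

Itemset = FrozenSet[str]
CustomerSequence = List[Itemset]  # one customer's transactions, in time order


def contains(customer_seq: Sequence[Itemset], pattern: Sequence[Itemset]) -> bool:
    """True if every itemset of `pattern` is a subset of a distinct
    transaction of `customer_seq`, with the order preserved."""
    pos = 0
    for itemset in pattern:
        # Greedily advance to the earliest transaction that covers this itemset.
        while pos < len(customer_seq) and not itemset <= customer_seq[pos]:
            pos += 1
        if pos == len(customer_seq):
            return False
        pos += 1  # the next pattern itemset must match a later transaction
    return True


def support(database: Sequence[CustomerSequence], pattern: Sequence[Itemset]) -> int:
    """Number of customers whose sequence contains `pattern` (one full scan)."""
    return sum(1 for seq in database if contains(seq, pattern))


def is_frequent(database: Sequence[CustomerSequence],
                pattern: Sequence[Itemset],
                min_support: float) -> bool:
    """Assumed threshold: at least `min_support` (a fraction) of all customers."""
    return support(database, pattern) >= min_support * len(database)


# Hypothetical toy database: four customers, items named by single letters.
database = [
    [frozenset({"a"}), frozenset({"b", "c"}), frozenset({"d"})],
    [frozenset({"a", "b"}), frozenset({"d"})],
    [frozenset({"b"}), frozenset({"a"})],
    [frozenset({"a"}), frozenset({"c"}), frozenset({"d"})],
]
a_then_d = [frozenset({"a"}), frozenset({"d"})]
b_then_a = [frozenset({"b"}), frozenset({"a"})]
```

With this toy data, three of the four customers buy `a` and later `d`, so `a_then_d` is frequent at a threshold of 0.5, while `b_then_a` is supported by only one customer. Each call to `support` scans the whole database, which is the cost the thesis sets out to reduce.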
Previous approaches to mining sequential patterns need to scan the large database repeatedly and spend a large amount of computation time finding frequent sequences, which is very time consuming. It is therefore important to reduce both the number of database scans and the amount of computation in order to improve the efficiency of these mining algorithms.
Moreover, the discovered sequential patterns may change, because new transactions arrive continuously and obsolete transactions need to be removed. To keep the information current, the sequential patterns have to be rediscovered from the updated database periodically, which is very costly.
In this thesis, we present two algorithms, SLP and SSLP, for finding sequential patterns. They significantly reduce the number of database scans and the computation time needed to find sequential patterns. We also propose an incremental updating technique that updates the discovered sequential patterns when the database is updated. The algorithm USSLP maintains the discovered sequential patterns by making use of the previously discovered information and considering only the updated customer transactions to find all the sequential patterns. The experimental results show that our algorithms are more efficient than other algorithms for mining sequential patterns.
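The general idea of reusing earlier results and touching only the updated customers can be sketched as follows. This is not USSLP, whose mechanics are not described in this record. It is a generic delta-counting scheme, and the class `SupportTracker`, its methods, and the sample data are hypothetical. It also assumes the set of tracked patterns is fixed in advance and non-empty, so unlike a full incremental miner it cannot discover patterns that become frequent only after the update.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

Itemset = FrozenSet[str]
Pattern = Tuple[Itemset, ...]  # non-empty tuple of itemsets


def contains(customer_seq: Sequence[Itemset], pattern: Sequence[Itemset]) -> bool:
    """True if `pattern` occurs in `customer_seq` as an ordered subsequence."""
    pos = 0
    for itemset in pattern:
        while pos < len(customer_seq) and not itemset <= customer_seq[pos]:
            pos += 1
        if pos == len(customer_seq):
            return False
        pos += 1
    return True


class SupportTracker:
    """Keeps support counts for a fixed set of patterns and adjusts them
    by re-examining only the customer whose transactions changed."""

    def __init__(self, patterns: Sequence[Pattern]) -> None:
        self.sequences: Dict[str, List[Itemset]] = {}
        self.counts: Dict[Pattern, int] = {p: 0 for p in patterns}

    def replace(self, customer_id: str, new_seq: Sequence[Itemset]) -> None:
        """Set one customer's sequence and apply the resulting count delta."""
        old_seq = self.sequences.get(customer_id, [])
        for p in self.counts:
            self.counts[p] += int(contains(new_seq, p)) - int(contains(old_seq, p))
        if new_seq:
            self.sequences[customer_id] = list(new_seq)
        else:
            self.sequences.pop(customer_id, None)

    def append(self, customer_id: str, transaction: Itemset) -> None:
        """A new transaction arrives for a customer."""
        self.replace(customer_id, self.sequences.get(customer_id, []) + [transaction])

    def drop_oldest(self, customer_id: str) -> None:
        """An obsolete (oldest) transaction is removed for a customer."""
        self.replace(customer_id, self.sequences.get(customer_id, [])[1:])

    def frequent(self, min_count: int) -> List[Pattern]:
        return [p for p, c in self.counts.items() if c >= min_count]


# Hypothetical usage: two tracked patterns, three customers.
a_then_b: Pattern = (frozenset({"a"}), frozenset({"b"}))
b_then_a: Pattern = (frozenset({"b"}), frozenset({"a"}))
tracker = SupportTracker([a_then_b, b_then_a])
for cid, items in [("c1", {"a"}), ("c1", {"b"}), ("c2", {"a"}),
                   ("c2", {"b", "c"}), ("c3", {"b"}), ("c3", {"a"})]:
    tracker.append(cid, frozenset(items))
```

Each `append` or `drop_oldest` call costs one containment check per tracked pattern for a single customer, instead of a scan over every customer in the database.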

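Chapter 1. Introduction 1
1.1 The Problem Description of Mining Sequential Patterns 1
1.2 The Problem Description of Incremental Updating for Mining Sequential Patterns 3
1.3 Related Work 4
Chapter 2. Mining Sequential Patterns 6
2.1 Algorithm SLP (Smallest and Largest Position) 8
2.2 Algorithm SSLP (Segmental Smallest and Largest Position) 12
Chapter 3. Incremental Updating for Mining Sequential Patterns 18
3.1 Algorithm USSLP (Updating Smallest and Largest Position) 22
Chapter 4. Experimental Results and Discussion 34
4.1 Comparison of Algorithms for Mining Sequential Patterns 34
4.2 Comparison of Algorithms for Updating Sequential Patterns 35
Chapter 5. Conclusion and Future Work 39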
Chapter 1. Introduction 41
1.1 The Problem Description of Mining Sequential Patterns 41
1.2 The Problem Description of Incremental Updating for Mining Sequential Patterns 43
1.3 Related Work 43
Chapter 2. Mining Sequential Patterns 45
2.1 Algorithm SLP (Smallest and Largest Position) 47
2.2 Algorithm SSLP (Segmental Smallest and Largest Position) 52
Chapter 3. Incremental Updating for Mining Sequential Patterns 58
3.1 Algorithm USSLP (Updating Segmental Smallest and Largest Position) 61
Chapter 4. Experimental Results 76
4.1 Performance Evaluation for SLP and SSLP 76
4.2 Performance Evaluation for USSLP 77
Chapter 5. Conclusion and Future Work 82
Bibliography 83

