
National Digital Library of Theses and Dissertations in Taiwan

Graduate student: 許維修
Graduate student (English): Hsu Wei-Hsiu
Thesis title: Iprefix-growth限制型序列型樣探勘演算法
Thesis title (English): A Constraint-based Sequential Pattern Mining Algorithm - Iprefix-growth Algorithm
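Advisors: 許清琦, 郭大維
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2003
Academic year of graduation: 91 (ROC calendar)
Language: Chinese
Keywords (Chinese): 序列型樣; 序列型樣探勘; 限制
Keywords (English): sequential pattern; sequential pattern mining; constraint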
Sequential pattern mining is an important data mining problem with broad applications, including the analysis of customer purchase behavior, Web access patterns, scientific experiments, disease treatments, natural disasters, and DNA sequences. Most data mining algorithms focus on effectiveness and efficiency, but adding constraints to a mining algorithm lets it reflect the user's needs more closely. Constraint-based sequential pattern mining searches only for patterns that satisfy the given constraints, so pushing the constraints into the mining process serves both effectiveness and efficiency. For this reason, many practical applications require sequential pattern mining to be combined with constraints. Although efficient sequential pattern mining algorithms and their applications have been studied extensively in recent years, constraint-based sequential pattern mining still lacks systematic study. Jiawei Han et al. recently proposed prefix-growth, a constraint-based sequential pattern mining algorithm that covers most constraints through the prefix-monotone property. The Iprefix-growth algorithm proposed in this thesis modifies the prefix-growth framework and uses a lowbound-checking method and a split-checking method to improve the efficiency of handling monotonic constraints, some anti-monotonic constraints, and regular expression (RE) constraints. Finally, the thesis compares Iprefix-growth with prefix-growth in mining speed and summarizes the properties in which the two algorithms differ.
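The idea of pushing a constraint into prefix-based pattern growth can be illustrated with a minimal sketch. It is not the thesis's Iprefix-growth algorithm, nor the exact prefix-growth algorithm of Han et al. It is a simplified prefix-projection miner over sequences of single items, and it assumes the constraint is anti-monotone, so that a violating prefix can be pruned together with all of its extensions. The names `mine`, `min_support`, and `constraint` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def mine(db, min_support, constraint=lambda pattern: True):
    """Return {pattern: support} for frequent sequential patterns.

    db          : list of sequences, each a list of hashable items
    min_support : minimum number of sequences that must contain a pattern
    constraint  : predicate on a pattern (tuple of items); assumed to be
                  anti-monotone, i.e. once it fails it fails for every
                  longer pattern grown from the same prefix
    """
    results = {}

    def grow(prefix, projected):
        # projected holds (sequence index, offset) pairs: the suffix of each
        # sequence that remains after matching the current prefix.
        supporters = defaultdict(set)
        for sid, start in projected:
            for item in db[sid][start:]:
                supporters[item].add(sid)

        for item in sorted(supporters):
            support = len(supporters[item])
            if support < min_support:
                continue
            pattern = prefix + (item,)
            if not constraint(pattern):
                continue  # prune: no extension of this prefix can qualify
            results[pattern] = support

            # Project each suffix past the first occurrence of the item.
            next_projected = []
            for sid, start in projected:
                try:
                    pos = db[sid].index(item, start)
                except ValueError:
                    continue
                next_projected.append((sid, pos + 1))
            grow(pattern, next_projected)

    grow((), [(sid, 0) for sid in range(len(db))])
    return results


# Toy database of three sequences (assumed data, for illustration only).
db = [list("abcb"), list("acb"), list("bca")]
unconstrained = mine(db, min_support=2)
at_most_two = mine(db, min_support=2, constraint=lambda p: len(p) <= 2)
print(len(unconstrained), len(at_most_two))  # prints "8 7"
```

With the length constraint in place, the pattern `('a', 'c', 'b')` is never generated, so the projected database for that prefix is never scanned further. Constraints that are not anti-monotone cannot be pruned this simply, which is the kind of case the lowbound-checking and split-checking methods mentioned in the abstract are meant to address.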
Chapter 1 Introduction
1.1 Motivation
1.2 Related Works on Constraint-based Sequential Pattern Mining
1.3 Purposes of This Study
1.4 Overview of Our Study
Chapter 2 Related Studies
2.1 Description of the Problem
2.2 Classification of Constraints
2.2.1 Classification of Constraints on Applications
2.2.2 Classification of Constraints From a Classical Framework
2.2.3 Prefix-Monotone Property
2.3 GSP Algorithm
2.4 Prefix-Growth Algorithm
2.4.1 Related Definition and Prefix-projected Mining Algorithm
2.4.2 Bi-Level Projection Method
2.4.3 Prefix-Growth Algorithm
Chapter 3 Iprefix-growth Algorithm
3.1 Iprefix-growth Algorithm
Chapter 4 The Experiments
Chapter 5 Conclusions and Future Works
5.1 Conclusions
5.2 Future Works
Bibliography
[1] R. Agrawal and R. Srikant, "Fast Algorithms for Mining Association Rules in Large Databases," Proc. of the 20th Int'l Conference on Very Large Databases, pp. 487-499, Santiago, Chile, Sep. 1994.
[2] R. Agrawal, C. Faloutsos, and A. Swami, "Efficient Similarity Search in Sequence Databases," Proc. of the 4th Int'l Conference on Foundations of Data Organization and Algorithms, Chicago, Oct. 1993, Also in Lecture Notes in Computer Science 730, Springer Verlag, 1993, 69-84.
[3] R. Agrawal, K. Lin, H. S. Sawhney, and K. Shim, "Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases," Proc. of the 21st Int'l Conference on Very Large Databases, Zurich, Switzerland, September 1995.
[4] R. Agrawal, R. Srikant, "Mining Sequential Patterns," Proc. of the Int'l Conference on Data Engineering (ICDE), Taipei, Taiwan, March 1995. Expanded version available as IBM Research Report RJ9910, October 1994.
[5] G. Berger, A. Tuzhilin, "Discovering Unexpected Patterns in Temporal Data Using Temporal Logic," Temporal Databases: Research and Practice, Lecture Notes in Computer Science 1399, pp. 281-309, 1998.
[6] C. Bettini, X. S. Wang, S. Jajodia, and J.-L. Lin, "Discovering Frequent Event Patterns with Multiple Granularities in Time Sequences," IEEE Transactions on Knowledge and Data Engineering, Vol. 10, No. 2, March-April 1998, pp. 222-237.
[7] G. Grahne, L. Lakshmanan, and X. Wang, "Efficient Mining of Constrained Correlated Sets," Proc. 2000 Int. Conf. Data Engineering (ICDE'00), San Diego, CA, February 2000.
[8] M. Garofalakis, R. Rastogi, and K. Shim, "SPIRIT: Sequential Pattern Mining with Regular Expression Constraints," Proc. 1999 Int. Conference on Very Large Databases (VLDB'99), pp. 223-234, Edinburgh, UK, Sep. 1999.
[9] J. Han, G. Dong, and Y. Yin, "Efficient Mining of Partial Periodic Patterns in Time Series Database," Proc. ICDE, pp. 106-115, 1999.
[10] J. Han, L. Lakshmanan, and R. Ng, "Constraint-based Multidimensional Data Mining," IEEE Computer, Vol. 32, No. 8, 1999.
[11] J. Han, J. Pei, "Pattern Growth Methods for Sequential Pattern Mining: Principles and Extensions," (invited paper), Proc. ACM SIGKDD 2001 Workshop on Temporal Data Mining, San Francisco, California, USA, August 26, 2001.
[12] J. Han, J. Pei, B. Mortazavi-Asl, Q. Chen, U. Dayal, and M.-C. Hsu, "FreeSpan: Frequent Pattern-projected Sequential Pattern Mining," KDD'00.
[13] J. Han, J. Pei, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu, "PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth," Proc. of 2001 Int'l Conference on Data Engineering (ICDE'01), Heidelberg, Germany, April 2001.
[14] J. Han, J. Pei, and Y. Yin, "Mining Frequent Patterns without Candidate Generation," Proc. 2000 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'00), Dallas, TX, May 2000, pp. 1-12.
[15] T. Imielinski, H. Mannila, "A Database Perspective on Knowledge Discovery," Communications of the ACM, Vol. 39, No.11, 1996.
[16] L. V. S. Lakshmanan, R. Ng, J. Han, and A. Pang, "Optimization of Constrained Frequent Set Queries with 2-variable Constraints," Proc. 1999 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'99), pp. 157-168, Philadelphia, PA, June 1999.
[17] S. Li, H. Shen, and L. Cheng, "New Algorithms for Efficient Mining of Association Rules," Information Sciences, Vol. 118, No. 1-4, Sep. 1999, pp. 251-268.
[18] H. Mannila, H. Toivonen, and A. I. Verkamo, "Discovering Frequent Episodes in Sequences," Proc. of the Int. Conf. on Knowledge Discovery in Databases and Data Mining (KDD-95), Montreal, Canada, August 1995.
[19] R. T. Ng, L. V. S. Lakshmanan, J. Han, and A. Pang, "Exploratory Mining and Pruning Optimizations of Constrained Association Rules," Proc. of the 1998 ACM SIGMOD Int'l Conf. on Management of Data, June 1998.
[20] J. Pei, J. Han, "Constrained Frequent Pattern Mining: A Pattern-Growth View," ACM SIGKDD Explorations (Special Issue on Constraints in Data Mining), June 2002.
[21] J. Pei, J. Han, and Wei Wang, "Mining Sequential Patterns With Constraints in Large Databases," CIKM 2002: 18-25.
[22] R. Srikant, R. Agrawal, "Mining Sequential Patterns: Generalization and Performance Improvements," Proc. 5th Int. Conf. Extending Database Technology (EDBT'96), pp. 3-17, Avignon, France, Mar. 1996.
[23] A. Savasere, E. Omiecinski, and S. Navathe, "An Efficient Algorithm for Mining Association Rules in Large Databases," Proc. Int'l Conf. Very Large Data Bases, Zurich, Switzerland, Sep. 1995, pp. 432-444.
[24] R. Srikant, Q. Vu, and R. Agrawal, "Mining Association Rules with Item Constraints," Proc. 3rd Int. Conf. on Knowledge Discovery and Data Mining, August 1997.
[25] K. Wang, Y. He, and J. Han, "Pushing Support Constraints Into Association Rules Mining," IEEE Transactions on Knowledge and Data Engineering, Vol. 15, No. 3, May/June 2003.
[26] K. Wang, Y. Jiang, J. X. Yu, G. Dong, and J. Han, "Pushing Aggregate Constraints by Divide-and-Approximate," The IEEE International Conference on Data Engineering, 2003, Bangalore, India.
[27] Show-Jane Yen, and Arbee L.P. Chen, "An Efficient Approach to Discovering Knowledge from Large Databases," In PDIS, pages 8-18, 1996.
[28] X. Yan, J. Han, "CloseGraph: Mining Closed Frequent Graph Patterns," Proc. 2003 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'03), Washington, D.C., Aug. 2003.
[29] J. Yang, W. Wang, and P. Yu, "Mining Asynchronous Periodic Patterns in Time Series Data," Proc. ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (SIGKDD), pp.275-279, 2000.
[30] J. Yang, W. Wang, P. Yu, and Jiawei Han, "Mining Long Sequential Patterns in A Noisy Environment," to appear in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD), 2002.
[31] J. Yang, W. Wang, and P. Yu, "InfoMiner: Mining Surprising Periodic Patterns," IBM Research Report, 2001.
[32] J. Yang, W. Wang, and P. Yu, "Meta-Patterns: Revealing Hidden Periodic Patterns," IBM Research Report, 2001.
[33] J. Yang, W. Wang, and P. Yu, "Mining Patterns in Long Sequential Data with Noise," IBM Research Report, 2001.
[34] Quest Synthetic Data Generation Code http://www.almaden.ibm.com/cs/quest/syndata.html#instructions