臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)

Detailed Record

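Graduate student: 許仁豪
Graduate student (romanized): Jen-Hao Hsu
Thesis title: 部分片段效益挖掘之研究
Thesis title (English): A Study of Partial Periodic Utility Mining
Advisor: 洪宗貝
Advisor (romanized): Tzung-Pei Hong
Degree: Master's
Institution: National Sun Yat-sen University (國立中山大學)
Department: Department of Computer Science and Engineering (資訊工程學系研究所)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2017
Academic year of graduation: 105 (ROC calendar)
Language: English
Number of pages: 88
Keywords: data mining; high utility; partial periodic pattern; projection; utility upper bound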
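Abstract (Chinese, translated): Most existing studies on partial periodic pattern mining consider only how often a pattern occurs in the periodic segment data when determining its importance, and assume that every event has the same utility. As a result, traditional partial periodic pattern mining methods have difficulty discovering event items that have high utility but occur infrequently. In this thesis, we extend the original problem to high-utility partial periodic pattern mining, which considers not only the temporal order of events and the period length but also their quantities and profits. We design a periodic utility function and, based on it, propose three algorithms for mining high-utility partial periodic patterns. The first is based on a two-phase periodic utility upper-bound model, which avoids information loss during mining, and it can serve as a baseline for experimental comparison. The second further improves efficiency by gradually tightening the utility upper bounds. The third adopts a projection technique to avoid unnecessary checks and reduce execution time. Finally, the three algorithms are compared experimentally under various parameter settings, and the results show that the projection method performs best of the three.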
Abstract (English): Existing studies on partial periodic pattern mining consider only the frequency of patterns in periodic segment data to determine their significance, and assume the same utility for all events. As a result, events with high utility but low frequency may not be found by traditional partial periodic pattern mining techniques. In this thesis, we extend the original problem to mining high-utility partial periodic patterns (HUPPP), which considers not only the temporal order and period length of events but also their quantities and individual profits. We design a periodic utility function and, based on it, propose three algorithms for finding high-utility partial periodic patterns. The first is a basic algorithm that uses the two-phase periodic utility upper-bound (PUUB) model to avoid information loss in the mining process; it serves as the baseline for experimental comparison. The second further improves efficiency by using gradual pruning to shrink the utility upper bounds. The third adopts a projection technique to avoid unnecessary checking and reduce execution time. Finally, experiments compare the performance of the three proposed algorithms under various parameter settings. The results show that the projection approach performs best among them.
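The two-phase upper-bound idea mentioned in the abstract can be illustrated with a small sketch. This record does not include the thesis's formal definitions, so everything below is an assumption made for illustration: the sequence is cut into fixed-length period segments, a pattern is a tuple with `None` as a wildcard, a pattern's utility is quantity times unit profit summed over its fixed positions in every matching segment, and the upper bound is the total utility of all matching segments (by analogy with transaction-weighted utility in two-phase itemset mining). The names `split_segments`, `pattern_utility`, `upper_bound`, and `two_phase` are hypothetical and are not taken from the thesis.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Event = Tuple[str, int]              # (event name, quantity) at one time point
Pattern = Tuple[Optional[str], ...]  # None marks a wildcard position


def split_segments(sequence: Sequence[Event], period: int) -> List[Sequence[Event]]:
    """Cut the sequence into consecutive complete segments of length `period`."""
    count = len(sequence) // period
    return [sequence[i * period:(i + 1) * period] for i in range(count)]


def matches(segment: Sequence[Event], pattern: Pattern) -> bool:
    """A segment matches if every non-wildcard position holds the required event."""
    return all(want is None or want == event
               for (event, _), want in zip(segment, pattern))


def segment_utility(segment: Sequence[Event], profit: Dict[str, int]) -> int:
    """Utility of a whole segment: sum of quantity * unit profit over all positions."""
    return sum(qty * profit[event] for event, qty in segment)


def pattern_utility(segments, pattern: Pattern, profit: Dict[str, int]) -> int:
    """Exact utility (assumed definition): fixed positions only, matching segments only."""
    total = 0
    for segment in segments:
        if matches(segment, pattern):
            total += sum(qty * profit[event]
                         for (event, qty), want in zip(segment, pattern)
                         if want is not None)
    return total


def upper_bound(segments, pattern: Pattern, profit: Dict[str, int]) -> int:
    """Over-estimate (assumed definition): full utility of every matching segment."""
    return sum(segment_utility(s, profit) for s in segments if matches(s, pattern))


def two_phase(candidates, segments, profit, min_utility):
    """Phase 1 prunes by the upper bound; phase 2 checks the exact utility."""
    survivors = [p for p in candidates
                 if upper_bound(segments, p, profit) >= min_utility]
    scored = [(p, pattern_utility(segments, p, profit)) for p in survivors]
    return [(p, u) for p, u in scored if u >= min_utility]


# Toy data, invented for this example only.
PROFIT = {"a": 3, "b": 1, "c": 5}
SEQUENCE = [("a", 1), ("b", 2), ("c", 1),
            ("a", 2), ("c", 1), ("b", 1),
            ("a", 1), ("b", 1), ("c", 2)]
SEGMENTS = split_segments(SEQUENCE, period=3)
CANDIDATES = [("a", None, "c"), ("a", None, None), (None, "b", None)]
RESULT = two_phase(CANDIDATES, SEGMENTS, PROFIT, min_utility=20)
print(RESULT)  # [(('a', None, 'c'), 21)]
```

In this toy run all three candidates pass phase 1 (upper bounds 24, 36, and 24), but only `("a", None, "c")` reaches the threshold in phase 2. Because the bound never underestimates the exact utility, pruning on it cannot discard a truly high-utility pattern, which is one plausible reading of "avoid information loss" in the abstract.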
Thesis Approval Form
Acknowledgments
Abstract (Chinese)
Abstract
Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Background
1.2 Contribution
1.3 Thesis Organization
Chapter 2 Related Works
2.1 Sequential pattern mining
2.2 Utility pattern mining
2.3 Periodic pattern mining
Chapter 3 The Proposed Algorithm
3.1 Definition
3.2 The High Utility Periodic Pattern Mining Algorithm (HUPPP)
3.2.1 HUPPP
3.2.2 An Example of HUPPP
3.3 The High Utility Periodic Pattern Mining with Gradually Pruning Algorithm (GPA)
3.3.1 GPA
3.3.2 An Example of the GPA
3.4 The High Utility Periodic Pattern Mining with Projected Database Algorithm
3.4.1 Projected Database Algorithm
3.4.2 An Example of the Projected Database Algorithm
Chapter 4 Experiments
4.1 Experimental Environment
4.2 Experimental Evaluation
Chapter 5 Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
References