National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

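Graduate student:

Chang, Chi-Yuan (張啟原)

Thesis title:

Mining High Utility Sequential Patterns with Duration Constraints

Thesis title (Chinese):

期間限制探勘於高效用序列樣式

Advisor:

Hu, Ya-Han (胡雅涵)

Oral defense committee:

Hsu, Wei-Yen (許巍嚴); Weng, Cheng-Hsiung (翁政雄)

Oral defense date:

2014-10-03

Degree:

Master's

Institution:

National Chung Cheng University (國立中正大學)

Department:

Department and Graduate Institute of Information Management (資訊管理學系暨研究所)

Discipline:

Computer Science (電算機學門)

Field category:

General Computer Science (電算機一般學類)

Thesis type:

Academic thesis

Year of publication:

2014

Graduation academic year:

103 (ROC calendar)

Language:

English

Number of pages:

Keywords (Chinese):

效用序列樣式探勘、Prefixspan演算法、時間限制探勘

Keywords:

utility sequential pattern mining, PrefixSpan, time constraint-based mining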

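High utility sequential pattern mining is an important application in the field of data mining and is widely used in purchase behavior analysis. Through high utility sequential pattern mining, businesses can examine customers' purchasing habits, learn how products relate to one another and which high-priced products most customers buy, and set their sales strategy accordingly.
In previous studies, however, the patterns produced by high utility sequential pattern mining include patterns in which the customer's purchases stretch over a very long time, and such patterns say little about purchase relationships. To find more meaningful patterns, we filter the patterns being sought by time conditions.
This thesis adds a duration constraint and gap constraints. For the duration constraint, it proposes a maximum span length, so that the discovered patterns occur within a specific period of time. For the gap constraints, it proposes a maximum gap (maxgap) and a minimum gap (mingap). The study proposes the HUD algorithm, which integrates these time constraints and adapts the PrefixSpan algorithm to mine utility sequential patterns. The experiments measure runtime, number of patterns, average value per pattern, precision, recall, and F-measure to compare our method with the traditional method.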
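
As a rough illustration of the three time constraints named above, the sketch below checks one occurrence of a pattern against a maximum span length, a mingap, and a maxgap. It is not the HUD algorithm from the thesis. The data layout (a list of `(timestamp, utility)` pairs), the function names, and the choice to treat all bounds as inclusive are assumptions made only for this example.

```python
from typing import List, Tuple

# Hypothetical representation: one (timestamp, utility) pair per matched
# event of a pattern occurrence, already sorted by timestamp.
Occurrence = List[Tuple[int, float]]


def satisfies_time_constraints(
    occurrence: Occurrence, mingap: int, maxgap: int, max_span: int
) -> bool:
    """Return True if the occurrence respects the span and gap limits.

    Assumption: all three bounds are inclusive.
    """
    if not occurrence:
        return False
    times = [t for t, _ in occurrence]
    # Duration constraint: first to last event must fit in max_span.
    if times[-1] - times[0] > max_span:
        return False
    # Gap constraints: every pair of adjacent events must be separated
    # by at least mingap and at most maxgap time units.
    for earlier, later in zip(times, times[1:]):
        gap = later - earlier
        if gap < mingap or gap > maxgap:
            return False
    return True


def occurrence_utility(occurrence: Occurrence) -> float:
    """Sum the utilities of the matched events (illustrative definition)."""
    return sum(u for _, u in occurrence)
```

In a pattern-growth miner such as PrefixSpan, a check of this kind would typically be applied while extending a prefix, so that occurrences violating a constraint are discarded early rather than filtered afterwards.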

Utility sequential pattern mining (utility SPM) is one of the most important data mining techniques and is widely used to analyze customer behavior. Through utility SPM, organizations can explore customers' purchasing habits and understand the relationships among products, including which high-priced products most customers buy, in order to develop sales policy.
However, in conventional utility SPM, if the average length of the sequences in the database is long, the algorithm often generates too many long sequential patterns, which reveal little about purchase relationships. To find more meaningful patterns, we add time constraints to the mining process.
In this thesis, we incorporate duration constraints into utility SPM. Specifically, we propose a maximum span length constraint, which is intended to find patterns that occur within a specific period of time. In addition, maxgap and mingap constraints confine the time interval between adjacent events to a reasonable range. We propose a new algorithm, High Utility sequential pattern mining with Duration constraints (HUD), which mines high utility sequential patterns under the integrated constraints. In the experiments, we measure runtime, number of patterns, value per pattern, precision, recall, and F-measure to compare the performance of our method with that of traditional utility SPM.
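
The abstract lists precision, recall, and F-measure among the evaluation metrics but does not say which pattern set serves as the reference. The sketch below shows the standard definitions over two sets of patterns, assuming a hypothetical `relevant` set is available. The function name and the string encoding of patterns are illustrative only.

```python
from typing import Set, Tuple


def precision_recall_f1(mined: Set[str], relevant: Set[str]) -> Tuple[float, float, float]:
    """Compute precision, recall, and the balanced F-measure (F1).

    `mined` is the set of patterns an algorithm returned; `relevant` is an
    assumed reference set of patterns considered correct.
    """
    true_positives = len(mined & relevant)
    precision = true_positives / len(mined) if mined else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

With four mined patterns of which two appear in a reference set of three, this gives a precision of 1/2, a recall of 2/3, and an F-measure of 4/7.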

1. Introduction
1.1 Background
1.2 Motivation
1.3 Organization
2. Related Work
2.1 Utility Pattern Mining (UPM)
2.2 Utility SPM
2.3 Constraint-based SPM
3. Problem Definition
3.1 Problem Definition
4. The HUD-PrefixSpan Algorithm
4.1 Find 1-length-HUD-SPs
4.2 Divide and search
4.3 Find subsets of sequential patterns
5. Experiment Design
5.1 Real-life Datasets
5.2 Experiment Evaluation
6. Conclusion and future works
References
