National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

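Graduate student: 王敏姿
Graduate student (English): MinTzu Wang
Thesis title: 以交易資料與產品分類樹進行市場區隔之研究 (A Study of Market Segmentation Using Transaction Data and Product Classification Trees)
Thesis title (English): Segmenting Customers with Quantitative Transactions Annotated with Unbalanced Hierarchies
Advisor: 許秉瑜
Advisor (English): Ping-Yu Hsu
Degree: Doctoral
Institution: National Central University (國立中央大學)
Department: Department of Business Administration
Discipline: Business and Management
Field: Business Administration
Thesis type: Academic thesis
Year of publication: 2011
Academic year of graduation: 99 (ROC calendar)
Language: English
Number of pages: 88
Chinese keywords: purchase quantity; data mining; cluster analysis; similarity measure; hierarchical structure
English keywords: data mining; quantitative clustering; similarity (distance) measure; concept hierarchy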
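Market segmentation divides a market into groups according to the products or services that customers need, so that consumers within each group share similar needs or purchasing behavior. Designing a dedicated marketing mix for each segment lets an enterprise use its marketing resources more effectively. A suitable marketing strategy can be applied to customers in every domain to increase sales volume. The process consists of four activities: identifying the customers in a segment, understanding the profile and characteristics of those customers, evaluating the customers of the segment, and setting target-market strategy and resource allocation. Segmentation usually relies on clustering techniques so that consumers within each group have similar purchasing behavior, needs, or preferences. Such information supports selective-marketing decisions such as catalog design, cross-selling, and shelf-space arrangement, which may increase sales. Increasing sales is not only a matter of cutting prices: offering bundles of popular items in certain quantities, matched to customer needs and preferences, is also a promotion approach that customers appreciate highly. Packaging similar items in different quantity combinations is an important marketing strategy. For example, a shopper may choose between a bundle of 2 bottles of brand A liquor with 2 packs of brand B cigarettes and a bundle of 2 bottles of brand C liquor with 2 packs of brand D cigarettes, which meets consumers' diverse needs. Past promotional bundles, however, usually came from the designer's preferences or ideas. The main topic of this study is how to derive, automatically and efficiently from large volumes of daily transactions, the bundles of similar items and quantities that customers favor.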
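Analyzing transaction data sets captured from retail chain stores is a major challenge because the data are usually sparse: the transactions cover a large number of products while each transaction contains very few items, so each item appears in only a small fraction of the transactions. Market basket analysis of such data usually yields extremely low item support.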
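The low-support effect described above can be illustrated with a minimal sketch. The transactions, item names, and the helper `item_support` are hypothetical toy values chosen for illustration; they are not taken from the thesis data.

```python
from collections import Counter

def item_support(transactions):
    """Return {item: fraction of transactions that contain the item}."""
    counts = Counter()
    for basket in transactions:
        counts.update(set(basket))  # count each item once per transaction
    n = len(transactions)
    return {item: c / n for item, c in counts.items()}

# Hypothetical toy data: each transaction maps item -> purchased quantity.
# Six transactions touch eight distinct products, so every item is rare.
transactions = [
    {"wine_A": 2, "cigarette_B": 2},
    {"wine_C": 2, "cigarette_D": 2},
    {"beer_E": 6},
    {"wine_A": 1, "snack_F": 3},
    {"juice_G": 1},
    {"cigarette_B": 1, "tea_H": 2},
]

support = item_support(transactions)
for item, s in sorted(support.items()):
    print(f"{item}: {s:.2f}")
```

Even in this tiny example no item appears in more than a third of the transactions; with tens of thousands of products the per-item support would be far lower.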
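Market basket analysis, an association technique, mostly works at the item level to identify which products customers buy together. Because it ignores the hierarchical relationships among items, similarity problems sometimes arise. Attempts to raise items to the category level increase support and produce association rules, but they also ignore purchase quantities and lose decision-relevant information about more detailed item associations. Clustering techniques, by contrast, use the similarity among members to find the differences between clusters and the similarity within clusters.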
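Most studies, however, focus on clustering techniques and similarity measures and neglect both the hierarchical structure among items and the purchase quantities of items. In practice, for example, the similarity of products or items carries a different meaning when seen from the end consumer's viewpoint, and among the numerous product categories in retailing a balanced hierarchy is almost impossible.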
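This study therefore proposes a framework for measuring the similarity of items in an unbalanced hierarchy. It starts from the consumer's viewpoint, stays close to human intuition, and takes item purchase quantities into account. Clustering on this basis makes the clusters more concretely meaningful and provides useful information to decision makers.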
Market segmentation is an important marketing process in which enterprises identify and group customers according to the products or services they need, so that suitable market stimuli can be applied to each segment to increase sales volume. The process comprises four activities: segment identification, segment characterization, segment evaluation, and target segment evaluation. Segment identification is usually performed with clustering, which assigns customers with similar transactions to the same segment. Such information can increase sales by helping retailers market selectively, and it supports business decisions such as catalog design, cross-marketing, and shelf-space arrangement. Rather than simply cutting prices, packaging popular items in certain quantities according to customers' needs and preferences is a highly appreciated way to increase sales. Which popular items to bundle, and in what quantities, should emerge from customers' buying behavior rather than from the designer's perspective.
Segment identification from transaction data is difficult because a typical retailer carries tens of thousands of goods, whereas a transaction typically contains fewer than a hundred items. Besides the goods purchased, the quantities purchased also play an important role in distinguishing customers. To reduce the low cardinality and high intra-cluster distance that arise when clustering transactions, most market basket analyses either raise items to a higher concept level in a product hierarchy or aggregate transactions to the customer level to alleviate the sparsity of the transactions. However, given the enormous number of retail products, a balanced hierarchical structure is almost impossible. In practice, similarity in the hierarchy as seen by consumers (bottom-up) also differs considerably from similarity as seen by designers (top-down). Aggregating transactions to a higher concept level may also lose detailed information.
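The roll-up approach mentioned above, and the trouble an unbalanced hierarchy causes for it, can be sketched as follows. The product tree `PARENT`, the item names, and the helpers `depth`, `roll_up`, and `rolled_support` are hypothetical and serve only as an illustration; the thesis's own hierarchy and data are not reproduced here.

```python
# Hypothetical unbalanced product tree, stored as child -> parent.
# The root is "ALL"; wine leaves sit two levels deeper than the beer leaf.
PARENT = {
    "wine_A": "red_wine", "wine_C": "red_wine",
    "red_wine": "wine", "wine": "alcohol",
    "beer_E": "alcohol",
    "cigarette_B": "tobacco", "cigarette_D": "tobacco",
    "alcohol": "ALL", "tobacco": "ALL",
}

def depth(node):
    """Number of edges between a node and the root."""
    d = 0
    while node != "ALL":
        node = PARENT[node]
        d += 1
    return d

def roll_up(item, levels=1):
    """Replace an item by its ancestor `levels` steps up, stopping at the root."""
    for _ in range(levels):
        if item == "ALL":
            break
        item = PARENT[item]
    return item

def rolled_support(transactions, levels):
    """Support of each concept after rolling every item up by `levels`."""
    n = len(transactions)
    counts = {}
    for basket in transactions:
        for concept in {roll_up(item, levels) for item in basket}:
            counts[concept] = counts.get(concept, 0) + 1
    return {concept: k / n for concept, k in counts.items()}

transactions = [
    {"wine_A": 2, "cigarette_B": 2},
    {"wine_C": 2, "cigarette_D": 2},
    {"beer_E": 6},
    {"wine_A": 1},
]

leaf_level = rolled_support(transactions, 0)
one_level_up = rolled_support(transactions, 1)
print(leaf_level)
print(one_level_up)
```

Rolling up by one level raises support (for instance, the two wine items merge into `red_wine`), but the brand and quantity detail disappears. Because the tree is unbalanced, a fixed number of steps also mixes concept levels: `beer_E` already reaches `alcohol` while the wines only reach `red_wine`.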
Few previous studies have combined quantity and similarity from a bottom-up perspective on an unbalanced tree to cluster transactions. This study presents an algorithm for mining quantitatively similar clusters through an improved clustering algorithm that tracks the top k clusters, each with its own intra-similarity quality. The algorithm is based on QSKM (Quantity Sensitive kth matched similar pair), a similarity measure that we derived for transactions with purchased quantities, using an unbalanced hierarchical structure viewed from the consumer's perspective. In our experiments on a real-life transaction database, QSKM outperformed traditional similarity measures in finding clusters of similar products and quantities purchased together. It also discovered up to 5 clusters with sufficient coverage in sparse data, which could not be discovered with traditional similarity measures or frequent patterns. In addition, the intra-cluster similarities were better than those obtained with GCS (General Cosine Similarity).
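To make the motivation concrete, the sketch below contrasts plain cosine similarity on quantity vectors with a simple hierarchy-aware variant. It is not the QSKM measure, whose definition is not given in this abstract, and it is not GCS either. The tree `PARENT`, the bottom-up item similarity `item_sim` (one over one plus the number of steps climbed to the lowest common ancestor), and the soft-cosine-style weighting are all assumptions made for illustration.

```python
import math

# Hypothetical unbalanced product tree, child -> parent; root is "ALL".
PARENT = {
    "wine_A": "red_wine", "wine_C": "red_wine",
    "red_wine": "alcohol", "beer_E": "alcohol",
    "cigarette_B": "tobacco", "cigarette_D": "tobacco",
    "alcohol": "ALL", "tobacco": "ALL",
}

def cosine(t1, t2):
    """Plain cosine similarity between two item -> quantity vectors."""
    dot = sum(q * t2.get(item, 0) for item, q in t1.items())
    n1 = math.sqrt(sum(q * q for q in t1.values()))
    n2 = math.sqrt(sum(q * q for q in t2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def ancestors(node):
    """Path from a node up to the root, inclusive."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path

def item_sim(a, b):
    """Assumed bottom-up item similarity: 1 / (1 + steps to the common ancestor)."""
    pa, pb = ancestors(a), ancestors(b)
    common = next(x for x in pa if x in pb)  # lowest common ancestor
    if common == "ALL":
        return 0.0  # items related only through the root count as unrelated
    steps = max(pa.index(common), pb.index(common))
    return 1.0 / (1 + steps)

def soft_dot(t1, t2):
    """Quantity-weighted sum of item similarities over all item pairs."""
    return sum(q1 * q2 * item_sim(a, b)
               for a, q1 in t1.items() for b, q2 in t2.items())

def hierarchy_cosine(t1, t2):
    """Soft-cosine-style similarity that uses the hierarchy and quantities."""
    denom = math.sqrt(soft_dot(t1, t1) * soft_dot(t2, t2))
    return soft_dot(t1, t2) / denom if denom else 0.0

t1 = {"wine_A": 2, "cigarette_B": 2}
t2 = {"wine_C": 2, "cigarette_D": 2}   # sibling items, same quantities
t3 = {"wine_C": 12, "cigarette_D": 1}  # sibling items, different quantities

print(cosine(t1, t2), hierarchy_cosine(t1, t2), hierarchy_cosine(t1, t3))
```

Plain cosine treats the two bundles as entirely unrelated because they share no item, while the hierarchy-aware variant recognizes sibling products and still scores the bundle with matching quantities higher than the one with very different quantities.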
Table of Contents
ABSTRACT (CHINESE)
ABSTRACT
ACKNOWLEDGEMENTS
CHAPTER 1 INTRODUCTION
1.1 FRAMEWORK OF THE PROPOSED MODEL
1.2 ORGANIZATION OF THE DISSERTATION
CHAPTER 2 LITERATURE REVIEW
2.1 MARKET SEGMENTATION AND TRANSACTIONS SPARSITY
2.2 QUANTITATIVE FREQUENT PATTERN MINING AND CONCEPT HIERARCHY
2.3 SIMILARITY AND CLUSTERING
CHAPTER 3 AN EXPLORATORY SIMILARITY MEASURE OF CLUSTERING TRANSACTIONS WITH AN UNBALANCED HIERARCHICAL PRODUCT STRUCTURE
3.1 RESEARCH PROBLEM
3.2 PROBLEM DEFINITION
3.2.1 Unbalanced Hierarchy
3.2.2 Computing Distances on Unbalanced Hierarchy
3.3 ALGORITHM OF COMPUTING TRANSACTION DISTANCE WITH AN UNBALANCED HIERARCHY
3.4 EXPERIMENTAL RESULTS
3.4.1 Data Description and Preparation
3.4.2 Comparisons of Skew Concept Hierarchy with Different Distance Measures
3.5 SUMMARY AND MANAGERIAL IMPLICATIONS
CHAPTER 4 SEGMENTING CUSTOMERS WITH QUANTITATIVE TRANSACTIONS ANNOTATED WITH UNBALANCED HIERARCHIES
4.1 RESEARCH PROBLEM
4.2 PROBLEM DEFINITION
4.3 METHODOLOGY
4.3.1 Algorithm of QSKM (Quantity Sensitive kth matched similar pair) Similarity
4.3.2 Algorithm of Top k Clustering
4.4 EXPERIMENTAL RESULTS
4.4.1 Data Description and Preparation
4.4.2 Model Observation and Comparisons of Different Distance Measures with Purchased Quantity
4.5 SUMMARY AND MANAGERIAL IMPLICATIONS
4.5.1 Specific Cluster Observation
4.5.2 Summary
CHAPTER 5 CONCLUSIONS AND FUTURE WORKS
REFERENCES