National Digital Library of Theses and Dissertations in Taiwan

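Graduate student: 翁瑞鋒
Graduate student (English name): Jui-Feng Weng
Thesis title: 網頁瀏覽者行為之泛化分群分析
Thesis title (English): Generalized Clustering for Web User's Behavior Mining
Advisor: 曾憲雄
Advisor (English name): Shian-Shyong Tseng
Degree: Master's
Institution: National Chiao Tung University (國立交通大學)
Department: Department of Computer and Information Science (資訊科學系)
Discipline: Engineering
Sub-discipline: Electrical Engineering and Computer Science (電資工程學類)
Thesis type: Academic thesis
Year of publication: 2002
Academic year of graduation: 90 (ROC calendar; 2001-2002)
Language: English
Number of pages: 60
Keywords (Chinese): 分群法; 概念式階層架構; 資料探勘; 泛化特徵點; 網路行為探勘
Keywords (English): cluster analysis; concept hierarchy; data mining; feature generalization; web usage mining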
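Web usage mining applies data mining techniques to the analysis of web users' behavior. Cluster analysis gathers similar user behaviors into the same cluster so that the characteristics of that behavior can be analyzed. However, because today's websites often contain thousands or even tens of thousands of pages, cluster analysis becomes inefficient. To solve this problem, we propose a generalized clustering system that incorporates a concept hierarchy of the web pages and uses a hierarchical feature selection technique. With this system, the analyst can select behavior features at different levels of generalization for cluster analysis. To implement hierarchical feature selection efficiently, we propose a hierarchy-embedded indexing technique that compiles the concept hierarchy relations of the web pages into the index codes, so that the hierarchical relations among generalized features can be determined from the codes alone. Finally, our experiments show that generalized clustering with hierarchical feature selection helps the analyst obtain more meaningful groups of user behavior.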
Web Usage Mining is the process of applying data mining techniques to web data in order to discover the access patterns of web users. Cluster analysis in Web Usage Mining groups users' behaviors into clusters that share common characteristics. However, the huge number of web pages on a site can make clustering inefficient. To solve this problem, a Generalized Clustering System with a Hierarchical Feature Selection Technique, based on a given concept hierarchy of the web pages, is proposed. In the system, the analyst can select the appropriate levels of generalized features on which to apply cluster analysis. A Hierarchy Embedded Indexing Technique is proposed to support the Hierarchical Feature Selection mechanism efficiently. It encodes the concept hierarchy relations into the index codes. Our experiments show that using the Hierarchical Feature Selection Technique in the generalized clustering process helps the analyst obtain more meaningful characteristics of users' behavior groups.
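The abstract does not give the actual index encoding, so the following is only a minimal sketch of the general idea, assuming a prefix-style code in which each page's code is its path from the root of the concept hierarchy. The page paths, the codes, and the helper names `is_ancestor`, `generalize`, and `behavior_vector` are all hypothetical and are not taken from the thesis. Under that assumption, an ancestor relation becomes a prefix test on the codes, and selecting a generalization level amounts to truncating each code before counting visits.

```python
from collections import Counter

# Hypothetical concept hierarchy: each page maps to its path from the root.
# Assumption: the tuple itself serves as a prefix-style index code; the
# encoding actually used in the thesis is not described in this record.
PAGE_CODES = {
    "/products/phones/a.html": (1, 1, 1),
    "/products/phones/b.html": (1, 1, 2),
    "/products/laptops/c.html": (1, 2, 1),
    "/support/faq.html": (2, 1, 1),
}


def is_ancestor(code_a, code_b):
    """Return True if code_a is a proper prefix of code_b."""
    return len(code_a) < len(code_b) and code_b[:len(code_a)] == code_a


def generalize(code, level):
    """Truncate an index code to the given hierarchy level (1 = top level)."""
    return code[:level]


def behavior_vector(transaction, level):
    """Count visits per generalized feature at the chosen level."""
    return Counter(generalize(PAGE_CODES[page], level) for page in transaction)


# One hypothetical user transaction (a sequence of visited pages).
session = [
    "/products/phones/a.html",
    "/products/phones/b.html",
    "/support/faq.html",
]

coarse = behavior_vector(session, 1)  # features at the top level
finer = behavior_vector(session, 2)   # features one level deeper
```

At level 1 the two phone pages collapse into a single "products" feature, so the vector has two dimensions instead of three. This kind of dimensionality reduction is what the abstract attributes to choosing a higher generalization level. The resulting vectors could then be passed to any clustering algorithm.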
Abstract (In Chinese)
Abstract
Acknowledgement (In Chinese)
Contents
List of Figures
List of Algorithms
List of Tables
Chapter 1 Introduction
Chapter 2 Related Work
2.1 The Cluster Analysis Algorithm
2.2 Web Usage Mining
Chapter 3 Architecture of Generalized Clustering System
3.1 Hierarchical Feature Selection with Web Pages Concept Hierarchy
3.2 Preprocessing Phase
3.2.1 Web Pages Concept Hierarchy Construction
3.2.2 Concept Hierarchy Indexing Process
3.2.3 Transaction Identification
3.2.4 User's Behavior Vector Set Construction
3.3 Generalized Clustering Phase
3.3.1 Hierarchical Feature Selection Operations
3.3.2 The Generalized Clustering Analysis
3.4 Clustering Analyzing Phase
Chapter 4 Modeling the Generalized Clustering System
4.1 Hierarchy Embedded Indexing for Concept Hierarchy
4.2 User's Behavior Vector Set Construction
4.2.1 The User's Behavior Vector Set Construction Algorithm
4.2.2 Generalized Feature Vectors Construction
4.2.3 Complexity Analysis
4.3 Distance Measure of User's Behavior Vector
4.4 Operations for Hierarchical Feature Selection
4.5 Clustering Evaluation Functions
Chapter 5 Experiments
5.1 Experimental Environment
5.2 Example
5.3 Accuracy Analysis
Chapter 6 Concluding Remarks
Bibliography