National Digital Library of Theses and Dissertations in Taiwan

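Graduate student: 許書淵
Graduate student (English name): Shu-Yuan Hsu
Thesis title (Chinese): 利用FID3於網站登入資料分析
Thesis title (English): Web Log-File Data Mining using FID3
Advisor: 張志永
Advisor (English name): Jyh-Yeong Chang
Degree: Master's
Institution: National Chiao Tung University
Department: Department of Electrical and Control Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2002
Graduation academic year: 90 (ROC calendar)
Language: English
Number of pages: 57
Keywords (Chinese): 網站登入資料 (web log-file data)
Keywords (English): Data Mining; FID3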
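This thesis applies FID3 to build an analysis system for data mining of website records. We take the connection data (log-file) and the user data of a B2C e-commerce website as the database for data mining, and develop a Web log-file mining system concerned with the site's product content. The architecture of this system is divided into three steps. The first step, data preparation, converts the connection data and user data from text into a database and removes redundant, unnecessary records. The second step, the data engine, is the core of the data mining process: it establishes the links among the databases, runs the data mining algorithm, and, using the information supplied by the knowledge base, converts the results of its analysis into rules. The third step is data analysis, in which the rules produced by the system support two applications. First, they give e-commerce operators a basis for building and maintaining the website, which improves the practical benefit of e-commerce. Second, based on the background of a logged-in user, the system can propose the best path for that individual, which makes the user's browsing of the site more efficient.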
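The data preparation step described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes the log-file uses the Common Log Format, treats requests for static assets and non-200 responses as the "unnecessary" records, and uses SQLite as a stand-in database. The names `parse_line`, `load_log`, the `visits` table, and the list of skipped suffixes are all hypothetical.

```python
import re
import sqlite3

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

# Assumption: requests for static assets count as unnecessary records.
SKIPPED_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")


def parse_line(line):
    """Return a dict for a successful page request, or None to drop the line."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None  # malformed line
    record = match.groupdict()
    if record["status"] != "200":
        return None  # failed or redirected request
    if record["path"].lower().endswith(SKIPPED_SUFFIXES):
        return None  # image, stylesheet, or script
    return record


def load_log(lines, connection):
    """Insert the kept records into a 'visits' table and return their count."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS visits "
        "(host TEXT, user TEXT, time TEXT, path TEXT)"
    )
    rows = [
        (r["host"], r["user"], r["time"], r["path"])
        for r in map(parse_line, lines)
        if r is not None
    ]
    connection.executemany("INSERT INTO visits VALUES (?, ?, ?, ?)", rows)
    connection.commit()
    return len(rows)
```

Calling `load_log(open("access.log"), sqlite3.connect("site.db"))` would fill the table from a file; both file names are placeholders. Which records count as redundant depends on the site, so the filter rules here are only one plausible choice.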
The goal of the data mining process is knowledge discovery. This thesis applies Fuzzy ID3 to develop a web log-file mining system that analyzes the log-file and user profiles of a B2C website. The system consists of three components: data preparation, data engine, and data analysis. Data preparation converts the log-file from a text file into an ACCESS database and removes redundant entries from the log-file. The data engine, the kernel of the data mining process, combines the page metadata, the log-file, and the user profiles; the combined database then serves as the input patterns for fuzzy ID3. Data analysis, the third module, is the final procedure of the data mining process. Based on the decision tree constructed by fuzzy ID3, the system builds fuzzy “IF-THEN” rules and extracts information from them. These fuzzy rules support two applications: the first provides the webmaster with information about user behavior for maintaining and promoting the website; the second suggests a better browsing path to a user visiting the website.
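To make the fuzzy ID3 step more concrete, the sketch below shows how an attribute might be scored when growing a fuzzy decision tree. It follows the formulation commonly given for fuzzy ID3, in which class proportions are sums of membership grades and a child node's grades are the product of the parent's grades and the linguistic term's grades. The abstract does not state which variant the thesis uses, so this is an assumption, and the function names and example terms are hypothetical.

```python
import math


def fuzzy_entropy(memberships, labels):
    """Entropy of a fuzzy sample set; class weights are sums of grades."""
    total = sum(memberships)
    if total == 0:
        return 0.0
    class_mass = {}
    for grade, label in zip(memberships, labels):
        class_mass[label] = class_mass.get(label, 0.0) + grade
    return -sum(
        (mass / total) * math.log2(mass / total)
        for mass in class_mass.values()
        if mass > 0
    )


def fuzzy_information_gain(node_memberships, term_memberships, labels):
    """Gain from splitting a node on one attribute.

    term_memberships maps each linguistic term (e.g. "short", "long") to the
    per-sample membership grades for that term.
    """
    # A sample belongs to a child with grade (node grade) * (term grade).
    child_sets = {
        term: [n * g for n, g in zip(node_memberships, grades)]
        for term, grades in term_memberships.items()
    }
    total_mass = sum(sum(child) for child in child_sets.values())
    if total_mass == 0:
        return 0.0
    # Expected entropy after the split, weighted by each child's fuzzy size.
    expected = sum(
        (sum(child) / total_mass) * fuzzy_entropy(child, labels)
        for child in child_sets.values()
    )
    return fuzzy_entropy(node_memberships, labels) - expected
```

The attribute with the largest gain becomes the test at the node, and each root-to-leaf path then reads as one fuzzy rule, for example "IF visit time is long THEN the user buys" (an invented rule for illustration only).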
Chapter 1. Introduction
1.1. Research Background
1.2. Motivation
1.3. Thesis Outline
Chapter 2. Database Management
2.1. Description of Database
2.2. Database Design
2.3. Metadata
2.4. Data Mining
Chapter 3. Introduction of FID3
3.1. Introduction of ID3
3.2. Feature Ranking
3.3. Some Concept about Fuzzy Set and Linguistic Variable
3.3.1. Fuzzy Set Fundamental
3.3.2. Linguistic Variable
3.4. Fuzzy ID3 (FID3)
3.5. Fuzzy Inference
Chapter 4. Web Log-file Mining System
4.1. Web Log-file
4.2. The Web Log-file Mining System
4.2.1. Data preparation
4.2.2. Data Mining Algorithm
4.2.3. Data Analysis
Chapter 5. Simulation and Experiment
5.1. Database Design
5.2. Data Preparation
5.3. Data Engine
5.3.1. Consumer Engine
5.3.2. Metadata Engine
5.4. Data Mining
5.5. Simulation and Results
5.5.1. Statistic Data
5.5.2. Data Mining Results
5.5.3. Testing
5.5.4. The Results for Web Master
5.5.5. The Results for Users
5.5.6. The Results for Feature Ranking
5.6. Summary
Chapter 6. Conclusion
Reference