National Digital Library of Theses and Dissertations in Taiwan

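Graduate student: 蘇俊華
Graduate student (romanized): Jiun-Hua Su
Thesis title (Chinese): 運用智慧型代理人技術建立領域相關入口網站
Thesis title (English): Applying Intelligent Agent Techniques to Construct Domain Portals
Advisor: 林宣華
Advisor (romanized): Shian-Hua Lin
Degree: Master's
Institution: National Chi Nan University
Department: Department of Computer Science and Information Engineering
Discipline: Engineering
Sub-discipline: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2007
Academic year of graduation: 95 (ROC calendar)
Language: Chinese
Number of pages: 28
Keywords (translated from Chinese): Intelligent Agent; Support Vector Machine; Feature Selection; Hierarchical Classification; Document Classification; Concept Hierarchy; Focused Crawler
Keywords (English): Intelligent Agent; SVM; Feature Selection; Hierarchical Classification; Document Classification; Focused Crawler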
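With the rise of the Internet, the volume of digital data has grown enormously and users' needs have multiplied: for example, users want to search within a single domain or to be offered additional services. A domain-specific portal can meet these needs. We therefore propose a system that uses an intelligent agent mechanism to build domain-specific portals quickly. In this thesis, we use Focused Crawler techniques to fetch domain-related web pages, and we use our Concept Hierarchy Generator (CHG) to build a Concept Hierarchy automatically. The Concept Hierarchy replaces the Taxonomy in the Focused Crawler and so reduces manual effort. CHG uses entropy to analyze how a website's content is distributed and thereby determines the site type: Static Portal or Dynamic Portal. Depending on the site type, it analyzes hyperlink relationships top-down or bottom-up to build the concept hierarchy. We have successfully built concept hierarchies for a variety of websites and have laid out the architecture of the intelligent agent system in detail.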
Digital information on the Internet has been growing rapidly, and users' requirements have shifted from simple, general information search to complex domain knowledge extraction. General portals are therefore moving toward fine-grained domain portals so that information systems can serve more complex and intelligent requirements, and research on intelligent agents is becoming increasingly important. To build an intelligent agent system for any specified domain, we employ Focused Crawler techniques to retrieve domain-related information. We then propose a novel approach that constructs a Concept Hierarchy automatically, so that the manually built Taxonomy used by the Focused Crawler can be replaced by the output of our Concept Hierarchy Generator (CHG). CHG classifies Web information into static and dynamic content and extracts a website's concept hierarchy by analyzing the entropy distribution of its URLs. Following links from the website root, top-down and bottom-up hyperlink analysis are applied to extract the concept hierarchies of static and dynamic content, respectively. Experiments show that our system correctly extracts concept hierarchies from both types of websites.
Keywords: Intelligent Agent, SVM, Feature Selection, Hierarchical Classification, Document Classification, Focused Crawler.
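The abstract states that CHG analyzes the entropy distribution of a website's URLs, but this record does not give the exact formulation. The following is only a minimal sketch of one way such a measure could be computed: it groups URLs by their leading path segment and takes the Shannon entropy of the resulting counts. The function names, the `depth` parameter, the example URLs, and the reading that a low value hints at a dynamically generated site are all assumptions made for illustration, not the thesis's method.

```python
import math
from collections import Counter
from urllib.parse import urlparse


def shannon_entropy(counts):
    """Shannon entropy (in bits) of a list of non-negative counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    # p * log2(1/p) is non-negative for every bucket, so the sum is too.
    return sum((c / total) * math.log2(total / c) for c in counts)


def url_entropy(urls, depth=1):
    """Entropy of how URLs spread over their first `depth` path segments.

    Hypothetical helper: the grouping rule is an assumption for illustration.
    """
    buckets = Counter()
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        buckets["/".join(segments[:depth])] += 1
    return shannon_entropy(list(buckets.values()))


# Illustrative inputs only; example.org is a placeholder domain.
static_urls = [
    "http://example.org/news/a.html",
    "http://example.org/about/b.html",
    "http://example.org/products/c.html",
    "http://example.org/contact/d.html",
]
dynamic_urls = [
    "http://example.org/index.php?id=1",
    "http://example.org/index.php?id=2",
    "http://example.org/index.php?id=3",
    "http://example.org/index.php?id=4",
]

static_score = url_entropy(static_urls)
dynamic_score = url_entropy(dynamic_urls)
```

In this sketch the four static URLs fall into four equally sized directories, while the four dynamic URLs share one script path, so the first score is higher than the second. Any threshold separating the two site types would have to be chosen empirically.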
Chinese Abstract
Abstract
Table of Contents
List of Figures
1. Introduction
1.1. Research Motivation
1.2. Research Objectives
1.3. Research Methods
1.4. Thesis Organization
2. Literature Review
2.1. Intelligent Agents
2.1.1. Definition and Characteristics of Intelligent Agents
2.2. Entropy
2.3. Focused Crawler
2.4. Web Link Analysis
2.4.1. Hyperlink-Induced Topic Search (HITS)
2.4.2. PageRank
2.5. Document Classification
2.5.1. Preprocessing
2.5.2. Feature Selection
2.6. Classifiers
2.6.1. KNN (K-Nearest Neighbor)
2.6.2. Naive Bayes Classifier
2.6.3. Support Vector Machine (SVM)
3. Automatic Generation of Concept Hierarchies
4. System Architecture
4.1. Concept Hierarchy Generator
4.1.1. Hierarchy Construct Model
4.2. Focused Crawler
4.2.1. Hierarchical Classifier
4.2.2. Distiller
5. Conclusions and Future Work
5.1. Conclusions
5.2. Future Work
References
[1]Kleinberg, J. M., “Authoritative sources in a hyperlinked environment,” In Proceedings of the 9th ACM-SIAM Symposium on Discrete Algorithms, 1998.
[2]Brin, S. and Page, L., “The Anatomy of a Large-Scale Hypertextual Web Search Engine,” In Proceedings of the 7th International World Wide Web Conference, 1998.
[3]Quinlan, J. R., “Induction of Decision Trees,” Machine Learning, 1(1):81-106, 1986.
[4]Pearl, J., “Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference,” Morgan-Kaufmann, 1988.
[5]Cortes, C. and Vapnik, V., “Support-Vector Networks,” Machine Learning, 20(3):273-297, September 1995.
[6]Salton, G. and McGill, M., “Introduction to Modern Information Retrieval,” McGraw Hill, 1983.
[7]Sergej S., Michael B., Jens G., Stefan S., Martin T., Gerard W. and Patrick Z., “The BINGO! System for Information Portal Generation and Expert Web Search,” CIDR Conference, 2003.
[8]Chakrabarti, S., Berg, M. V. D. and Dom, B., "Focused crawling: a new approach to topic-specific Web resource discovery," WWW Conference, 1999.
[9]Sycara, K., A. Pannu, M. Williamson and Zeng, D., “Distributed Intelligent Agents,” IEEE Expert, 1996.
[10]Wooldridge, M., and N. R. Jennings, “Intelligent Agents: Theory and Practice,” The Knowledge Engineering Review, vol. 10(2), pp. 115-152, 1995.
[11]Azzam, M.M., and Awad, A.M., “Entropy Measures and Some Distribution Approximations,” Microelectron. Reliab., Vol. 36, No. 10, pp. 1569-1580, 1996.
[12]Lin, S. H., Hsu, T. Y., Feng, K. J. and Yang, W. P., “Feature Selection Methods for Metadata Classification on Hierarchical Union Catalogs,” ICDAT, 2005.
[13]Kuan-Jen F., “An Efficient Hierarchical Metadata Classifier based on SVM and Feature Selection Methods”
[14]Chang, C. C. and Lin, C. J., “LIBSVM : a library for support vector machines,” 2001.
[15]Sergej S., Jens G. and Martin T., “From Focused Crawling to Expert Information: an Application Framework for Web Exploration and Portal Generation,” 29th VLDB Conference.
[16]Jon M. Kleinberg, “Authoritative Sources in a Hyperlinked Environment,” ACM-SIAM, 1998.
[17]Mark S. and Bruce C., “Deriving concept hierarchies from text,” SIGIR’99.
[18]Boser, B. E., Guyon, I. M. and Vapnik, V., “A training algorithm for optimal margin classifiers,” In Fifth Annual Workshop on Computational Learning Theory, pages 144–152, Pittsburgh, 1992. ACM.
[19]Aris, A., Andrei Z. B. and David C., “Sampling Search-Engine Results,” WWW Conference, 2006.