National Digital Library of Theses and Dissertations in Taiwan

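Author: Yeong-Lin Chang (張永霖)
Thesis title: Using a Genetic Algorithm and Relevance Feedback in Assisting Web Search (使用基因演算法與相關回饋於協助網頁搜尋)
Advisor: Shih-Chieh Chou (周世傑)
Degree: Master's
Institution: National Central University
Department: Graduate Institute of Information Management
Discipline: Computer Science
Subject category: General Computer Science
Thesis type: Academic thesis
Year of publication: 2002
Academic year of graduation: 90 (ROC calendar)
Language: Chinese
Pages: 54
Keywords (Chinese): 詞頻; 資訊檢索; 相關回饋; 基因演算法; 全球資訊網
Keywords (English): information retrieval; World Wide Web; genetic algorithm; relevance feedback; term frequency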
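The volume of data on the World Wide Web has grown at an accelerating rate since its inception, and the Web has long been one of the main channels through which people obtain information. Because of this sheer volume, Web users face information overload and turn to directory services and search engines for help. Directory services, however, are limited by a shortage of maintenance staff, so the pages they index are often insufficient. Search engine results often have low relevance to the user's information need, or are too numerous, so users still have to filter out the information they need themselves.
This study attempts to build an intelligent agent layered on top of existing search engines. A user profile is built from relevant example documents supplied by the user. A genetic algorithm searches over candidate query strings, and relevant web pages are gathered through a search engine. The content of each page is represented in the vector space model, and the similarity between each page and the user profile is evaluated to guide the genetic algorithm toward more suitable query strings. The user profile is then adjusted through a relevance feedback mechanism based on the user's ratings of the retrieved results, improving query effectiveness round by round. The implemented system places no restriction on the search topic, has low software and hardware requirements, and adds little extra burden on the user, and it performs satisfactorily in both web search and the learning of user interests.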
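The profile, similarity, and feedback steps described above can be illustrated with a minimal sketch. It assumes raw term-frequency vectors, cosine similarity, and a Rocchio-style update as a stand-in for the thesis's own term-frequency and sensitivity adjustment strategies, which this record does not spell out. All function names, the `rate` parameter, and the numeric `rating` scale are hypothetical.

```python
import math
from collections import Counter

def term_vector(text, stop_words=frozenset()):
    """Term-frequency vector for a document (lowercased, whitespace-split)."""
    words = [w for w in text.lower().split() if w not in stop_words]
    return Counter(words)

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def build_profile(example_docs, stop_words=frozenset()):
    """User profile as the summed term frequencies of the example documents."""
    profile = Counter()
    for doc in example_docs:
        profile.update(term_vector(doc, stop_words))
    return profile

def apply_feedback(profile, page_vector, rating, rate=0.5):
    """Rocchio-style update (an assumption, not the thesis's exact rule).

    A positive rating moves the profile toward the rated page, a negative
    rating moves it away; weights are floored at zero and zero-weight terms
    are dropped.
    """
    updated = dict(profile)
    for term, tf in page_vector.items():
        updated[term] = max(0.0, updated.get(term, 0.0) + rate * rating * tf)
    return {t: w for t, w in updated.items() if w > 0}
```

A page that shares many high-weight terms with the profile scores close to 1, and a page with no shared terms scores 0, which is the signal the agent would use to rank retrieved pages before asking the user for ratings.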
The World Wide Web (WWW) has grown at an accelerating pace since its emergence and has become one of our major sources of information in daily life. Because the volume of data on the WWW is enormous, users constantly face information overload, and retrieving information of the right scope and content has become an important issue. At present, directory services and keyword search are the two main aids available to users, and both have shortcomings. Directory services require labor-intensive work to create and maintain, so the web pages they index are often insufficient. Keyword search often returns too many results, or results with little relevance to the user's need, so users must still spend considerable effort filtering them.
This research attempts to build an intelligent agent, layered on top of existing search engines, to assist information retrieval. The agent builds a user profile by parsing and analyzing example documents provided by the user. It uses a genetic algorithm to search over candidate query strings composed of keywords from the user profile, and it collects relevant web pages by submitting those queries to a search engine. Each web page is represented in the vector space model, and the similarity between the retrieved pages and the user profile guides the genetic algorithm toward better-fitting query strings. A relevance feedback mechanism then refines the user profile according to the user's ratings of the results, improving the query results over successive rounds. The implemented system performs satisfactorily in web search and in learning users' interests. It places no restriction on the search topic, has low software and hardware requirements, and imposes little extra burden on the user.
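The genetic search over query strings can be sketched as follows. This is an illustration under stated assumptions, not the thesis's implementation: the chromosome is assumed to be a fixed-length tuple of profile keywords, selection is simple truncation, and `fake_search`, `CORPUS`, `PROFILE`, and the overlap-based `fitness` are hypothetical stand-ins for a real search engine, real web pages, and the vector-space similarity.

```python
import random

# Hypothetical in-memory "web" and user profile (term -> weight).
CORPUS = [
    "genetic algorithm for web search",
    "genetic disease research",
    "web design tips",
    "relevance feedback in web search",
]
PROFILE = {"genetic": 2, "algorithm": 2, "search": 1, "web": 1}

def fake_search(query):
    """Stand-in for a search engine: pages containing every query term."""
    return [page for page in CORPUS if all(term in page.split() for term in query)]

def fitness(query):
    """Mean profile-weight overlap of the pages a query returns (0.0 if none)."""
    pages = fake_search(query)
    if not pages:
        return 0.0
    scores = [sum(PROFILE.get(w, 0) for w in set(page.split())) for page in pages]
    return sum(scores) / len(pages)

def evolve_queries(keywords, fitness, query_len=2, pop_size=8, generations=10,
                   mutation_rate=0.1, seed=0):
    """Evolve keyword tuples and return the fittest query seen.

    Requires query_len >= 2 and pop_size >= 4 so that one-point crossover
    and parent sampling are well defined.
    """
    rng = random.Random(seed)
    population = [tuple(rng.sample(keywords, query_len)) for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Truncation selection: the fitter half become parents.
        parents = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, query_len)      # one-point crossover
            child = list(p1[:cut] + p2[cut:])
            for i in range(query_len):             # per-gene mutation
                if rng.random() < mutation_rate:
                    child[i] = rng.choice(keywords)
            children.append(tuple(child))
        population = children
        best = max(population + [best], key=fitness)
    return best
```

Calling `evolve_queries(list(PROFILE), fitness)` returns a two-keyword query. In a real agent each fitness evaluation would cost one search engine request, so the population size and number of generations directly bound the network load.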
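Table of Contents
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Research Scope and Limitations
1.4 Research Process
1.5 Thesis Organization
Chapter 2 Literature Review
2.1 Information Retrieval
2.2 The World Wide Web and Search Services
2.3 Genetic Algorithms
2.4 Relevance Feedback
Chapter 3 System Design
3.1 Research Concept
3.2 System Architecture
Chapter 4 Experimental Results
4.1 Experimental Design and Procedure
4.2 Experimental Results and Analysis
Chapter 5 Conclusion
5.1 Conclusions and Contributions
5.2 Future Research Directions
References
Appendix 1 Stop list of words
List of Figures
Figure 1-1: Research flowchart
Figure 2-1: Information retrieval model
Figure 2-2: Information filtering model
Figure 2-3: Relationship between word occurrence frequency and word rank
Figure 2-4: Internet growth trend
Figure 2-5: Search engine index sizes
Figure 2-6: Genetic algorithm framework
Figure 2-7: Relevance feedback parameter settings in Webnaut
Figure 3-1: Schematic of the information retrieval approach
Figure 3-2: System external environment
Figure 3-3: System modules
Figure 3-4: Chromosome encoding example
Figure 4-1: Experiment 1 results (line chart)
Figure 4-2: Experiment 2 results (line chart)
Figure 4-3: Experiment 3 results (line chart)
Figure 4-4: Experiment 4 results (line chart)
Figure 4-5: Experiment 5 results (line chart)
List of Tables
Table 3-1: Comparison of this study with Webnaut
Table 3-2: System software and hardware environment
Table 3-3: User profile construction example
Table 3-4: Term frequency adjustment strategy
Table 3-5: Sensitivity adjustment strategy
Table 3-6: Term frequency adjustment parameters
Table 3-7: Term frequency adjustment example
Table 3-8: Sensitivity adjustment example
Table 3-9: User profile example
Table 3-10: Document term frequency data example
Table 3-11: User profile specification
Table 4-1: Manipulated variable settings for the experiments
Table 4-2: Variable settings for Experiment 1
Table 4-3: Experiment 1 results (data summary)
Table 4-4: Variable settings for Experiment 2
Table 4-5: Experiment 2 results (data summary)
Table 4-6: Variable settings for Experiment 3
Table 4-7: Experiment 3 results (data summary)
Table 4-8: Variable settings for Experiment 4
Table 4-9: Experiment 4 results (data summary)
Table 4-10: Variable settings for Experiment 5
Table 4-11: Experiment 5 results (data summary)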