National Digital Library of Theses and Dissertations in Taiwan

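Author: Po-Hsiang Wang (王柏翔)
Thesis title: QueryFind: Search Ranking based on Users' Feedback and Expert's Agreement
Advisor: Hahn-Ming Lee (李漢銘)
Degree: Master's
Institution: National Taiwan University of Science and Technology
Department: Department of Electronic Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2003
Graduation academic year: 91 (ROC calendar)
Language: English
Pages: 70
Keywords: Ranking; Users' Feedback; Search Engine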
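The biggest challenge for today's search engines is to rank the large number of returned Web pages effectively so as to satisfy users' needs. Traditional ranking methods are content-oriented: they assign each Web page a ranking score computed by heuristic rules, and that score is independent of the user's query keywords. As a result, the returned pages do not fully match what the user wants, and the most relevant pages may not appear at the top of the search results. In other words, users still have to spend time picking the pages they want out of the search results.
In view of this, this thesis proposes a new Web page ranking method called QueryFind, which uses not only users' feedback but also the recommendations of the source search engines. With this method, we use users' feedback to judge the quality of each Web page and apply the meta-search concept to give each page a content-oriented ranking score. Consequently, users spend less time finding the pages they want in the search results, and more relevant pages are presented to them. In the experiments, we evaluate our ranking method on one week of query logs from the Yam search engine (蕃薯藤, YAM). We also propose a new evaluation criterion to verify the feasibility of QueryFind; it is computed from the rank position of each Web page and the number of times users clicked it. The experimental analysis shows that users can find the information they need faster than before.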

Given a query word, a search engine can retrieve a vast number of Web pages from the World Wide Web. The main challenge, however, is to rank those retrieved pages effectively so that they meet users' needs. Traditional ranking methods are content-oriented: each Web page receives a ranking score that is computed by sophisticated but heuristic approaches and is independent of the user's query words. Consequently, the retrieved Web pages do not completely match the information users require, and the pages most relevant to a query might not be shown at the top of the search result list. That is, users still need to spend time seeking out the Web pages they require. We therefore propose a novel ranking method named QueryFind, which learns from historical query logs to predict users' information needs and to reduce the time spent searching through the result list. Our method uses not only users' feedback but also the source search engines' recommendations. We exploit users' feedback to judge the quality of Web pages implicitly, and we apply the meta-search concept to give each Web page a content-based ranking score. As a result, users spend less time seeking out the information they require, and more relevant Web pages can be presented. In our experiments, one week of the Yam Search Engine's query log is used for evaluation. We also propose a novel evaluation approach to verify the feasibility of our ranking method; it captures the ranking order and the Web pages that users clicked in the search result list. Our experiments show that the time users spend seeking out their required information can be reduced significantly.
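
The idea of combining users' feedback with the source search engines' recommendations can be illustrated with a small Python sketch. This is not the thesis's actual formula, which the abstract does not give. It assumes a click-share score for feedback, an averaged reciprocal-rank score for the source engines, and a linear mix controlled by a hypothetical weight `alpha`. The function names, the sample log, and the `mean_clicked_rank` measure (a stand-in for an evaluation based on rank order and clicks) are likewise assumptions made for illustration.

```python
from collections import Counter


def click_scores(click_log, query):
    """Share of the clicks for `query` that each URL received (assumed feedback score)."""
    counts = Counter(url for q, url in click_log if q == query)
    total = sum(counts.values())
    return {url: n / total for url, n in counts.items()} if total else {}


def rank_scores(engine_results):
    """Reciprocal-rank score per URL, averaged over the source engines (assumed)."""
    scores = {}
    for ranked_urls in engine_results:
        for position, url in enumerate(ranked_urls, start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / position
    n = len(engine_results)
    return {url: s / n for url, s in scores.items()} if n else {}


def fuse(click_log, engine_results, query, alpha=0.5):
    """Rank URLs by a linear mix of feedback and engine scores (hypothetical fusion)."""
    feedback = click_scores(click_log, query)
    content = rank_scores(engine_results)
    urls = set(feedback) | set(content)
    combined = {
        u: alpha * feedback.get(u, 0.0) + (1 - alpha) * content.get(u, 0.0)
        for u in urls
    }
    # Highest combined score first; ties are broken alphabetically by URL.
    return sorted(urls, key=lambda u: (-combined[u], u))


def mean_clicked_rank(ranking, clicked_urls):
    """Average list position of the clicked pages; lower means less seeking."""
    positions = [ranking.index(u) + 1 for u in clicked_urls if u in ranking]
    return sum(positions) / len(positions) if positions else None


# Made-up sample data: (query, clicked URL) pairs and two source-engine result lists.
click_log = [("q", "b"), ("q", "b"), ("q", "c"), ("x", "a")]
engine_results = [["a", "b", "c"], ["a", "c", "b"]]

ranking = fuse(click_log, engine_results, "q")
clicked = [url for q, url in click_log if q == "q"]
fused_cost = mean_clicked_rank(ranking, clicked)
baseline_cost = mean_clicked_rank(engine_results[0], clicked)
```

In this toy data the frequently clicked page `b` moves ahead of `a`, so the clicked pages sit higher in the fused list than in the first engine's original order. A real system would need to choose the scores and the weight empirically.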

Abstract
Acknowledgements
Contents
List of Figures
List of Tables
CHAPTER 1 Introduction
1.1 Motivation
1.2 Challenges of Current Ranking Methods
1.3 Goals and Design
1.4 Outline of the Thesis
CHAPTER 2 Background
2.1 Performance Measure in Information Retrieval
2.2 Search Engines
2.2.1 Subject Directory
2.2.2 Robot-Based Search Engine
2.2.3 Meta-Search Engine
2.3 Ranking Methods in Search Engines
2.3.1 Ranking Based on Human Judgments
2.3.2 Ranking Based on Link Information
2.3.3 Ranking Based on Users’ Feedback
CHAPTER 3 QueryFind
3.1 Concept of QueryFind
3.2 Details of QueryFind
3.3 Architecture of QueryFind
3.3.1 Log Extractor
3.3.2 Query Grouping
3.3.3 Users’ Feedback Collector
3.3.4 Content-Oriented Converter
3.3.5 Fusion Mechanism
3.4 Characteristics and Limitations of QueryFind
3.4.1 Characteristics
3.4.2 Limitations
CHAPTER 4 Experiments
4.1 Characteristics of Experimental Data Set
4.2 Evaluation Criterion
4.3 Experimental Results
CHAPTER 5 Discussion and Conclusion
5.1 Discussion
5.2 Conclusion
5.3 Further Work
REFERENCES

