
National Digital Library of Theses and Dissertations in Taiwan


Detailed Record

Author: 邱紹軒
Author (English): CHIOU, SHAO-SYUAN
Title: Terrier 資訊檢索平台實作與評估互動式資訊搜尋:以TREC 6為例
Title (English): Interactive Information Retrieval Evaluations by Implementation of Terrier IR Platform: TREC 6 Dataset
Advisor: 吳怡瑾
Advisor (English): Wu, I-Chin
Committee: 楊千、陳子立
Committee (English): Yang, Chyan; Chen, Tzu-Li
Oral defense date: 2014-07-09
Degree: Master's
Institution: 輔仁大學 (Fu Jen Catholic University)
Department: Master's Program, Department of Information Management
Discipline: Computing
Field: General Computing
Document type: Academic thesis
Publication year: 2014
Graduation academic year: 102 (2013-14)
Language: Chinese
Pages: 83
Keywords (Chinese): 互動式資訊檢索、資訊檢索模型、字詞推薦、Terrier IR Platform、TREC 6
Keywords (English): Interactive Information Retrieval; IR Models; Term Suggestion; Terrier IR Platform; TREC 6
Usage statistics:
  • Cited by: 1
  • Views: 464
  • Downloads: 0
  • Bookmarked: 2
Abstract (translated from Chinese): This study uses the TREC 6 collection from the Text REtrieval Conference (TREC) as test data and its topics as test queries. Using the command-line mode of the Terrier IR toolkit, we tested different retrieval models (the vector-space TF_IDF, the probabilistic BM25, and the language model Hiemstra_LM) and different topic fields (title only, T, and title plus description, T+D) to determine which configuration yields higher precision. Following the same procedure with query expansion added, we then examined whether topic length affects precision. In addition, we modified the query expansion class provided by Terrier IR to build an interactive information retrieval interface and had subjects test it on the TREC 6 collection and topics. Beyond comparing the number of relevant documents retrieved and the average precision of each model under this interface, we assessed whether the term suggestion function is effective and, through screen recordings, whether users' search behavior differs across conditions.
The results show that in command-line mode without query expansion, the language model (Hiemstra_LM) performed best on title plus description (T+D); with query expansion, however, Hiemstra_LM performed worst on both T and T+D, indicating that within Terrier IR this model is less suited to longer queries. Under the interactive interface, Hiemstra_LM achieved the highest number of relevant documents and the best average precision regardless of whether term suggestion was used. Finally, the screen recordings show that without term suggestion, Hiemstra_LM let users find relevant information quickly and accurately. Overall, the Hiemstra_LM language model can help build a good index and suggest appropriate keywords to support interactive information retrieval.
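The Hiemstra_LM model compared above is based on a linearly smoothed unigram language model, which mixes a document's term statistics with collection-wide statistics. A minimal sketch of that scoring idea (not Terrier's actual class; the smoothing weight and all counts below are illustrative assumptions):

```python
import math

def lm_score(query_terms, doc_tf, doc_len, coll_cf, coll_len, lam=0.15):
    """Log-probability of the query under a linearly smoothed document model:
    P(t|D) = lam * tf(t,D)/|D| + (1 - lam) * cf(t)/|C|."""
    score = 0.0
    for t in query_terms:
        p_doc = doc_tf.get(t, 0) / doc_len    # maximum-likelihood estimate in the document
        p_coll = coll_cf.get(t, 0) / coll_len  # collection background model
        p = lam * p_doc + (1 - lam) * p_coll
        if p > 0:                              # skip terms unseen in the whole collection
            score += math.log(p)
    return score

# Hypothetical counts: term "trec" occurs twice in a 10-token document
# and 5 times in a 100-token collection.
print(lm_score(["trec"], {"trec": 2}, 10, {"trec": 5}, 100, lam=0.5))
```

Longer topics (T+D) contribute more terms to the sum, which is one way a smoothing-based model's behavior can shift with query length, as observed in the query expansion experiments.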

Abstract (English): In this study, we used the TREC-6 document and topic sets from the Text REtrieval Conference as our data set. We used different information retrieval models (i.e., TF_IDF, BM25, and Hiemstra_LM) and different tags (title only, "T", and title with description, "T+D") in Terrier IR's Command Line Mode to retrieve documents from TREC-6. Our aim was to determine which model would achieve higher precision under different experimental settings. We then tested the query expansion function to investigate the effects of topic length on the different IR models. Furthermore, we created an interactive information retrieval (IR) interface using a modified query expansion class provided by Terrier IR, and we selected two topics from the TREC-6 document and topic sets. The study involved 18 users recruited as evaluation subjects. We aimed to investigate (1) the effectiveness of the three models in terms of the number of relevant documents retrieved per subject as well as average precision, and (2) the effectiveness of interactive IR with and without term suggestions. We also recorded and analyzed the subjects' search behavior using Morae software.
The results indicate that in Command Line Mode, when we used title with description ("T+D") but no query expansion, the Hiemstra_LM language model showed the best performance. Interestingly, when we used query expansion, the Hiemstra_LM model demonstrated the worst performance, suggesting that, as implemented in Terrier IR, this model is not well suited to topics with more words. In our proposed interactive IR interface, regardless of whether term suggestions were given, subjects who used the Hiemstra_LM model achieved the best performance. Moreover, when subjects used the interface with the term suggestion function, they found a greater number of relevant documents than with the interface without term suggestion. Finally, from the recorded videos we found that, even without the term suggestion function, the Hiemstra_LM model let users find relevant documents more quickly and accurately. In conclusion, the Hiemstra_LM model is capable of building good index terms and suggesting proper terms to help users achieve better interactive IR.
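The average precision figures used to compare the models above are computed from a system's ranked list and a topic's relevance judgments (qrels). A minimal sketch of that computation, with made-up document IDs:

```python
def average_precision(ranked_ids, relevant_ids):
    """AP = mean of precision@k over the ranks k where a relevant document
    appears, divided by the total number of relevant documents."""
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank   # precision at this rank
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0

ranked = ["d3", "d1", "d7", "d2", "d5"]   # hypothetical system ranking
relevant = {"d1", "d2"}                   # hypothetical qrels for one topic
print(average_precision(ranked, relevant))  # (1/2 + 2/4) / 2 = 0.5
```

Averaging this value over all test topics gives the mean average precision typically reported in TREC-style evaluations such as the ones described here.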

List of Tables
List of Figures
Chapter 1 Introduction
Section 1 Research Background
Section 2 Research Motivation and Objectives
Chapter 2 Literature Review
Section 1 IR Models
1. Boolean Model
2. Vector Model
3. Probability Model
4. Language Models
Section 2 Text REtrieval Conference (TREC)
Section 3 Terrier IR Platform
Section 4 Relevance Feedback and Query Expansion: Techniques and Applications
1. Relevance Feedback
2. Query Expansion
3. Related Applications
Chapter 3 Research Method
Section 1 Research Questions and Definitions
Section 2 Research Framework
Section 3 Related Tools
1. TREC 6
2. Terrier Models
Chapter 4 Experimental Analysis of the TREC Data Set
Section 1 Test Results for Each Model
Section 2 Comparison of Retrieval Results
1. Ad Hoc Retrieval Results by Model
2. Retrieval Results by Model with Query Expansion (QE)
Chapter 5 Interactive IR Evaluation Design
Section 1 Procedure
Section 2 Data Preprocessing
Section 3 Interactive IR Interface
Section 4 Test Procedure and Subjects
Section 5 Post-test Questionnaire
Section 6 Scoring Method
Chapter 6 Evaluation Results
Section 1 Comparison Across Models
1. Without Term Suggestion
2. With Term Suggestion
Section 2 Effect of Term Suggestion
1. Comparison by Model
2. Overall Comparison
Section 3 Subjects' Search Behavior
1. Search Behavior by Model
Section 4 Post-test Questionnaire Analysis
1. Interface Usability
2. Interface Effectiveness
3. Satisfaction
Chapter 7 Conclusion
Section 1 Conclusions and Research Contributions
Section 2 Limitations and Future Work
References
Appendix 1: Mozilla Public License
Appendix 2: Pre-test Questionnaire

