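Author: 孫智梁 (Zhi-Liang Sun)
Title: The Application of Semantic Analysis in Relevance Feedback for Query Expansion
Title (original): 利用語意分析於相關回饋以進行查詢擴展之方法
Advisor: 周世傑 (Shih-Chieh Chou)
Degree: Master's
Institution: National Central University (國立中央大學)
Department: Department of Information Management
Discipline: Computer Science
Field: General Computer Science
Thesis type: Academic thesis
Year of publication: 2018
Academic year of graduation: 106 (ROC calendar, 2017-2018)
Language: Chinese
Number of pages: 69
Keywords: Information Retrieval; Relevance Feedback; Query Expansion; Semantic Analysis; Word2Vec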
We live in an era of information explosion. When faced with huge volumes of data, obtaining the needed information efficiently is an important problem, and information retrieval systems have become one of the most commonly used tools for filtering data. In the field of relevance feedback, the Rocchio algorithm is the best-known method: it generates new query terms by analyzing how frequently terms occur in relevant and non-relevant documents, and adds them to the query expansion set. However, Rocchio judges terms by frequency alone and ignores other usable information between terms. In recent years, research on semantic search, which aims to uncover the implicit semantic relationships between terms, has also appeared. Taking the user's original query and its search results as the basis, this study therefore uses the neural network model Word2Vec to analyze the semantic information among the terms of the original query and of the relevance feedback, and combines this with co-occurrence analysis to extract suitable related terms for expanding the original query term set, so that the query keywords match the user's needs more closely. Experiments show that the proposed method achieves better retrieval effectiveness than the other methods it was compared with.
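As a rough illustration of the frequency-based baseline the abstract refers to, the sketch below applies the standard Rocchio update to raw term-frequency vectors and then picks the highest-weighted new terms as expansion candidates. It is not the thesis's implementation: the function names, the toy documents, the use of raw term counts, and the weights alpha=1.0, beta=0.75, gamma=0.15 (values commonly quoted in textbooks) are all assumptions made for this example.

```python
from collections import Counter


def centroid(docs):
    """Average term-frequency vector of a list of tokenized documents."""
    total = Counter()
    for tokens in docs:
        total.update(tokens)
    n = len(docs)
    return {t: c / n for t, c in total.items()} if n else {}


def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio update: alpha*q + beta*centroid(relevant) - gamma*centroid(non_relevant).

    Terms whose updated weight is not positive are dropped (a common convention,
    assumed here).
    """
    q = Counter(query)
    rel = centroid(relevant)
    non = centroid(non_relevant)
    weights = {}
    for t in set(q) | set(rel) | set(non):
        w = alpha * q.get(t, 0) + beta * rel.get(t, 0.0) - gamma * non.get(t, 0.0)
        if w > 0:
            weights[t] = w
    return weights


def expansion_terms(query, relevant, non_relevant, k=2):
    """Top-k terms by Rocchio weight that are not already in the query."""
    weights = rocchio(query, relevant, non_relevant)
    candidates = [(t, w) for t, w in weights.items() if t not in query]
    # Sort by descending weight, breaking ties alphabetically for determinism.
    candidates.sort(key=lambda tw: (-tw[1], tw[0]))
    return [t for t, _ in candidates[:k]]


# Hypothetical toy data: an ambiguous query with user-judged feedback documents.
query = ["jaguar"]
relevant = [["jaguar", "cat", "habitat"], ["jaguar", "cat", "prey"]]
non_relevant = [["jaguar", "car", "engine"]]
print(expansion_terms(query, relevant, non_relevant))
```

Because the update looks only at counts, a term that is semantically close to the query but rare in the feedback documents receives a low weight. This is the limitation the abstract says the thesis addresses with Word2Vec and co-occurrence analysis.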
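Chinese Abstract
English Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
1. Introduction
1-1 Research Background and Motivation
1-2 Research Objectives
1-3 Research Scope and Limitations
1-4 Thesis Organization
2. Literature Review
2-1 Relevance Feedback
2-1-1 Background and Applications of Relevance Feedback
2-1-2 Rocchio Algorithm
2-2 Query Expansion
2-2-1 Local Query Expansion
2-2-2 Global Query Expansion
2-3 Normalized Google Distance
2-4 Methods Applying Term Information
2-5 Word2Vec
3. Research Method
3-1 System Architecture
3-2 Method Design
3-2-1 Processing of the Original Query Results
3-2-2 Processing of Semantic Information Among Related Terms
3-2-3 Processing of Semantic Information of the Original Query Terms
3-2-4 Co-occurrence Analysis Among Related Terms
4. Experimental Design
4-1 Experimental Data
4-2 Evaluation Metrics
4-3 Experimental Procedure
4-3-1 Experiment 1
4-3-2 Experiment 2
4-4 Experimental Results
4-4-1 Results of Experiment 1
4-4-2 Results of Experiment 2
4-5 Discussion of Experimental Results
5. Conclusion
5-1 Conclusions and Contributions
5-2 Future Research Directions
References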