National Digital Library of Theses and Dissertations in Taiwan

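Graduate student: 許明燦
Graduate student (romanized): Ming-Tsan Hus
Thesis title: 智慧型研究資料階層分類搜尋系統之建置
Thesis title (English): Development of Intelligent System for Hierarchical Searching of Research Materials
Advisors: 林金玲, 應國卿
Advisors (romanized): Jin-Jing Lin, Kuo-Chung Ying
Degree: Master's
Institution: Huafan University (華梵大學)
Department: Department of Industrial Management, Master's Program
Discipline: Engineering
Field: Industrial Engineering
Thesis type: Academic thesis
Year of publication: 2006
Academic year of graduation: 94 (ROC calendar)
Language: Chinese
Pages: 73
Keywords (Chinese): 智慧搜尋; 文件分類; 資訊檢索
Keywords (English): Intelligent Searching; Document Classification; Information Retrieval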
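This thesis aims to simplify the process by which researchers collect research materials and to improve the practicality and precision of the retrieval results for relevant materials. It builds a hierarchical classification search system targeted at research materials, with a windowed desktop application serving as the environment in which the search system runs.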

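The research adopts a two-stage operating mode. In the first stage, a search engine generates a document set from the query keywords entered by the user, the selected research scope, the chosen synonyms, and the setting for the minimum number of documents per cluster. In the second stage, the documents in that set are clustered with a modified lightweight document clustering method and combined with the system's index terms to produce a hierarchical data structure.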
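The two stages can be pictured with a minimal sketch. It is not the thesis's implementation: the function names, the document format, and the sample data are all hypothetical, and stage two here only groups documents under index terms as a stand-in for the modified lightweight clustering method the abstract describes.

```python
from collections import defaultdict

def expand_query(keyword, synonyms):
    """Return the keyword plus any synonyms selected for it (lowercased)."""
    return {keyword.lower()} | {s.lower() for s in synonyms.get(keyword, [])}

def stage_one(docs, keyword, synonyms, max_docs):
    """Stage 1: keep documents that mention any query term, capped at max_docs."""
    terms = expand_query(keyword, synonyms)
    hits = [d for d in docs if terms & set(d["text"].lower().split())]
    return hits[:max_docs]

def stage_two(doc_set, index_terms, min_cluster_size):
    """Stage 2 (simplified): group documents under index terms and
    drop groups smaller than the minimum cluster size."""
    groups = defaultdict(list)
    for d in doc_set:
        words = set(d["text"].lower().split())
        for term in index_terms:
            if term in words:
                groups[term].append(d["id"])
    return {t: ids for t, ids in groups.items() if len(ids) >= min_cluster_size}

# Hypothetical sample data, for illustration only.
docs = [
    {"id": 1, "text": "clustering of web documents"},
    {"id": 2, "text": "grouping search results by topic"},
    {"id": 3, "text": "clustering search results"},
    {"id": 4, "text": "scheduling in flow shops"},
]
synonyms = {"clustering": ["grouping"]}

doc_set = stage_one(docs, "clustering", synonyms, max_docs=10)
hierarchy = stage_two(doc_set, ["search", "web"], min_cluster_size=2)
print(hierarchy)
```

With this sample data, the synonym "grouping" brings document 2 into the document set, and the "web" group is dropped because it holds only one document.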

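The system lets users limit the size of the data set, which saves the time otherwise spent browsing unnecessary information. Together with the organization by research field, this allows results to be presented for retrieval in a more user-friendly form. In addition, the system's condition-modification function allows search conditions and clustering settings to be added or revised continually; through repeated training, or with the help of experts, the search results can be made to match users' needs more closely.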
This paper focuses on the problem of developing an intelligent system for hierarchical searching of research materials, built on an intelligent searching method and a document classifier. The system searches for objects scattered across the Internet and, once the search is complete, automatically organizes the results into a report with hierarchical classifications.

To make the search results practical, an indexing mechanism and a synonym database drawn from research fields are built before any material is searched. The search is then carried out among related research sites. Finally, a Lightweight Document Clustering approach is applied to the information gathered by the intelligent search engine to produce hierarchically organized reports.
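As a rough illustration of clustering documents by a small set of keywords each, the sketch below represents every document by its most frequent terms and groups documents greedily by keyword overlap. It is an assumption-laden stand-in, not the Lightweight Document Clustering approach used in the thesis: the stopword list, the Jaccard measure, the threshold, and all function names are hypothetical choices made for this example.

```python
from collections import Counter

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"a", "an", "the", "of", "for", "in", "and", "to", "on"}

def top_keywords(text, k=5):
    """Represent a document by its k most frequent non-stopword terms."""
    words = [w for w in text.lower().split() if w not in STOPWORDS]
    return {w for w, _ in Counter(words).most_common(k)}

def jaccard(a, b):
    """Overlap between two keyword sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(texts, threshold=0.2, k=5):
    """Greedy single-pass clustering: each document joins the first cluster
    whose seed keywords are similar enough, otherwise it starts a new cluster."""
    clusters = []  # list of (seed keyword set, [document indices])
    for i, text in enumerate(texts):
        kws = top_keywords(text, k)
        for seed, members in clusters:
            if jaccard(seed, kws) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((kws, [i]))
    return [members for _, members in clusters]

# Hypothetical sample titles, for illustration only.
texts = [
    "document clustering for search results",
    "clustering search results on the web",
    "flow shop scheduling heuristics",
    "scheduling heuristics for flow shop problems",
]
print(cluster(texts))
```

Comparing only against each cluster's seed keeps the pass cheap, at the cost of making the result depend on document order.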

The proposed system proceeds in four steps. First, the system fetches related information from the indexing and synonym databases in order to perform the intelligent search. Second, a spider/crawler retrieves information from the Internet and from related research sites, guided by the information found in the first step. Third, intelligent document clustering is applied to form the hierarchical classification of documents. Finally, a feedback system updates the indexing database in real time based on the outcomes of document clustering.
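The feedback step in particular may be easier to follow with a small sketch. The abstract does not say how the indexing database is updated, so everything here is an assumption: the index is modeled as a plain dictionary from research field to a set of index terms, and a term is fed back when it appears in more than one document of a cluster.

```python
from collections import Counter

def update_index(index, clusters, field, top_n=2):
    """Feed terms shared within each cluster back into the index terms
    stored for a research field (hypothetical update rule)."""
    terms = index.setdefault(field, set())
    for docs in clusters:
        # Count, per term, how many documents of the cluster contain it.
        counts = Counter(
            w for text in docs for w in sorted(set(text.lower().split()))
        )
        # Keep only terms found in more than one document of the cluster.
        shared = [w for w, c in counts.most_common() if c > 1][:top_n]
        terms.update(shared)
    return index

# Hypothetical index and clustering outcome, for illustration only.
index = {"information retrieval": {"search"}}
clusters = [
    ["clustering search results", "clustering web results"],
    ["flow shop"],
]
update_index(index, clusters, "information retrieval")
print(sorted(index["information retrieval"]))
```

A single-document cluster contributes nothing under this rule, which keeps one-off terms out of the index.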

Following the algorithm, a set of computer programs and a web site were developed. The performance of the proposed system is evaluated by the accuracy of the search results, that is, by whether the results match what users want. The simulation results show that the proposed method is a highly efficient information filter and that a clustering-based hierarchical presentation of documents is a practical way to present search results.
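The abstract does not define its accuracy measure. A common way to quantify how well search results match what users want is precision and recall, sketched below on the assumption that the user's relevance judgments are available as a set of document identifiers; the function names and the sample identifiers are hypothetical.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved documents that the user judged relevant."""
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0
    return len(retrieved & set(relevant)) / len(retrieved)

def recall(retrieved, relevant):
    """Fraction of relevant documents that the system retrieved."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved) & relevant) / len(relevant)

# Hypothetical judgments: the system returned four documents,
# and the user considers three documents relevant overall.
retrieved = [1, 2, 3, 4]
relevant = [2, 4, 5]
print(precision(retrieved, relevant), recall(retrieved, relevant))
```

In this example two of the four returned documents are relevant, and two of the three relevant documents were found.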
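Table of Contents
Acknowledgments
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1-1 Research Motivation
1-2 Research Objectives
1-3 Research Content and Methods
1-4 Research Process
Chapter 2 Literature Review
2-1 Information Retrieval
2-1-1 Querying the Full Text Directly
2-1-2 Not Querying the Full Text Directly: A Single Document
2-1-3 Not Querying the Full Text Directly: A Group of Documents
2-2 Information Retrieval System Models
2-2-1 Boolean Model
2-2-2 Vector Model
2-2-3 Probabilistic Model
2-3 Search Engine
2-4 Document Clustering
2-4-1 Hierarchical
2-4-2 Partitional
2-5 Evaluating Retrieval System Performance
Chapter 3 Problem Definition
Chapter 4 Algorithm Analysis and System Implementation
4-1 Initial Data Settings and Document Set Generation
4-2 Clustering Algorithm
Chapter 5 Simulation and Evaluation
5-1 Simulation Examples and Environment
5-2 Analysis of Simulation Results
Chapter 6 Conclusions and Recommendations
6-1 Conclusions
6-2 Recommendations and Future Research
References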