



Graduate student (English name): Jing-Ming Li
Thesis title (English): Integrated Database Searching Interfaces
Advisor (English name): Yukon Chang
Keywords (English): data retrieval; web page retrieval; retrieval time
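Besides being widely used to disseminate information on the Internet, HTML is also used to display the results of searches for journal and conference articles in university libraries. The main difference between Internet search engines [1] and library database search tools is that the data a library provides has already been properly processed and classified. Even so, because different databases are supplied by different vendors, each database offers a different search interface and presents its results differently. Moreover, a user searching for research material usually does not restrict the query to a single database but searches broadly across all relevant sources. An integrated search interface can therefore be of considerable help to library users.
In this thesis, we propose an integrated framework for searching library databases. In this framework, we first merge the search interfaces of the different databases into a single interface. When the user starts a search, the program queries all of the selected databases simultaneously on the user's behalf. During the search, we also handle each database's proxy server and authentication. After receiving the results returned by the databases, we use patterns to extract the information in the relevant fields and display the results on a single screen.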
In recent years, HTML has become the primary vehicle for information dissemination on the Internet because of its presentation power and ease of use. But since the amount of information on the Internet is growing rapidly, locating information has become an important issue [9]. Search engines [1] were subsequently developed to facilitate searching and to shorten retrieval time on the Internet.
HTML has also been used to search online databases for the full text or abstracts of papers in journals and conferences. The difference between Internet search engines and library database search tools is whether the data is properly processed and well structured. Although the data provided in library databases is well structured, databases from different vendors have different search interfaces and present their search results in different ways. Furthermore, a user searching for research-oriented material is unlikely to restrict the search to one particular database. Users usually want to find relevant material distributed over different databases. An integrated search interface is thus beneficial to library users.
In this thesis, we propose a new integrated framework for searching library databases. In the new framework, we first integrate the search interfaces of the various databases to present a unified search entry point to the end user. After the user starts a search, we run simultaneous searches on the user's behalf in each of the selected databases. This step also takes care of any authentication and/or proxy arrangement required by individual database vendors. The HTML reply messages from the databases are then put through a pattern discovery process to extract the search results, which are combined into a single frame.
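The flow described above (fan out one query, extract fields from each HTML reply, merge into one list) can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the database names, the canned HTML replies, the `fetch` stub, and the field names are all hypothetical, and hand-written regular expressions stand in for the pattern discovery process, which is not reproduced here.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical canned HTML replies standing in for two vendors' result pages.
SAMPLE_REPLIES = {
    "vendor_a": '<tr><td class="ti">Sample Paper A</td><td class="au">Author A</td></tr>',
    "vendor_b": "<li><b>Sample Paper B</b> by <i>Author B</i></li>",
}

# Assumption: one hand-written pattern per database. Named groups map each
# vendor's layout onto the same shared field names.
PATTERNS = {
    "vendor_a": re.compile(
        r'<td class="ti">(?P<title>.*?)</td><td class="au">(?P<author>.*?)</td>'
    ),
    "vendor_b": re.compile(r"<b>(?P<title>.*?)</b> by <i>(?P<author>.*?)</i>"),
}


def fetch(database, query):
    """Stub for the HTTP request; a real version would send the query and
    apply whatever authentication or proxy settings the vendor requires."""
    return SAMPLE_REPLIES[database]


def search_one(database, query):
    """Query one database and extract its records with that database's pattern."""
    html = fetch(database, query)
    return [
        dict(match.groupdict(), source=database)
        for match in PATTERNS[database].finditer(html)
    ]


def integrated_search(query, databases):
    """Search all selected databases concurrently and merge the records."""
    with ThreadPoolExecutor(max_workers=max(1, len(databases))) as pool:
        per_database = list(pool.map(lambda db: search_one(db, query), databases))
    # Flatten into one list, keeping the order in which databases were selected.
    return [record for records in per_database for record in records]


if __name__ == "__main__":
    for record in integrated_search("information extraction", ["vendor_a", "vendor_b"]):
        print(record)
```

Threads are used here because the work is dominated by waiting on network replies; the merged records carry a `source` field so a single result frame can still show where each entry came from.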
We also compare the retrieval time of our approach with that of the original database interfaces. The results show that our approach retrieves results faster, leaving users more time to explore them.
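One plausible reason simultaneous searching can shorten retrieval time is that the user waits for the slowest database rather than for all of them in turn. The back-of-envelope sketch below illustrates that arithmetic only. The latencies are invented for illustration and are not the thesis's measurements, and the model ignores extraction and rendering overhead.

```python
def sequential_wait(latencies):
    """Total wait when each database is queried one after another."""
    return sum(latencies)


def concurrent_wait(latencies):
    """Total wait when all databases are queried at once:
    the user waits only for the slowest reply."""
    return max(latencies, default=0.0)


if __name__ == "__main__":
    # Hypothetical response times in seconds for three databases.
    latencies = [2.0, 3.5, 1.5]
    print("sequential:", sequential_wait(latencies))
    print("concurrent:", concurrent_wait(latencies))
```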
Chinese Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 WWW on Internet
1.2 Library on WWW
1.3 Approaches
1.4 Research Objective
1.5 Organization
Chapter 2 Infrastructure of WWW
2.1 Concept of HTTP
2.2 HTTP Message
2.2.1 Message Types
2.2.2 Message Headers
2.2.3 Message Body
2.4 Authentication
2.5 Proxy Server
Chapter 3 Concept of information extraction
3.1 Introduction to information extraction
3.2 Extracting the information
3.3 PAT Trees
Chapter 4 Approach
4.1 Integrating search flow
4.2 Information extraction
4.2.1 Pattern description
4.2.2 Extract process
4.3 Result layout
4.4 Architecture
Chapter 5 Experiment result and contribution
Chapter 6 Conclusion
[1] A. Kingoff, "Comparing Internet search engines," Computer, vol. 30, no. 4, April 1997, pp. 117-118.
[2] Sibel Adali, Corey Bufi and Yaowadee Temtanapat, "Integrated Search Engine," IEEE Knowledge and Data Engineering Exchange Workshop Proceedings, 1997.
[3] Sudhir Aggarwal, Fuyng Hung and Weiyi Meng, "WIRE - A WWW-based Information Retrieval and Extraction System," Proceedings of the Ninth International Workshop on Database and Expert Systems Applications, 1998, pp. 887-892.
[4] Hsinchun Chen, Yi-Ming Chung, Marshall Ramsey, Christopher C. Yang, Pai-Chun Ma and Jerome Yen, "Intelligent Spider for Internet Searching," IEEE Internet Computing, 1997.
[5] R. Fielding, J. Gettys, J. Mogul, H. Frystyk and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1," Network Working Group RFC 2068, January 1997.
[6] Chia Hui Chang and Chun Nan Hsu, "Automatic Extraction of Information Blocks Using PAT Trees," NCS '99.
[7] N. Kushmerick, D. Weld and R. Doorenbos, "Wrapper induction for information extraction," in Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI), 1997.
[8] Steve Lawrence and C. Lee Giles, "Context and Page Analysis for Improved Web Search," IEEE Internet Computing, August 1998.
[9] William R. Tuthill, Wais Inc, "Don't Get Caught in the Web: A Fieldguide to Searching the Net," Proceedings of COMPCON, 1996.
[10] Weiguang Shao, Wei-Tek Tsai, Sanjai Rayadurgam and Robert Lai, "An Agent Architecture for Supporting Individualized Services in Internet Application," IEEE, 1998.
[11] Sun Wu and Chang-Chain Liao, "Virtual Proxy Servers for WWW and Intelligent Agents on the Internet," IEEE System Sciences Proceedings, 1997.
[12] W. May, "An integrated architecture for exploring, wrapping, mediating and restructuring information from the Web," Proceedings of the 11th Australasian Database Conference (ADC 2000), 1999, pp. 82-89.