National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

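Graduate student: 李建明
Graduate student (foreign-language name): Jing-Ming Li
Thesis title: 多種資料庫搜尋介面之整合
Thesis title (foreign language): Integrated Database Searching Interfaces
Advisor: 張佑康
Advisor (foreign-language name): Yukon Chang
Degree: Master's
Institution: I-Shou University (義守大學)
Department: Department of Information Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2000
Graduation academic year: 88
Language: English
Number of pages: 61
Chinese keywords: 資料擷取; 網頁擷取; 擷取時間
Foreign-language keywords: data retrieval; web page retrieval; retrieval time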
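In recent years, the variety of ways in which HTML can present content, together with its ease of use, has made it the mainstream medium for disseminating information on the Internet. However, the volume of data on the network is growing very quickly [9], so finding the required data on the Internet quickly has become important. The later invention of search engines greatly reduced the time people spend gathering data.
Besides being widely used to disseminate data on the Internet, HTML is also used to display the results of queries for journal and conference articles in school libraries. The biggest difference between Internet search engines [1] and library database search tools is that the data provided by libraries has already been properly processed and classified. Even so, because different databases are supplied by different vendors, the search interface each database offers and the way it displays search results also differ. Moreover, when users search for research-related material, they usually do not confine themselves to a single database but search broadly for all relevant material. An integrated search interface can therefore be of considerable help to library users.
In this thesis, we propose an integrated framework for searching library databases. In this framework, we first integrate the different database search interfaces into a single interface. When the user starts a search, the program searches the selected databases simultaneously on the user's behalf. During the search, we also handle each database's proxy server and authentication. After receiving the query results returned by the databases, we use patterns to extract the information in the relevant fields and display the results on a single screen.
In the thesis, we also compare the original search time with the search time of our method. The comparison shows that the proposed method is more efficient and lets users make fuller use of their search time.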
In recent years, HTML has become the primary vehicle for information dissemination on the Internet because of its presentation power and ease of use. Because the amount of information on the Internet is growing rapidly, locating information has become an important issue [9]. Search engines [1] were subsequently invented to facilitate searching and to shorten retrieval time on the Internet.
HTML has also been used to search online databases for the full text or abstracts of journal and conference papers. The difference between Internet search engines and library database search tools is whether the data is properly processed and well structured. Although the data provided in library databases is well structured, databases provided by different vendors have different search interfaces and different ways of presenting search results. Furthermore, a user searching for research-oriented material is unlikely to restrict the search to one particular database. Users usually want to find relevant material distributed over different databases. An integrated search interface is thus beneficial to library users.
In this thesis, we propose a new integrated framework for searching library databases. In the new framework, we first integrate the search interfaces of the various databases to present a unified search entry to the end user. After the user starts a search, we run simultaneous searches on the user's behalf in each of the selected databases. This step also takes care of any authentication or proxy arrangement required by the individual database vendors. The HTML reply messages from the databases are then put through a pattern discovery process to extract the search results, which are combined into a single frame.
In the thesis, we also compare the retrieval time of our approach with that of the original interfaces. The results show that our approach is more efficient, allowing users more time to explore the search results.
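The flow described in the abstract (one query fanned out to several databases at once, each HTML reply reduced to fields by a per-database pattern, and the records merged into one list) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the database names `vendor_a` and `vendor_b`, the canned HTML replies, the function names, and the hand-written regular expressions are all hypothetical, and the regular expressions stand in for the pattern discovery process the thesis describes. The `fetch` function is a stub where a real system would issue the HTTP request and handle any proxy or authentication step.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical canned HTML replies standing in for two vendors' result pages.
FAKE_REPLIES = {
    "vendor_a": (
        "<tr><td class='t'>Sample Paper One</td><td class='a'>Author A</td></tr>"
        "<tr><td class='t'>Sample Paper Two</td><td class='a'>Author B</td></tr>"
    ),
    "vendor_b": "<li><b>Sample Paper Three</b> by <i>Author C</i></li>",
}

# One hand-written pattern per database (an assumption of this sketch);
# named groups label the fields to extract.
PATTERNS = {
    "vendor_a": re.compile(
        r"<td class='t'>(?P<title>.*?)</td><td class='a'>(?P<author>.*?)</td>"
    ),
    "vendor_b": re.compile(r"<b>(?P<title>.*?)</b> by <i>(?P<author>.*?)</i>"),
}


def fetch(database, query):
    """Stub for the HTTP request; the query is ignored and a canned page returned."""
    return FAKE_REPLIES[database]


def extract(database, html):
    """Apply the database's pattern and tag each record with its source."""
    return [
        dict(match.groupdict(), source=database)
        for match in PATTERNS[database].finditer(html)
    ]


def integrated_search(query, databases):
    """Query all selected databases concurrently and merge the extracted records."""
    if not databases:
        return []
    with ThreadPoolExecutor(max_workers=len(databases)) as pool:
        # pool.map preserves the order of `databases` in its results.
        pages = list(pool.map(lambda db: fetch(db, query), databases))
    merged = []
    for database, html in zip(databases, pages):
        merged.extend(extract(database, html))
    return merged


if __name__ == "__main__":
    for record in integrated_search("information extraction", ["vendor_a", "vendor_b"]):
        print(record)
```

Because the requests run concurrently, the total wait in such a design is bounded by the slowest database rather than by the sum of all response times, which is the kind of saving the retrieval-time comparison in the abstract refers to.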
Acknowledgements
Chinese Abstract
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 WWW on Internet
1.2 Library on WWW
1.3 Approaches
1.4 Research Objective
1.5 Organization
Chapter 2 Infrastructure of WWW
2.1 Concept of HTTP
2.2 HTTP Message
2.2.1 Message Types
2.2.2 Message Headers
2.2.3 Message Body
2.4 Authentication
2.5 Proxy Server
Chapter 3 Concept of information extraction
3.1 Introduction to information extraction
3.2 Extracting the information
3.3 PAT Trees
Chapter 4 Approach
4.1 Integrating search flow
4.2 Information extraction
4.2.1 Pattern description
4.2.2 Extract process
4.3 Result layout
4.4 Architecture
Chapter 5 Experiment result and contribution
Chapter 6 Conclusion
Reference
Appendix
[1] A. Kingoff, "Comparing Internet search engines," Computer, vol. 30, no. 4, April 1997, pp. 117-118.
[2] Sibel Adali, Corey Bufi and Yaowadee Temtanapat, "Integrated Search Engine," IEEE Knowledge and Data Engineering Exchange Workshop Proceedings, 1997.
[3] Sudhir Aggarwal, Fuyng Hung and Weiyi Meng, "WIRE - A WWW-based Information Retrieval and Extraction System," Proceedings of the Ninth International Workshop on Database and Expert Systems Applications, 1998, pp. 887-892.
[4] Hsinchun Chen, Yi-Ming Chung, Marshall Ramsey, Christopher C. Yang, Pai-Chun Ma and Jerome Yen, "Intelligent Spider for Internet Searching," IEEE Internet Computing, 1997.
[5] R. Fielding, J. Gettys, J. Mogul, H. Frystyk and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1," Network Working Group RFC 2068, January 1997.
[6] Chia Hui Chang and Chun Nan Hsu, "Automatic Extraction of Information Blocks Using PAT Trees," NCS '99.
[7] N. Kushmerick, D. Weld and R. Doorenbos, "Wrapper induction for information extraction," Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI), 1997.
[8] Steve Lawrence and C. Lee Giles, "Context and Page Analysis for Improved Web Search," IEEE Internet Computing, August 1998.
[9] William R. Tuthill, Wais Inc, "Don't Get Caught in the Web: A Fieldguide to Searching the Net," Proceedings of COMPCON, 1996.
[10] Weiguang Shao, Wei-Tek Tsai, Sanjai Rayadurgam and Robert Lai, "An Agent Architecture for Supporting Individualized Services in Internet Application," IEEE, 1998.
[11] Sun Wu and Chang-Chain Liao, "Virtual Proxy Servers for WWW and Intelligent Agents on the Internet," IEEE System Sciences Proceedings, 1997.
[12] W. May, "An integrated architecture for exploring, wrapping, mediating and restructuring information from the Web," Proceedings of the 11th Australasian Database Conference (ADC 2000), 1999, pp. 82-89.