National Digital Library of Theses and Dissertations in Taiwan

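Graduate student: 蘇昱志
Graduate student (English name): Yu-chih Su
Thesis title: 以本體知識庫為基礎擷取全球網資訊以建立語意網詮釋性資料
Thesis title (English): Ontology-Driven Web Information Extraction for the Creation of Metadata in Semantic Web
Advisor: 葉慶隆
Advisor (English name): Ching-Long Yeh
Degree: Master's
Institution: Tatung University
Department: Department of Computer Science and Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Publication year: 2005
Graduation academic year: 93 (ROC calendar; 2004-2005)
Language: English
Number of pages: 74
Keywords (Chinese): 資訊擷取 (information extraction); 語義網 (Semantic Web)
Keywords (English): Information Extraction; Semantic Web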
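Abstract (translated from the Chinese): As information on the web explodes, people need an effective way to extract the information they actually need. The Semantic Web builds a metadata layer on top of the current World Wide Web (WWW) and uses metadata to describe the resources on it. It extends the current web structure by giving information a well-defined meaning, so that people and computers can work together to process information.
In this thesis, we design and implement a system that extracts domain events and provides semantic services. The architecture has three parts: back-end extraction components, an ontology-based store, and a service front end. The back end contains several components for extracting domain events. The ontology-based store acts as a common interface that converts the extracted domain events into data in specific formats and saves the converted data in the corresponding repositories. The service front end provides several semantic services. After building the whole system, we evaluate it by extracting events from a specific domain and examine which factors affect the extraction results.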
Abstract (English): With the information explosion on the web, people need an efficient way to extract the information they really need. The Semantic Web is an emerging technology that builds a metadata layer on top of the current web and uses a metadata description language to describe resources on the WWW. It is an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.
In this thesis, we design and implement a system that extracts domain events from a large number of relevant documents and provides semantic services. The architecture consists of three parts: the Back End Extraction Components, the Ontology-based Store, and the Service Front End. The Back End consists of several components used to extract the domain events. The Ontology-based Store serves as a common interface that takes extracted domain events as input, exports them in specific data formats, and provides a dedicated repository for each format. The Service Front End provides several semantic services. After building the whole system, we evaluate it by extracting specific domain events from the relevant documents and identify the factors that influence the extraction results.
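The three-part architecture described in the abstract can be pictured with a minimal sketch. Everything below is an illustrative assumption and not the thesis's implementation: the `DomainEvent` class, the regular expression standing in for the extraction components, the `http://example.org/event#` namespace, and the in-memory `TripleStore` are all hypothetical names chosen for the example. The actual back end, store, and front end of the thesis are not shown on this page.

```python
import re
from dataclasses import dataclass

# Hypothetical namespace for illustration only; not taken from the thesis.
EX = "http://example.org/event#"


@dataclass(frozen=True)
class DomainEvent:
    """One extracted event: a subject, an action, and an object."""
    subject: str
    action: str
    obj: str


# Back end: a toy pattern standing in for the real extraction components.
EVENT_PATTERN = re.compile(
    r"(?P<subject>[A-Z]\w+) (?P<action>acquired|released) (?P<obj>[A-Z]\w+)"
)


def extract_events(text):
    """Return every DomainEvent matched in the text."""
    return [DomainEvent(m["subject"], m["action"], m["obj"])
            for m in EVENT_PATTERN.finditer(text)]


class TripleStore:
    """Ontology-based store reduced to a set of (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()

    def add_event(self, event):
        """Convert an extracted event into one triple and keep it."""
        self.triples.add((EX + event.subject, EX + event.action, EX + event.obj))

    def to_ntriples(self):
        """Export the stored triples in N-Triples syntax, sorted for stable output."""
        return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(self.triples))

    def query(self, predicate):
        """Front-end style lookup: all (subject, object) pairs for one predicate."""
        return sorted((s, o) for s, p, o in self.triples if p == EX + predicate)


def build_store(documents):
    """Run extraction over each document and load the results into a store."""
    store = TripleStore()
    for doc in documents:
        for event in extract_events(doc):
            store.add_event(event)
    return store
```

A call such as `build_store(["Acme acquired Globex last year."]).query("acquired")` returns the matching subject and object URIs. Keeping the store behind a small interface mirrors the abstract's point that the store is the common boundary between extraction and the semantic services.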
TABLE OF CONTENTS
THESIS FOR MASTER OF SCIENCE
ACKNOWLEDGMENTS
ABSTRACT (CHINESE)
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1
1.1 MOTIVATION AND GOAL
1.2 RESEARCH METHODOLOGY
1.3 OVERVIEW OF THE SYSTEM ARCHITECTURE
1.4 SCOPE OF THE THESIS
1.5 CONTRIBUTIONS
1.6 THESIS ORGANIZATION
CHAPTER 2
2.1 OVERVIEW OF THE SEMANTIC WEB
2.2 ONTOLOGY
2.3 THE DESCRIPTION LANGUAGE
2.3.1 XML
2.3.2 RDF
2.4 IE TECHNOLOGY
2.4.1 FASTUS
2.4.2 ANNIE
2.5 POS TAGGING
2.6 JAPE
CHAPTER 3
3.1 USE CASE DESCRIPTIONS
3.2 REQUIREMENTS ANALYSIS
3.2.1 Front-end functions
3.2.2 Back-end functions
3.2.3 Ontology based functions
CHAPTER 4
4.1 THE ESSENTIAL CONCEPTS OF BRILL TAGGER
4.2 TRAINING ALGORITHM OF THE BRILL TAGGER
4.3 EVALUATION
CHAPTER 5
5.1 INFORMATION EXTRACTION COMPONENTS
5.2 DOMAIN EVENTS EXTRACTION DESIGN
5.2.1 Identifier Preparation
5.2.2 Domain events matching
5.3 EXPORTATION
CHAPTER 6
6.1 INFORMATION EXTRACTION COMPONENTS
6.2 DOMAIN EVENTS EXTRACTION
6.2.1 Identifier Preparation
6.2.2 Domain Events Matching
6.3 DOMAIN EVENTS EXPORTATION
6.4 RDF TO DB STORE TRANSFORMATION
6.5 FRONT END SERVICES
6.6 EVALUATION
CHAPTER 7
7.1 CONCLUSION
7.2 FUTURE WORK
BIBLIOGRAPHY