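Author: Yu-Min Kuo (郭育旻)
Thesis title: A Study On Biographical and Definitional Question Answering (生平類及定義類自動問答之研究)
Advisor: Chuan-Jie Lin (林川傑)
Degree: Master's
Institution: National Taiwan Ocean University (國立臺灣海洋大學)
Department: Department of Computer Science and Engineering (資訊工程學系)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2010
Academic year of graduation: 98
Language: English
Pages: 42
Keywords (translated from the Chinese): question answering system; complex question; biographical; definitional
Keywords (English): Question Answering; Complex question; Biographical question answering; Definitional question answering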
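Abstract (translated from the Chinese):
The aim of this thesis is to design an automatic question answering system that can answer biographical and definitional questions. Both are complex question types: their answers have no fixed form and vary in length, which makes them more challenging.

An automatic question answering system for complex questions involves four steps: determining the question type, extracting the focus of the question, retrieving relevant documents with a strategy that depends on the question type, and finding the answer. For the first two steps we experiment on all known types of complex questions; for the last two we study only biographical and definitional questions.

Question types are determined and question foci extracted in two ways, rule-based and statistical. In the rule-based approach, the rules for deciding the question type and the sentence patterns for extracting the question focus are written by hand. The statistical approach learns the expressions common to each question type from a set of questions labeled with their types; the units tried for the statistics are character bigrams, words, and bi-words. The rule-based approach performs better, but the statistical approach copes better with new sentence patterns.

The strategy for biographical questions is to find answers using sentence patterns that describe biographical information. Using pairs of person names and biographical facts obtained from an online encyclopedia, we automatically learn such patterns from a large corpus. To find answers, we first retrieve the articles in which the person's name appears and then match the biographical patterns sentence by sentence. We also use machine learning to learn the reliability of each biographical pattern.

The strategy for definitional questions is to find answers by exploiting the vast amount of information on the web. We first write a few basic definitional patterns and use a search engine to retrieve many candidate definitional sentences. To find answers, we first retrieve the articles in which the term to be defined appears and then, sentence by sentence, compare the similarity between the definitional information and each sentence. We also evaluate the performance of various similarity measures.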
Abstract (English):
The goal of this thesis is to design a question answering (QA) system that answers biographical and definitional questions. Both are complex question types: their answers take no fixed form and vary in length, which makes them challenging.

A complex QA system has four modules: answer type classification, question focus extraction, relevant document retrieval, and answer extraction. The last two use a strategy that depends on the answer type. The first two modules are developed for all known kinds of complex questions; the last two are designed only for biographical and definitional questions.
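The four-module flow can be pictured as a small dispatcher. The following is only a minimal sketch under stated assumptions: the function name `answer_question`, the type aliases, and the idea of passing each module in as a callable are illustrative choices, not the thesis's actual interfaces.

```python
from typing import Callable, Dict, List, Tuple

# Type aliases for the two type-specific modules (names are illustrative).
Retriever = Callable[[str], List[str]]             # focus -> documents
Extractor = Callable[[str, List[str]], List[str]]  # focus, documents -> answers


def answer_question(
    question: str,
    classify: Callable[[str], str],
    extract_focus: Callable[[str, str], str],
    strategies: Dict[str, Tuple[Retriever, Extractor]],
) -> List[str]:
    """Run the four modules in order and return candidate answers."""
    answer_type = classify(question)               # 1. answer type classification
    focus = extract_focus(question, answer_type)   # 2. question focus extraction
    if answer_type not in strategies:
        return []                                  # no strategy for this type
    retrieve, extract = strategies[answer_type]    # strategy chosen by answer type
    documents = retrieve(focus)                    # 3. relevant document retrieval
    return extract(focus, documents)               # 4. answer extraction
```

Keeping retrieval and extraction in a per-type table mirrors the statement that only these two modules are specific to biographical and definitional questions.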

We experimented with a rule-based system and several statistical methods for determining the answer type and extracting the question focus. The rules are hand-crafted patterns that decide the answer type and locate the question focus. The statistical systems are trained on a set of annotated questions to learn frequent clue terms for each answer type; the clue units tested include character bigrams, words, and bi-words. The rule-based system performs better, but the statistical systems are more robust to new question patterns.
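To make the statistical approach concrete, here is a minimal sketch of clue-based answer type classification over the three clue units. The abstract does not give the scoring formula, so the add-one-smoothed log-likelihood used here is an assumption, as are the function names and the type labels in the usage note.

```python
import math
from collections import Counter, defaultdict


def char_bigrams(question):
    """Character bigrams, ignoring whitespace."""
    chars = [c for c in question if not c.isspace()]
    return ["".join(chars[i:i + 2]) for i in range(len(chars) - 1)]


def words(question):
    """Lowercased whitespace-separated words."""
    return question.lower().split()


def bi_words(question):
    """Pairs of adjacent words."""
    w = words(question)
    return [f"{a} {b}" for a, b in zip(w, w[1:])]


def train(labeled_questions, unit):
    """Count how often each clue occurs under each answer type."""
    counts = defaultdict(Counter)
    for question, answer_type in labeled_questions:
        counts[answer_type].update(unit(question))
    return counts


def classify(question, counts, unit):
    """Pick the type whose clues best explain the question.

    Scoring is an assumed add-one-smoothed log-likelihood with equal priors.
    """
    vocab = {clue for c in counts.values() for clue in c}
    best_type, best_score = None, -math.inf
    for answer_type, c in counts.items():
        total = sum(c.values())
        score = sum(
            math.log((c[clue] + 1) / (total + len(vocab)))
            for clue in unit(question)
        )
        if score > best_score:
            best_type, best_score = answer_type, score
    return best_type
```

Calling `train(data, words)` and then `classify(q, model, words)` uses word clues; passing `char_bigrams` or `bi_words` instead switches the clue unit without changing anything else.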

Answers to biographical questions are found with biographical patterns. Patterns that are often used to describe biographical information are learned automatically from an online encyclopedia. To find answers, documents containing the target person's name are retrieved first, and the biographical patterns are then matched against each sentence in those documents. We also use machine learning to assess the reliability of these biographical patterns.
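The sentence-level matching step might look like the sketch below. The three patterns and their weights are hand-written placeholders invented for illustration; in the thesis the patterns are learned automatically and their reliability is estimated by machine learning. The English regular expressions and the name `extract_biography` are likewise assumptions.

```python
import re

# Hypothetical patterns: (label, regex template, made-up reliability weight).
PATTERNS = [
    ("birth", r"{name} was born (?:in|on) (?P<answer>[^,.]+)", 0.9),
    ("death", r"{name} died (?:in|on) (?P<answer>[^,.]+)", 0.9),
    ("occupation", r"{name},? (?:is|was) an? (?P<answer>[^,.]+)", 0.6),
]


def split_sentences(document):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]


def extract_biography(name, documents):
    """Match every pattern against every sentence of documents naming the person.

    Returns (weight, label, answer) tuples, highest weight first.
    """
    found = []
    for document in documents:
        if name not in document:      # keep only documents that mention the name
            continue
        for sentence in split_sentences(document):
            for label, template, weight in PATTERNS:
                regex = template.format(name=re.escape(name))
                match = re.search(regex, sentence)
                if match:
                    found.append((weight, label, match.group("answer").strip()))
    return sorted(found, reverse=True)
```

Sorting by weight stands in for the reliability ranking: answers produced by more trustworthy patterns come first.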

Answers to definitional questions are found by exploiting web data. Several basic definitional patterns are used to retrieve possible definitional information from the web through a search engine. To find answers, documents containing the target term are retrieved first, and the similarity between the retrieved definitional information and each sentence in those documents is then measured. Several similarity functions were tested to find the best-performing one.
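The similarity step can be sketched as ranking candidate sentences against the web-derived clue sentences. The abstract does not name the similarity functions that were tested, so cosine and Jaccard over bags of words are assumed stand-ins here, and taking the maximum over clue sentences is also an assumption.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercased word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between bag-of-words count vectors."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Jaccard overlap between the two token sets."""
    sa, sb = set(tokens(a)), set(tokens(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def rank_candidates(clue_sentences, candidate_sentences, similarity=cosine):
    """Score each candidate by its best match among the clue sentences."""
    scored = [
        (max((similarity(c, clue) for clue in clue_sentences), default=0.0), c)
        for c in candidate_sentences
    ]
    return sorted(scored, reverse=True)
```

Because `similarity` is a parameter, swapping `cosine` for `jaccard` (or any other function of two strings) is a one-argument change, which is convenient when comparing several measures.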
Chinese Abstract
Abstract
Acknowledgement
Table of Contents
List of Tables
List of Figures

1 Introduction
1.1 Problem Definition
1.2 System Architecture
2 Question Analysis
2.1 Answer Type Definition
2.2 Rule-based Module
2.3 Clue-Detecting Modules
2.3.1 Clue-Term Learning
2.3.2 Answer Type Classification
2.3.3 Query Term Extraction
2.4 Voting Module
3 Biographical Question Answering
3.1 Wikipedia Approaches
3.1.1 Wikipedia Article Similarity
3.1.2 Wikipedia Infobox Information
3.2 Biographical Pattern Approach
3.2.1 Learning Biographical Patterns
3.2.2 Answer Candidate Selection and Answer Extraction
4 Definitional Question Answering
4.1 Wikipedia Approach
4.2 Clue Sentence Approach
4.2.1 External Snippets Retrieval
4.2.2 Clue Sentences Selection
4.2.3 Answer Candidates Selection and Answer Extraction
5 Experiments
5.1 Dataset Preparation
5.1.1 NTCIR Dataset
5.1.2 Term Frequency Data
5.1.3 ClueWeb09 Dataset
5.2 Question Analysis Performance
5.2.1 Answer Type Classification
5.2.2 Query Terms Extraction
5.3 Biographical Question Answering Performance
5.3.1 Wikipedia Approaches
5.3.2 Biographical Pattern Approach
5.4 Definitional Question Answering Performance
5.4.1 Wikipedia Approach
5.4.2 Web Approach
6 Conclusion