臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)

Researcher: 梁玉婷
Researcher (English): Yu-Ting Liang
Thesis title: 意見問答系統中問題分析與答案區段選取之研究
Thesis title (English): Question Analysis and Answer Passages Retrieval for Opinion Question Answering Systems
Advisor: 陳信希
Advisor (English): Hsin-Hsi Chen
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Computer Science and Information Engineering
Discipline: Engineering
Academic field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2007
Graduation academic year: 95 (ROC calendar, 2006-2007)
Language: English
Pages: 113
Keywords (Chinese): 意見問答系統、自動問答系統、問題分析、答案區段選取、意見擷取
Keywords (English): opinion question answering system, question answering, question analysis, answer passages retrieval, opinion extraction
Question answering is a very active research topic in natural language processing. Users pose questions in natural language, and a question answering system combines a variety of natural language processing techniques to retrieve relevant answers for them quickly and efficiently. Users are interested not only in asking about facts but also in asking about opinions. This thesis designs and implements an opinion question answering system for answering questions about people's opinions, feelings, or thoughts. A question answering system generally consists of three main components: question analysis, answer passage retrieval, and answer extraction. Our work focuses on the first two, namely question analysis and answer passage retrieval.
Question analysis involves three main tasks: determining the question type, the question focus, and the opinion polarity of the question. We define six opinion question types and propose a two-layered question classifier. The first-layer classifier decides whether a question asks for facts or for opinions; if it asks for opinions, the second-layer classifier further assigns it to one of the six defined opinion question types. These two classifiers achieve F-measures of 87.8% and 92.5%, respectively. We then discuss methods for determining the question focus and the question polarity. The extracted focus is submitted as a query to an information retrieval system to find sentences relevant to the question as candidate answer sentences; the question polarity is compared against the candidate answer sentences so that only those with the same opinion polarity are retained.
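The F-measures quoted here and in the English abstract below are presumably the standard harmonic mean of precision and recall; the following minimal Python sketch assumes that standard definition rather than anything stated on this page, and the numbers in the example are illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Standard F1 score: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Illustrative numbers only: precision 0.90 and recall 0.86
# give an F-measure of roughly 0.88.
print(f_measure(0.90, 0.86))  # ~0.8795
```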
Answer passage retrieval also involves three main tasks. For each relevant sentence returned by the information retrieval system, we determine whether the focus (focus scope identification) lies within an opinion scope (opinion scope identification), and if so, we further check whether the polarity of that scope matches the polarity of the question (polarity detection). The experiments cover eighteen different combinations. The best model, using partial focus matching, achieves an F-measure of 40.59% at the opinion scope level. When the effect of relevance is factored out, the F-measure of the best model rises to 87.18%. We further group the questions by topic and repeat the experiments; the results show that questions on different topics pose different degrees of difficulty.
Finally, we summarize the experimental results and raise several interesting open issues for future research toward a complete opinion question answering system.
Question answering (QA) systems provide an elegant way for people to access an underlying knowledge base. People are interested not only in factual questions but also in opinions. In this thesis, an opinion QA system dealing with opinion questions is proposed. We investigate techniques for question analysis and answer passages retrieval.
For question analysis, six opinion question types are defined. A two-layered framework utilizing two question type classifiers is proposed, and algorithms for both classifiers are discussed. The classifiers achieve F-measures of 87.8% for general question classification and 92.5% for opinion question classification. The question's focus and polarity are also detected, both to form an IR query and to sieve out relevant sentences that share the question's polarity.
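As an illustration only, the following Python sketch shows how such a two-layered pipeline could be wired together: a first classifier separates opinion questions from factual ones, and a second assigns an opinion question to one of six types. The cue-word lists, type labels, and scoring below are hypothetical placeholders, not the features or algorithms actually used in the thesis.

```python
# Hypothetical sketch of a two-layered question classifier (illustration only).
# Layer 1 decides fact vs. opinion; layer 2 assigns one of six opinion
# question types. The cue lists and type labels are made-up placeholders,
# not the features or algorithm used in the thesis.

OPINION_CUES = {"think", "thinks", "feel", "opinion", "believe",
                "support", "supports", "oppose", "opposes"}

OPINION_TYPES = {
    "HOLDER":   ["who supports", "who opposes", "who thinks", "who believes"],
    "TARGET":   ["about what", "toward what"],
    "ATTITUDE": ["what is the attitude", "how does", "what does"],
    "REASON":   ["why"],
    "MAJORITY": ["do most", "do people"],
    "YES_NO":   ["is it", "should", "does"],
}

def classify_general(question: str) -> str:
    """Layer 1: label a question as 'opinion' or 'fact' by cue-word lookup."""
    tokens = set(question.lower().replace("?", "").split())
    return "opinion" if tokens & OPINION_CUES else "fact"

def classify_opinion_type(question: str) -> str:
    """Layer 2: pick the opinion question type whose cue phrases match most often."""
    q = question.lower()
    scores = {label: sum(cue in q for cue in cues)
              for label, cues in OPINION_TYPES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "YES_NO"  # fall back to a default type

def classify(question: str) -> str:
    if classify_general(question) == "fact":
        return "FACT"
    return classify_opinion_type(question)

if __name__ == "__main__":
    print(classify("Who supports the construction of the new dam?"))  # HOLDER
    print(classify("When was National Taiwan University founded?"))   # FACT
```

In the thesis the two layers are evaluated separately (87.8% and 92.5% F-measure); the sketch simply shows why a cheap fact/opinion filter in front of a finer-grained opinion classifier keeps the second stage's label space small.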
For answer passages retrieval, three components are introduced. For each relevant sentence retrieved by the IR system, we determine whether the focus (Focus Scope Identification) falls within an opinion text span (Opinion Scope Identification) and, if so, whether the polarity of that span matches the polarity of the question (Polarity Detection). A total of 18 combinations are proposed and evaluated. The best model achieves an F-measure of 40.59% using partial match at the boundary level. With relevance issues removed, the F-measure of the best model rises to 87.18%. We further break down the results by topic; the results indicate that different topics pose different levels of difficulty.
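To make the three steps concrete, here is a deliberately simplified Python sketch of the filtering logic described above. It reduces opinion scope identification to a sentence-level lexicon lookup and uses made-up word lists; the thesis's actual exact/partial/lenient matching, scope levels, and polarity scoring are more elaborate, so treat this as a sketch of the idea rather than the implementation.

```python
# Hypothetical sketch of the answer-passage filtering steps:
# (1) Focus Scope Identification: does the question focus appear in the
#     sentence (exact or partial match)?
# (2) Opinion Scope Identification: is the sentence opinionated at all?
#     (simplified here to a sentence-level lexicon lookup)
# (3) Polarity Detection: does the sentence polarity match the question's?
# The lexicons and scoring are illustrative placeholders only.

POSITIVE = {"support", "good", "great", "agree", "favor"}
NEGATIVE = {"oppose", "bad", "terrible", "disagree", "against"}

def focus_matches(focus: str, sentence: str, partial: bool = True) -> bool:
    """Exact match requires the whole focus phrase; partial match accepts any focus word."""
    s = sentence.lower()
    if not partial:
        return focus.lower() in s
    return any(word in s for word in focus.lower().split())

def opinion_polarity(sentence: str):
    """Return 'positive', 'negative', or None if the sentence looks non-opinionated."""
    tokens = sentence.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return None

def select_passages(focus, question_polarity, sentences, partial=True):
    """Keep sentences that contain the focus, are opinionated, and match the question polarity."""
    kept = []
    for s in sentences:
        if not focus_matches(focus, s, partial):
            continue                      # fails focus scope identification
        pol = opinion_polarity(s)
        if pol is None:
            continue                      # fails opinion scope identification
        if pol == question_polarity:
            kept.append(s)                # passes polarity detection
    return kept

if __name__ == "__main__":
    candidates = [
        "Many residents support the new dam project.",
        "The dam was completed in 1999.",
        "Environmental groups strongly oppose the dam.",
    ]
    print(select_passages("dam project", "positive", candidates))
```

The 18 combinations reported in the thesis come from varying how each of these three stages is realized; the sketch fixes one (simplistic) choice per stage just to show how the stages chain together.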
We conclude with several open and interesting problems for future study toward building a complete opinion QA system.
Table of Contents
Thesis Committee Certification i
Acknowledgements ii
Chinese Abstract iii
Abstract iv

Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Related Work 2
1.3 Main Issues 5
1.4 Thesis Structure 5

Chapter 2 Opinion Question Answering System Framework 7
2.1 Overview 7
2.2 Part-of-Speech Tagger 8
2.3 Question Analysis 8
2.3.1 Two-Layered Question Classification 9
2.3.2 Question Focus Extraction and Polarity Detection 10
2.4 Answer Passages Retrieval 11

Chapter 3 Experimental Corpus 13
3.1 Sources 13
3.1.1 TREC 13
3.1.2 NTCIR 15
3.1.3 Internet Polls 16
3.1.3.1 Chinatimes 16
3.1.3.2 Era 16
3.1.3.3 TVBS 17
3.1.4 OPQ 17
3.2 Questions Collection 19
3.2.1 TREC 19
3.2.2 NTCIR 19
3.2.3 Internet Polls 20
3.2.3.1 Chinatimes 20
3.2.3.2 Era 21
3.2.3.3 TVBS 22
3.2.3.4 Statistics for Polls 22
3.2.4 OPQ 23
3.2.5 Overall Statistics 24
3.3 Answers Annotation 24
3.3.1 Questions Analysis 25
3.3.1.1 Three Challenges on Holders 25
3.3.1.2 Two Challenges on Opinions 26
3.3.1.3 Two Challenges on Concepts 27
3.3.2 Questions Selection Strategy 29
3.3.3 Answers Annotation of OPQ 31

Chapter 4 Two-Layered Question Classification 33
4.1 The Purpose of Two Layers 33
4.2 Opinion Questions Taxonomy 35
4.3 Q-Classifier 37
4.3.1 Algorithm 38
4.3.2 Experiment 42
4.3.2.1 Experiment Setup 42
4.3.2.2 Experiment Results and Discussions 42
4.4 OPQ-Classifier 46
4.4.1 Algorithm 46
4.4.1.1 Grammatical Forms 46
4.4.1.2 Scoring Function 47
4.4.1.2.1 Feature Extraction 48
4.4.1.2.2 Opinion Question Type Assignation 51
4.4.2 Experiment 51
4.4.2.1 Experiment Setup 51
4.4.2.2 Experiment Results and Discussions 52

Chapter 5 Answer Passages Retrieval 55
5.1 Question Focus Extraction 55
5.2 Question Polarity Detection 57
5.3 Focus Scope Identification 59
5.3.1 Exact Match 59
5.3.2 Partial Match 59
5.3.3 Lenient 60
5.4 Opinion Scope Identification 62
5.4.1 Sentence Level 62
5.4.2 Subsentence Level 62
5.4.3 Boundary Level 62
5.4.4 Determining Whether the Opinion is Toward the Focus 66
5.5 Polarity Detection 67
5.5.1 Opinion Score Approach 67
5.5.2 Action Approach 69
5.6 Combined Approaches 72
5.6.1 Combined Approach on Focus Scope Identification 72
5.6.2 Combined Approach on Polarity Detection 73
5.7 Experiments 73
5.7.1 Evaluation Metrics 74
5.7.2 Experiment on Focus Scope Identification 75
5.7.2.1 Experiment Setup 75
5.7.2.2 Experiment Results and Discussions 75
5.7.3 Experiment on Opinion Scope Identification 78
5.7.3.1 Experiment Setup 78
5.7.3.2 Experiment Results and Discussions 78
5.7.4 Experiment on Polarity Detection 81
5.7.4.1 Experiment Setup 81
5.7.4.2 Experiment Results and Discussions 82
5.7.5 Experiment on Corpus of Different Relevance Degrees 84
5.7.5.1 Experiment Setup 85
5.7.5.2 Experiment Results and Discussions 85
5.7.6 Experiment on Combined Approaches 87
5.7.6.1 Experiment Setup 88
5.7.6.2 Experiment Results and Discussions 88
5.7.7 Experiment on Questions of Different Topics 89
5.7.7.1 Experiment Setup 90
5.7.7.2 Experiment Results and Discussions 90
5.7.8 Overall Error Analysis 92

Chapter 6 Conclusion and Future Works 97
6.1 Conclusion 97
6.2 Future Works 100


References 103
Appendix 107
I Details of the 60 Selected Opinion Questions 107
II A List of Linking Elements 109
III Action Seed Vocabulary of Do’s 111
IV Action Seed Vocabulary of Don’ts 113