Author: 賴郁伶
Author (English): Yu-Ling Lai
Title (Chinese): PADMA:雙向多頭偕同注意力實現多選項之閱讀理解應用於數位學習歷史科目
Title (English): PADMA: Dual Multi-head Co-attention Multi-choice Reading Comprehension on History Subject for E-Learning
Advisor: 吳曉光
Advisor (English): Hsiao-Kuang Wu
Degree: Master's
Institution: National Central University (國立中央大學)
Department: Computer Science and Information Engineering (資訊工程學系)
Discipline: Engineering
Field: Electrical and Computer Engineering
Year of Publication: 2020
Academic Year of Graduation: 108 (ROC calendar; 2019-2020)
Language: English
Pages: 39
Keywords (Chinese): 深度學習、多選項閱讀理解、自然語言處理
Keywords (English): Deep Learning; Multi-choice Machine Reading Comprehension; Natural Language Processing
Usage statistics:
  • Cited by: 0
  • Views: 51
  • Downloads: 0
  • Bookmarked: 0
Abstract (Chinese):
In e-learning, a question-answering system typically serves as a virtual teaching assistant, easing the teacher's workload and providing students with help at any time. Given a passage and a question, a multi-choice reading comprehension task requires the model to predict the correct answer from a set of options. Besides using a strong language model as the encoder, the task often requires comparing information across the passage, the question, and the options to capture the relevance among the three; moreover, many current methods consider only the question-aware passage representation and neglect the passage-aware question representation.
This thesis proposes a multiple-choice question-answering system that uses middle-school textbooks as its knowledge source and converts the passage, question, and option sequences into contextualized vector representations through an encoder. Simulating how humans reason when answering, we add two solving strategies: (1) passage sentence selection, which extracts the passage sentences most relevant to the question, and (2) answer option interaction, which encodes the four options after they exchange information with one another; both yield better results on our question bank. A dual multi-head co-attention mechanism then produces advanced representations, and a classifier generates the predicted answer. Experimental results on a public dataset and on our history question bank show that our model performs better than the baseline models.
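The passage-sentence-selection strategy above can be pictured with a small sketch. The following PyTorch snippet is only an illustration under assumed details: it scores each encoded passage sentence against the question by cosine similarity and keeps the top-k. The function name, the mean-pooled 768-dimensional vectors, and the similarity measure are my assumptions for illustration, not the thesis's actual method.

```python
import torch
import torch.nn.functional as F

def select_relevant_sentences(sentence_vecs: torch.Tensor,
                              question_vec: torch.Tensor,
                              top_k: int = 3) -> torch.Tensor:
    """Return indices of the top-k passage sentences most similar to the
    question (hypothetical scoring rule: cosine similarity between
    pooled encoder vectors)."""
    # sentence_vecs: (num_sentences, hidden); question_vec: (hidden,)
    scores = F.cosine_similarity(sentence_vecs, question_vec.unsqueeze(0), dim=-1)
    return scores.topk(min(top_k, sentence_vecs.size(0))).indices

# Toy usage with random stand-ins for the encoder outputs of a
# 10-sentence passage.
sentences = torch.randn(10, 768)
question = torch.randn(768)
kept = select_relevant_sentences(sentences, question, top_k=3)
```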
Abstract (English):
Given a passage and a question, a multi-choice reading comprehension task requires a model to predict the correct answer from a set of candidate answer options. Beyond using a strong language model as the encoder, the task usually requires comparing information among the passage, the question, and the answer options to capture their mutual relevance. Most existing methods consider the question-aware passage representation but not the passage-aware question representation.
In this thesis, we present PADMA, which stands for Passage sentence selection and Answer option interaction integrated with Dual Multi-head co-Attention. PADMA is a multi-choice question-answering system that collects middle-school textbooks as a knowledge source and encodes the sequence formed from the passage, the question, and each answer option into contextualized vector representations. We simulate the way humans solve multi-choice problems and integrate two reading strategies: (1) passage sentence selection, which identifies the passage sentences most relevant to the question; and (2) answer option interaction, which encodes bilinear representations between every pair of options. A dual multi-head co-attention module then generates the advanced representation, and a decoder computes the answer prediction. Experimental results show that our proposed model achieves better performance than the baseline models.
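To make the "dual" direction of the co-attention concrete, here is a minimal PyTorch sketch that attends both ways, so the passage-aware question representation is kept alongside the question-aware passage representation that most prior methods stop at. The class name, pooling, fusion, and all hyperparameters are illustrative assumptions, not PADMA's actual implementation.

```python
import torch
import torch.nn as nn

class DualMultiHeadCoAttention(nn.Module):
    """Sketch of a dual multi-head co-attention block: attends in both
    directions (passage -> question and question -> passage), so neither
    the question-aware passage representation nor the passage-aware
    question representation is discarded."""

    def __init__(self, hidden_size: int = 768, num_heads: int = 8):
        super().__init__()
        # Passage queries attend over question keys/values, and vice versa.
        self.p_to_q = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.q_to_p = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)

    def forward(self, passage: torch.Tensor, question: torch.Tensor) -> torch.Tensor:
        # passage:  (batch, passage_len, hidden)
        # question: (batch, question_len, hidden)
        q_aware_passage, _ = self.p_to_q(passage, question, question)
        p_aware_question, _ = self.q_to_p(question, passage, passage)
        # Pool each direction and fuse into one matching vector.
        fused = torch.cat([q_aware_passage.mean(dim=1),
                           p_aware_question.mean(dim=1)], dim=-1)
        return fused  # (batch, 2 * hidden)

# Toy usage: score one (passage, question + option) pair per candidate
# option and pick the highest-scoring option, mirroring the decoder step.
if __name__ == "__main__":
    block = DualMultiHeadCoAttention(hidden_size=768, num_heads=8)
    passage = torch.randn(4, 120, 768)   # 4 options, encoded passage tokens
    question = torch.randn(4, 20, 768)   # encoded question (+ option) tokens
    scores = nn.Linear(2 * 768, 1)(block(passage, question)).squeeze(-1)
    prediction = scores.argmax()         # index of the predicted option
```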
Abstract (Chinese) I
Abstract (English) II
Acknowledgements III
Table of Contents IV
List of Figures VI
List of Tables VII
List of Equations VIII
1. Introduction 1
1.1 Background 1
1.2 Proposed Goal 2
1.3 Organization of Thesis 4
2. Related Works 5
2.1 Language Models 5
2.2 Multi-choice Machine Reading Comprehension 6
3. Model 8
3.1 Model Architecture 8
3.2 Task Definition 10
3.3 Encoder 10
3.4 Passage Sentence Selection 11
3.5 Answer Option Interaction 13
3.6 Dual Multi-head Co-Attention 14
3.7 Decoder 15
4. Experiments 16
4.1 Dataset and Evaluation Metrics 16
4.2 Experiment Setup 17
5. Results and Discussion 18
5.1 Input Format 19
5.2 Performance on RACE-M dataset 19
5.3 Performance on History dataset 20
6. Conclusion and Future Works 22
References 23
[1] R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, and P. Kuksa, “Natural language processing (almost) from scratch,” J. Mach. Learn. Res., vol. 12, pp. 2493–2537, Nov. 2011.
[2] P. Rajpurkar, J. Zhang, K. Lopyrev, and P. Liang, “SQuAD: 100,000+ questions for machine comprehension of text,” Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016. [Online]. Available: http://dx.doi.org/10.18653/v1/D16-1264
[3] T. Nguyen, M. Rosenberg, X. Song, J. Gao, S. Tiwary, R. Majumder, and L. Deng, “MS MARCO: A human generated machine reading comprehension dataset,” in Proceedings of the Workshop on Cognitive Computation: Integrating Neural and Symbolic Approaches 2016, co-located with the 30th Annual Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain, December 9, 2016, ser. CEUR Workshop Proceedings, T. R. Besold, A. Bordes, A. S. d’Avila Garcez, and G. Wayne, Eds., vol. 1773. CEUR-WS.org, 2016. [Online]. Available: http://ceur-ws.org/Vol-1773/CoCoNIPS_2016_paper9.pdf
[4] A. Trischler, T. Wang, X. Yuan, J. Harris, A. Sordoni, P. Bachman, and K. Suleman, “NewsQA: A machine comprehension dataset,” Proceedings of the 2nd Workshop on Representation Learning for NLP, 2017. [Online]. Available: http://dx.doi.org/10.18653/v1/W17-2623
[5] K. Sun, D. Yu, J. Chen, D. Yu, Y. Choi, and C. Cardie, “DREAM: A challenge data set and models for dialogue-based reading comprehension,” Transactions of the Association for Computational Linguistics, vol. 7, pp. 217–231, Mar. 2019. [Online]. Available: http://dx.doi.org/10.1162/tacl_a_00264
[6] G. Lai, Q. Xie, H. Liu, Y. Yang, and E. Hovy, “RACE: Large-scale reading comprehension dataset from examinations,” Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017. [Online]. Available: http://dx.doi.org/10.18653/v1/D17-1082
[7] H. Zhu, F. Wei, B. Qin, and T. Liu, “Hierarchical attention flow for multiple-choice reading comprehension,” in AAAI, 2018.
[8] S. Wang, M. Yu, J. Jiang, and S. Chang, “A co-matching model for multi-choice reading comprehension,” Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2018. [Online]. Available: http://dx.doi.org/10.18653/v1/P18-2118
[9] K. Sun, D. Yu, D. Yu, and C. Cardie, “Improving machine reading comprehension with general reading strategies,” Proceedings of the 2019 Conference of the North, 2019. [Online]. Available: http://dx.doi.org/10.18653/v1/N19-1270
[10] S. Zhang, H. Zhao, Y. Wu, Z. Zhang, X. Zhou, and X. Zhou, “Dual co-matching network for multi-choice reading comprehension,” CoRR, vol. abs/1901.09381, 2019. [Online]. Available: http://dblp.uni-trier.de/db/journals/corr/corr1901.html#abs-1901-09381
[11] S. Zhang, H. Zhao, Y. Wu, Z. Zhang, X. Zhou, and X. Zhou, “DCMN+: Dual co-matching network for multi-choice reading comprehension,” in AAAI. AAAI Press, 2020, pp. 9563–9570. [Online]. Available: http://dblp.uni-trier.de/db/conf/aaai/aaai2020.html#ZhangZW0ZZ20
[12] Q. Ran, P. Li, W. Hu, and J. Zhou, “Option comparison network for multiple-choice reading comprehension,” arXiv preprint arXiv:1903.03033, 2019.
[13] D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” arXiv preprint arXiv:1409.0473, 2014; accepted at ICLR 2015 as an oral presentation. [Online]. Available: http://arxiv.org/abs/1409.0473
[14] P. Zhu, H. Zhao, and X. Li, “Dual multi-head co-attention for multi-choice reading comprehension,” arXiv preprint arXiv:2001.09415, 2020.
[15] J. Howard and S. Ruder, “Universal language model fine-tuning for text classification,” arXiv preprint arXiv:1801.06146, 2018.
[16] M. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer, “Deep contextualized word representations,” Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 2018. [Online]. Available: http://dx.doi.org/10.18653/v1/N18-1202
[17] A. Radford, K. Narasimhan, T. Salimans, and I. Sutskever, “Improving language understanding by generative pre-training,” 2018.
[18] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” in NAACL-HLT (1), J. Burstein, C. Doran, and T. Solorio, Eds. Association for Computational Linguistics, 2019, pp. 4171–4186. [Online]. Available: http://dblp.uni-trier.de/db/conf/naacl/naacl2019-1.html#DevlinCLT19
[19] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances in neural information processing systems, 2017, pp. 5998–6008.
[20] D. Chen, J. Bolton, and C. D. Manning, “A thorough examination of the CNN/Daily Mail reading comprehension task,” Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2016. [Online]. Available: http://dx.doi.org/10.18653/v1/P16-1223
[21] B. Dhingra, H. Liu, Z. Yang, W. Cohen, and R. Salakhutdinov, “Gated-attention readers for text comprehension,” Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2017. [Online]. Available: http://dx.doi.org/10.18653/v1/P17-1168
[22] H. Zhu, F. Wei, B. Qin, and T. Liu, “Hierarchical attention flow for multiple-choice reading comprehension,” 2018. [Online]. Available: https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16331
[23] P.-H. Li, T.-J. Fu, and W.-Y. Ma, “Why attention? Analyze BiLSTM deficiency and its remedies in the case of NER,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 05, pp. 8236–8244, Apr. 2020.
[24] R. K. Srivastava, K. Greff, and J. Schmidhuber, “Training very deep networks,” in Advances in neural information processing systems, 2015, pp. 2377–2385.
[25] “Nationwide junior high and primary school question bank website.” [Online]. Available: https://exam.naer.edu.tw/. Accessed: July 21, 2020.