National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

Student: 黃晴
Student (English): Ching Huang
Title (Chinese): 對抗式零樣本學習於跨語言及跨領域之文字蘊含識別
Title (English): Adversarial Training for Zero-Shot Cross-Lingual and Cross-Domain Textual Entailment Recognition
Advisor: 陳信希
Oral defense committee: 鄭卜壬, 蔡宗翰, 陳冠宇
Oral defense date: 2019-07-17
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Publication year: 2019
Graduation academic year: 107 (2018–2019)
Language: English
Pages: 37
Keywords: textual entailment; zero-shot learning; cross-lingual learning; cross-domain learning; adversarial learning; MultiNLI dataset
DOI: 10.6342/NTU201900744
Statistics:
  • Cited by: 0
  • Views: 270
  • Rating:
  • Downloads: 0
  • Saved to bibliography: 0
Abstract (Chinese):
In recent years, deep neural networks have performed remarkably well on many natural language processing tasks, including textual entailment, the focus of this study. Textual entailment is a classic problem in natural language processing: given a premise sentence, decide whether another sentence is (1) definitely true, (2) definitely false, or (3) unrelated with respect to that premise. English textual entailment datasets such as the Stanford Natural Language Inference (SNLI) and Multi-Genre Natural Language Inference (MultiNLI) corpora provide large amounts of data manually annotated by trained annotators, which makes it possible to train more complex deep learning models. However, these datasets contain only English examples, so languages other than English frequently face a shortage of annotated data for textual entailment. The goal of this study is therefore to use the existing English textual entailment datasets to build a cross-lingual textual entailment model.
Recently, BERT, the multilingual pre-trained sentence representation proposed by Google, has clearly alleviated the problem described above. Using BERT as the pre-trained sentence representation together with zero-shot cross-lingual learning, textual entailment can be handled successfully. This thesis proposes an adversarial training method for zero-shot cross-lingual textual entailment that further narrows the performance gap between the training-set language and the test-set language. Building on the success of the zero-shot cross-lingual model, we further extend it to textual entailment that is both cross-lingual and cross-domain. With the joint cross-lingual and cross-domain training mechanism described in this thesis, the textual entailment model can also be strengthened with unlabeled, non-English data from other domains. Experimental results confirm that, in both scenarios, the proposed adversarial mechanism further improves the performance of BERT-based models.
Abstract (English):
Recently, deep neural networks have achieved astonishing performance on a variety of natural language processing tasks, including the textual entailment recognition (TER) task that this thesis focuses on. TER, also known as natural language inference (NLI), is a classic natural language processing task whose goal is to determine whether a "hypothesis" sentence is true, false, or unrelated given a "premise" sentence. TER datasets such as the Stanford Natural Language Inference (SNLI) corpus and the Multi-Genre Natural Language Inference (MultiNLI) corpus contribute a large amount of annotated data, which has enabled the training of complex deep learning models. However, these corpora contain only English examples, and languages other than English often struggle with insufficient annotated data. This study therefore aims to exploit the available English NLI corpora to build a cross-lingual TER model.
The state-of-the-art multilingual sentence representation BERT, proposed by Google, alleviates this problem when applied to zero-shot cross-lingual TER. This thesis proposes an adversarial training approach that further narrows the gap between the source and target languages. Moreover, building on the success of zero-shot cross-lingual training, we extend the scenario to an adversarial training mechanism for zero-shot cross-lingual and cross-domain TER. With the presented cross-lingual and cross-domain training mechanism, TER models can even utilize unlabeled, out-of-domain, non-English training instances. Experimental results confirm that pre-trained BERT sentence representations still benefit from adversarial training in both scenarios.
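The record does not spell out the model architecture, but the reference list cites Ganin et al. (2016) on domain-adversarial training of neural networks. As a rough, hypothetical sketch only (PyTorch-style; the class names, the 768-dimensional pooled vector assumed for multilingual BERT, and the loss weighting are illustrative assumptions, not details taken from the thesis), a language discriminator attached to the shared sentence representation through a gradient-reversal layer could look like this:

import torch
import torch.nn as nn

# Gradient reversal: identity in the forward pass, negated (and scaled)
# gradient in the backward pass, pushing the shared encoder toward
# language-indistinguishable representations.
class GradientReversal(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

# Two heads over a shared sentence representation (e.g. BERT's pooled [CLS]
# vector): a 3-way entailment classifier trained on labeled English pairs,
# and a language discriminator trained on both languages via gradient reversal.
class AdversarialTER(nn.Module):
    def __init__(self, hidden_size=768, num_labels=3, num_languages=2, lambd=0.1):
        super().__init__()
        self.lambd = lambd
        self.entailment_head = nn.Linear(hidden_size, num_labels)
        self.language_head = nn.Linear(hidden_size, num_languages)

    def forward(self, pooled):                      # pooled: [batch, hidden]
        entail_logits = self.entailment_head(pooled)
        reversed_feat = GradientReversal.apply(pooled, self.lambd)
        lang_logits = self.language_head(reversed_feat)
        return entail_logits, lang_logits

# Toy usage with random tensors standing in for encoder output and labels.
# In practice the pooled vectors come from multilingual BERT, so the reversed
# gradients flow back into the shared encoder.
model = AdversarialTER()
pooled = torch.randn(8, 768, requires_grad=True)
entail_logits, lang_logits = model(pooled)
loss = nn.CrossEntropyLoss()(entail_logits, torch.randint(0, 3, (8,))) + \
       nn.CrossEntropyLoss()(lang_logits, torch.randint(0, 2, (8,)))
loss.backward()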
Acknowledgements ii
Chinese Abstract iii
ABSTRACT v
CONTENTS vii
LIST OF FIGURES ix
LIST OF TABLES x
Chapter 1 Introduction 1
1.1 Textual Entailment Recognition 1
1.1.1 TER Introduction 1
1.1.2 Challenging Issues 2
1.2 Motivation 3
1.3 Goal 4
1.4 Thesis Structure 6
Chapter 2 Related Work 7
2.1 Adversarial Training 7
2.2 Cross-lingual Textual Entailment Recognition 8
2.3 Word and Sentence Representations 8
2.4 Domain Adaptation 9
Chapter 3 Cross-lingual Models 12
3.1 Overview 12
3.2 Network Architecture 12
3.3 Network Optimization 14
Chapter 4 Cross-lingual and Cross-domain Models 16
4.1 Overview 16
4.2 Self-training 17
4.3 Zero-shot Cross-lingual Adversarial Training 18
Chapter 5 Experiments and Discussion 20
5.1 Experimental Settings 20
5.2 Datasets 20
5.3 Zero-shot Cross-lingual Results 22
5.4 Zero-shot Cross-lingual and Cross-domain Results 26
5.5 Parameter Tuning 27
5.6 Discussion 29
Chapter 6 Conclusion and Future Work 32
6.1 Conclusion 32
6.2 Future Work 32

REFERENCE 34
Jorge A Balazs, Edison Marrese-Taylor, Pablo Loyola, and Yutaka Matsuo. 2017. Refining raw sentence representations for textual entailment recognition via attention. arXiv preprint arXiv:1707.03103.
Jeremy Barnes, Roman Klinger, and Sabine Schulte im Walde. 2018. Projecting embeddings for domain adaption: Joint modeling of sentiment analysis in diverse domains. arXiv preprint arXiv:1806.04381.
Samuel R. Bowman, Gabor Angeli, Christopher Potts, and Christopher D. Manning. 2015. A large annotated corpus for learning natural language inference. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics.
Qian Chen, Xiaodan Zhu, Zhen-Hua Ling, Diana Inkpen, and Si Wei. 2018. Neural natural language inference models enhanced with external knowledge. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), volume 1, pages 2406–2417.
Xilun Chen, Yu Sun, Ben Athiwaratkun, Claire Cardie, and Kilian Weinberger. 2016. Adversarial deep averaging networks for cross-lingual sentiment classification. arXiv preprint arXiv:1606.01614.
Alexis Conneau, Guillaume Lample, Ruty Rinott, Adina Williams, Samuel R Bowman, Holger Schwenk, and Veselin Stoyanov. 2018. XNLI: Evaluating cross-lingual sentence representations. arXiv preprint arXiv:1809.05053.
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, and Victor Lempitsky. 2016. Domain-adversarial training of neural networks. The Journal of Machine Learning Research, 17(1):2096–2030.
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 27, pages 2672–2680. Curran Associates, Inc.
Viktor Hangya, Fabienne Braune, Alexander Fraser, and Hinrich Schütze. 2018. Two methods for domain adaptation of bilingual tasks: Delightfully simple and broadly applicable. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 810–820.
Dongyeop Kang, Tushar Khot, Ashish Sabharwal, and Eduard Hovy. 2018. Adventure: Adversarial training for textual entailment with knowledge-guided examples. arXiv preprint arXiv:1805.04680.
Xiaodong Liu, Pengcheng He, Weizhu Chen, and Jianfeng Gao. 2019. Multi-task deep neural networks for natural language understanding. arXiv preprint arXiv:1901.11504.
梅家駒 (Jia-Ju Mei). 1983. 同義詞詞林 (Chinese Synonym Dictionary). Shanghai Lexicographical Publishing House (上海辭書出版社).
Matteo Negri, Alessandro Marchetti, Yashar Mehdad, Luisa Bentivogli, and Danilo Giampiccolo. 2013. Semeval-2013 task 8: Cross-lingual textual entailment for content synchronization. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013), pages 25–33, Atlanta, Georgia, USA. Association for Computational Linguistics.
Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. Glove: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pages 1532–1543.
Matthew E Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. arXiv preprint arXiv:1802.05365.
Sebastian Ruder and Barbara Plank. 2018. Strong baselines for neural semi-supervised learning under domain shift. arXiv preprint arXiv:1804.09530.
Hideki Shima, Hiroshi Kanayama, Cheng-Wei Lee, Chuan-Jie Lin, Teruko Mitamura, Yusuke Miyao, Shuming Shi, and Koichi Takeda. 2011. Overview of NTCIR-9 RITE: Recognizing inference in text. In NTCIR.
Chaitanya Shivade, Preethi Raghavan, and Siddharth Patwardhan. 2016. Addressing limited data for textual entailment across domains. arXiv preprint arXiv:1606.02638.
Adina Williams, Nikita Nangia, and Samuel Bowman. 2018. A broad-coverage challenge corpus for sentence understanding through inference. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 1112–1122. Association for Computational Linguistics.
David Yarowsky. 1995. Unsupervised word sense disambiguation rivaling supervised methods. In 33rd Annual Meeting of the Association for Computational Linguistics.
Zhi-Hua Zhou and Ming Li. 2005. Tri-training: Exploiting unlabeled data using three classifiers. IEEE Transactions on Knowledge & Data Engineering, (11):1529–1541.