臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)

Detailed Record

Author: 莊文立
Author (English): Wen Li Zhuang
Thesis title: 基於遞迴神經網路的指代消解
Thesis title (English): Coreference Resolution Using Recurrent Neural Networks
Advisor: 陳信希
Advisor (English): Hsin-Hsi Chen
Committee members: 古倫維、林川傑、馬偉雲
Committee members (English): Lun-Wei Ku, Chuan-Jie Lin, Wei-Yun Ma
Oral defense date: 2016-07-21
Degree: Master's
Institution: 國立臺灣大學 (National Taiwan University)
Department: 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering)
Discipline: Engineering
Field: Electrical and Computer Engineering
Document type: Academic thesis
Publication year: 2016
Graduation academic year: 104 (2015–2016)
Language: English
Pages: 40
Keywords (Chinese): 指代消解、先行詞排序、遞迴神經網路、注意力機制
Keywords (English): coreference resolution, antecedent ranking, recurrent neural networks, attention mechanism
Statistics:
  • Cited: 0
  • Views: 277
  • Rating:
  • Downloads: 0
  • Bookmarked: 0
Abstract (translated from Chinese):
Coreference resolution (指代消解, also rendered 同指涉消解) is a classic unsolved problem in natural language processing. We propose a novel antecedent ranking model built on hierarchical recurrent neural networks: one recurrent network first constructs mention representations from the document context, and a second recurrent network is then trained to exploit these learned representations, together with an attention mechanism, to detect each anaphor and the antecedent it refers to. Our system achieves the highest score to date on the CoNLL 2012 shared task.

Abstract (English):
Coreference resolution is a classic unsolved problem in natural language processing. We present a novel antecedent ranking model based on hierarchical recurrent neural networks (RNNs). A word-level RNN encodes the surrounding context into a representation of each mention. A mention-level network is then trained to exploit these representations, along with a few hand-crafted features, to detect an anaphor and its antecedent through a simple attention mechanism. We evaluate our system on the CoNLL 2012 shared task and establish a new state of the art.
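
The architecture described in the abstract can be sketched in code. Below is a minimal, hypothetical PyTorch sketch of the attention-based (pointer-network-style) antecedent ranking step, not the thesis's actual implementation: it assumes a word-level RNN (not shown) has already produced one vector per mention, runs a mention-level LSTM over those vectors, and uses additive attention to turn the anaphor's state into a probability distribution over the earlier candidate antecedents. All class names, dimensions, and the choice of additive attention are illustrative assumptions.

import torch
import torch.nn as nn

class AntecedentPointer(nn.Module):
    """Hypothetical sketch of an attention-based antecedent ranker.

    The word-level RNN that encodes context into one vector per mention
    is assumed to exist elsewhere; this module only scores, for a given
    anaphor, a pointer distribution over its candidate antecedents.
    """

    def __init__(self, dim: int = 256):
        super().__init__()
        # Mention-level recurrent network over mention vectors in document order.
        self.mention_rnn = nn.LSTM(dim, dim, batch_first=True)
        self.w_cand = nn.Linear(dim, dim, bias=False)   # projects candidate states
        self.w_query = nn.Linear(dim, dim, bias=False)  # projects the anaphor state
        self.v = nn.Linear(dim, 1, bias=False)          # additive attention score

    def forward(self, mentions: torch.Tensor) -> torch.Tensor:
        # mentions: (batch, num_mentions, dim); the last mention is the anaphor.
        states, _ = self.mention_rnn(mentions)
        candidates, anaphor = states[:, :-1, :], states[:, -1:, :]
        # Additive (Bahdanau-style) attention scores each earlier mention;
        # a softmax over candidates yields the pointer distribution.
        scores = self.v(torch.tanh(self.w_cand(candidates) + self.w_query(anaphor)))
        return scores.squeeze(-1).softmax(dim=-1)

# Toy usage: 8 mention vectors (e.g. from a word-level BiLSTM); resolve the last one.
ranker = AntecedentPointer(dim=256)
doc = torch.randn(1, 8, 256)
print(ranker(doc))  # shape (1, 7): probability of each candidate antecedent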

Acknowledgements
Abstract (Chinese)
Abstract
Contents
List of Figures
List of Tables
1 Introduction
  1.1 Motivations
  1.2 Research objectives
  1.3 Organization
2 Related Work
  2.1 Rule-based coreference resolution
  2.2 Learning-based coreference resolution
    2.2.1 Antecedent ranking
    2.2.2 Entity clustering
    2.2.3 Latent antecedent
  2.3 Deep neural networks
    2.3.1 Deep learning in NLP
    2.3.2 Pointer networks
3 Methods
  3.1 Recurrent neural networks
    3.1.1 Vanilla recurrent neural nets
    3.1.2 Long short-term memory
    3.1.3 Bi-directional RNN
  3.2 Pointer networks
    3.2.1 Attention mechanism
    3.2.2 Attention as index
  3.3 Antecedent ranking model
    3.3.1 Problem setting
    3.3.2 Mention generation
    3.3.3 Coreferent pointer network
    3.3.4 Mention ranking model
    3.3.5 Loss function
4 Experiments
  4.1 Dataset
  4.2 Evaluation
  4.3 Antecedent ranking and mention ranking
  4.4 Influence of document genre
  4.5 Attentive network
  4.6 Word embeddings
  4.7 Pretraining
  4.8 Training details
  4.9 Final system result
5 Conclusion
Bibliography
