|
[1] P. Rajpurkar et al., “SQuAD: 100,000+ questions for machine comprehension of text,” in Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, Texas, November 1-5, 2016, pp. 2383-2392. [2] C. C. Shao et al., “DRCD: A Chinese machine reading comprehension dataset,” arXiv preprint arXiv:1806.00920, 2018. [3] J. Pennington, R. Socher and C. D. Manning. “Glove: Global vectors for word representation,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, Doha, Qatar, October 25-29, 2014, pp. 1532-1543. [4] Y. Song et al., “Directional skip-gram: Explicitly distinguishing left and right context for word embeddings,” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, New Orleans, Louisiana, June 1-6, 2018, pp. 175-180. [5] D. Bahdanau, K. Cho and Y. Bengio. “Neural machine translation by jointly learning to align and translate,” arXiv preprint arXiv:1409.0473, 2014. [6] A. Vaswani et al., “Attention is all you need,” in Proceedings of the Annual Conference on Neural Information Processing Systems, Long Beach, CA, December 4-9, 2017, pp. 5998-6008. [7] S. Wang and J. Jiang. “Machine comprehension using Match-LSTM and answer pointer,” arXiv preprint arXiv:1608.07905, 2016. [8] M. Seo et al., “Bidirectional attention flow for machine comprehension,” in Proceedings of International Conference on Learning Representations, Toulon, France, April 24-26, 2017.
[9] Y. Gong and S. R. Bowman . “Ruminating reader: Reasoning with gated multi-hop attention,” arXiv preprint arXiv:1704.07415, 2017. [10] Natural Language Computing Group, Microsoft Research Asia. “R-NET: machine reading comprehension with self-matching networks,” in Proceedings of Annual Meeting of the Association for Computational Linguistics, Vancouver, Canada, July 30-August 4, 2017, pp. 189-198. [11] M. Hu et al., “Reinforced mnemonic reader for machine reading comprehension,” in Proceedings of 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, July 13-19, 2018, pp. 4099-4106. [12] A. W. Yu et al., “QANet: Combining local convolution with global self-attention for reading comprehension,” in Proceedings of International Conference on Learning Representations, Vancouver, Canada, April 30-May 3, 2018. [13] J. Devlin et al., “BERT: Pre-training of deep bidirectional transformers for language understanding,” arXiv preprint arXiv:1810.04805, 2018. [14] K. He et al., “Deep residual learning for image recognition,” in Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Nevada, June 26-July 1, 2016, pp. 770-778. [15] F. Chollet, “Xception: Deep learning with depthwise separable convolutions,” in Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Hawaii, July 22-25, 2017, pp. 1800-1807. [16] R. K. Srivastava, K. Greff and J. Schmidhuber, “Highway networks,” arXiv preprint arXiv:1505.00387, 2015. [17] J. L. Ba, J. R. Kiros and G. E. Hinton, “Layer normalization,” arXiv preprint arXiv: 1607.06450, 2016.
|