臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)

詳目顯示 (Detailed Record)

Researcher: 鄭皓澤
Researcher (English): Hao-Tse Cheng
Thesis Title: 基於兩階段轉換器之可變長度抽象摘要
Thesis Title (English): Variable-Length Abstractive Summarization using Two-stage Transformer-based Method
Advisor: 吳宗憲
Advisor (English): Chung-Hsien Wu
Degree: Master's
Institution: 國立成功大學 (National Cheng Kung University)
Department: 資訊工程學系 (Department of Computer Science and Information Engineering)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis Type: Academic thesis
Year of Publication: 2019
Graduation Academic Year: 107 (2018–2019)
Language: English
Number of Pages: 66
Keywords (Chinese): 自動化摘要系統、抽象式摘要、抽取式摘要、文本分割、可變長度摘要、Transformer、BERT、長短期記憶模型
Keywords (English): automatic summarization system, abstractive summarization, extractive summarization, text segmentation, variable-length summarization, Transformer, BERT, LSTM
Abstract (translated from Chinese): In recent years, the rapid growth of information has made quickly browsing large numbers of articles a problem everyone faces. Automatic summarization systems can help solve this problem, letting people grasp the content of an article while saving a great deal of time. Automatic summarization systems can be divided into extractive and abstractive summarization: the former lets the user decide the length of the generated summary, while the latter produces summaries that are more fluent and closer to human writing.
The main contribution of this thesis is a two-stage model training method that produces a variable-length abstractive summarization model, resolving the problem that earlier automatic summarization systems could not offer variable length and fluency at the same time. The variable-length abstractive summarization model consists of several sub-models. For text segmentation, the proposed system combines the state-of-the-art language representation model BERT with a bidirectional long short-term memory (BiLSTM) network to perform the text segmentation task and improve its performance. For summary generation, the thesis combines extractive and abstractive summarization and uses a Transformer to generate headline summaries whose quality is close to the current state-of-the-art models.
This thesis also compiles ChWiki_181k, a new large-scale Chinese text segmentation corpus, and proposes a BERT-based text segmentation model as a baseline on this corpus for comparison in follow-up research. For summarization, the LCSTS corpus is used; the variable-length abstractive summarization system is trained with the two-stage method and reaches a maximum accuracy of 70% in human subjective evaluation, demonstrating that the proposed architecture can generate effective variable-length summaries.
Abstract (English): Due to the rapid growth of available information, efficiently processing and making use of text-based resources has become an increasingly important challenge. This problem can be addressed with an automatic summarization system. Most summarization systems fall into two types: extractive methods and abstractive methods. Extractive methods form the summary by extracting segments of text from the document, whereas abstractive methods process the document and generate a new summary text. The former allows the user to specify the length of the summary, while the latter produces a more fluent and human-like summary.
The main contribution of this thesis is a two-stage method for training a variable-length abstractive summarization model, an improvement over previous models that cannot simultaneously achieve fluency and variable length in their summaries. The variable-length abstractive summarization model is divided into a text segmentation module and three generation modules. The proposed text segmentation module, which combines BERT with a bidirectional LSTM, shows improved performance over existing methods. The generation modules combine extractive and abstractive methods to produce near state-of-the-art headline summaries.
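The abstract names only the components of the segmentation module, so the following sketch is illustrative rather than the thesis's implementation: a minimal BERT-plus-BiLSTM segmenter in PyTorch with the Hugging Face transformers library, assuming a sentence-level binary boundary-labelling formulation. All class, parameter, and checkpoint names are assumptions, not taken from the thesis.

# Hypothetical sketch of a BERT + BiLSTM text segmentation module.
# Each sentence is encoded with BERT; a BiLSTM runs over the sequence of
# sentence vectors; a linear layer predicts, per sentence, whether a
# segment boundary follows it. Names are illustrative only.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class BertBiLSTMSegmenter(nn.Module):
    def __init__(self, bert_name="bert-base-chinese", hidden_size=256):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        self.bilstm = nn.LSTM(
            input_size=self.bert.config.hidden_size,
            hidden_size=hidden_size,
            batch_first=True,
            bidirectional=True,
        )
        self.classifier = nn.Linear(2 * hidden_size, 2)  # boundary / no boundary

    def forward(self, input_ids, attention_mask):
        # input_ids: (num_sentences, max_tokens) -- one row per sentence
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        sent_vecs = outputs.last_hidden_state[:, 0]   # [CLS] vector per sentence
        seq = sent_vecs.unsqueeze(0)                  # treat the document as one sequence
        lstm_out, _ = self.bilstm(seq)                # (1, num_sentences, 2*hidden)
        return self.classifier(lstm_out).squeeze(0)   # (num_sentences, 2) boundary logits

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
sentences = ["今天天氣很好。", "我們去公園散步。", "深度學習模型需要大量資料。"]
batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
model = BertBiLSTMSegmenter()
logits = model(batch["input_ids"], batch["attention_mask"])  # one boundary decision per sentence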
A new large-scale Chinese text segmentation dataset, ChWiki_181k, is introduced, and a BERT-based text segmentation model is proposed as the baseline on ChWiki_181k. The LCSTS corpus is adopted to train the summarization models, and a variable-length abstractive summarization system is trained with the two-stage method. The proposed system achieves a maximum of 70% accuracy in human subjective evaluation, showing that it can generate proper variable-length summaries.
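Text segmentation quality in the thesis is reported with the Pk indicator (Section 3.1.1 in the table of contents below). As background, the sketch below shows the standard Pk computation of Beeferman et al. for segmentations given as binary boundary sequences; this is the generic metric, not code from the thesis, and the example values are made up.

# Generic Pk computation for text segmentation (Beeferman et al., 1999).
# ref and hyp are binary lists: 1 means a segment boundary follows unit i.
# A window of size k slides over the text; Pk is the fraction of windows
# where the reference and hypothesis disagree on whether the two window
# ends fall in the same segment. Lower is better.
def pk(ref, hyp, k=None):
    n = len(ref)
    if k is None:
        # Conventional choice: half the average reference segment length.
        num_segments = sum(ref) + 1
        k = max(1, round(n / num_segments / 2))
    errors = 0
    windows = n - k
    for i in range(windows):
        same_ref = sum(ref[i:i + k]) == 0   # no boundary inside the window
        same_hyp = sum(hyp[i:i + k]) == 0
        errors += same_ref != same_hyp
    return errors / windows if windows > 0 else 0.0

reference  = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]  # boundaries after the 3rd and 7th units
hypothesis = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
print(f"Pk = {pk(reference, hypothesis):.3f}")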
摘要 (Chinese Abstract) I
Abstract II
誌謝 (Acknowledgements) IV
Contents V
List of Tables VII
List of Figures VIII
Chapter 1 Introduction 1
1.1 Background 1
1.2 Motivation 3
1.3 Literature Review 5
1.3.1 Abstractive Text Summarization 5
1.3.2 Extractive Text Summarization 6
1.3.3 Text Segmentation 7
1.3.4 Sequence to Sequence Model 8
1.3.5 Pre-training Method 10
1.4 Problems 11
1.5 Proposed Methods 13
Chapter 2 System Framework 15
2.1 Text Segmentation Model 16
2.1.1 BERT 17
2.1.2 LSTM Model 21
2.1.3 BERT-based Text Segmentation Model 23
2.2 Extractive Model 26
2.3 Document Summarization Model 29
2.3.1 Transformer 29
2.3.2 Summarization Combining Extraction and Abstraction 33
2.4 Segment Summarization Model 34
2.4.1 Segment Transformer Model 35
2.4.2 Loss Values of Two-stage Model 37
Chapter 3 Experimental Results and Discussion 40
3.1 Evaluation Metrics 40
3.1.1 Pk Indicator 40
3.1.2 ROUGE 41
3.1.3 Subjective Evaluation Method 42
3.2 Dataset 43
3.2.1 Text Segmentation Corpus 43
3.2.2 Summarization Corpus 45
3.3 Experimental Results and Discussion 46
3.3.1 Evaluation of the Text Segmentation Model BERT-biLSTM 46
3.3.2 Evaluation of the Ablation of the Extractive Model 49
3.3.3 Evaluation of the Document Transformer Model 50
3.3.4 Evaluation of the Variable-Length Abstractive Summary 54
Chapter 4 Conclusion and Future Work 60
Reference 62