National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)


Detailed Record

Researcher: 陳奕先 (I-Hsien Chen)
Title: 利用類神經網路正則化相異實體名稱 (Neural Normalization of Diverse Entity Labels)
Advisor: 鄭卜壬 (Pu-Jen Cheng)
Committee Members: 邱志義 (Chih-Yi Chiu), 魏志達 (Jyh-Da Wei)
Oral Defense Date: 2020-07-29
Degree: Master's
Institution: 國立臺灣大學 (National Taiwan University)
Department: 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis Type: Academic thesis
Publication Year: 2019
Graduation Academic Year: 108
Language: Chinese
Number of Pages: 24
Keywords (Chinese): 自然語言處理, 文字生成任務, 指標生成網路, 多任務學習, 實體標籤正則化, 加權損失函數
Keywords (English): Natural Language Processing, Text Generation Tasks, Pointer-Generator Network, Multi-Task Learning, Normalization of Entity Labels, Weighted Loss Function
DOI: 10.6342/NTU202002051
Usage statistics:
  • Cited by: 0
  • Views: 146
  • Downloads: 0
  • Bookmarked: 0
Entity labels are used to name or describe entities, and their formats usually do not follow any consistent specification. Label diversity can be roughly divided into two kinds: different categories and different styles. Entities of different categories, such as schools and banks, usually have clearly different customary names, so their labels differ accordingly; entities of the same category may still carry labels in different styles depending on the labels' purpose and source, for example formal versus informal names for a school. The dataset used in this thesis consists of a large number of telephone numbers. These numbers serve as the "entities" and span a wide range of categories, such as government agencies, restaurants, and companies, while the owner of each number may have several names collected from different sources, which can be regarded as "labels" in different styles. The dataset therefore covers both kinds of diversity at the same time.
For such diverse entity labels, we want to use neural networks to normalize them so that each entity obtains a single representative label. We adopt a text summarization model as the basic framework and apply a weighted loss function during training so that the objective better fits our task. Finally, we introduce multi-task learning, using an auxiliary task to help the model learn.
In the experiments, we propose a preprocessing method designed for our dataset, compare the performance of several models and training strategies, examine the outputs, and analyze and explain the models' behavior to demonstrate the effectiveness of the proposed methods. We also give a more detailed observation and discussion of the errors.
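As a rough illustration of the training objective described above, the PyTorch sketch below combines a per-token weighted cross-entropy for the generation task with an auxiliary-task loss under multi-task learning. This is a minimal sketch based only on the abstract, not the thesis's actual implementation: the tensor shapes, the source of token_weights, the choice of entity-category classification as the auxiliary task, and the mixing coefficient aux_weight are all illustrative assumptions.

# Minimal sketch: weighted generation loss plus an auxiliary-task loss.
# Shapes, the weighting scheme, and the auxiliary task are assumptions
# made for illustration, not the thesis's implementation.
import torch
import torch.nn.functional as F


def weighted_generation_loss(logits, targets, token_weights, pad_id=0):
    """logits: (batch, seq_len, vocab); targets, token_weights: (batch, seq_len)."""
    # cross_entropy expects class scores on dim 1, so move the vocab dim there.
    nll = F.cross_entropy(
        logits.transpose(1, 2), targets,
        ignore_index=pad_id, reduction="none",
    )  # per-token negative log-likelihood, shape (batch, seq_len)
    mask = (targets != pad_id).float()
    # Re-weight each token's loss, e.g. to emphasize tokens that matter most
    # for producing the normalized label (weighting scheme assumed).
    weighted = nll * token_weights * mask
    return weighted.sum() / mask.sum().clamp(min=1.0)


def multitask_loss(gen_logits, gen_targets, token_weights,
                   aux_logits, aux_targets, aux_weight=0.3, pad_id=0):
    """Main weighted generation loss plus a down-weighted auxiliary loss."""
    main = weighted_generation_loss(gen_logits, gen_targets, token_weights, pad_id)
    # Hypothetical auxiliary task: classifying the entity's category.
    aux = F.cross_entropy(aux_logits, aux_targets)
    return main + aux_weight * aux

The auxiliary term is scaled by aux_weight so that the secondary task regularizes the shared encoder without dominating the main generation objective.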
Acknowledgements ................................................ i
Chinese Abstract ................................................ ii
Abstract ........................................................ iii
1. Introduction ................................................. 1
2. Related Work ................................................. 4
   2.1 Neural Networks for Text Generation Tasks ................ 4
   2.2 Related Task: Grammatical Error Correction ............... 4
   2.3 Related Task: Text Summarization ......................... 5
   2.4 Multi-Task Learning ...................................... 6
3. Model Architecture and Training Method ....................... 7
   3.1 Pointer-Generator Network ................................ 7
   3.2 Weighted Loss Function ................................... 8
   3.3 Auxiliary Task ........................................... 9
4. Experimental Results ......................................... 12
   4.1 Dataset Overview ......................................... 12
   4.2 Preprocessing ............................................ 13
   4.3 Model Performance and Analysis ........................... 14
   4.4 Error Analysis ........................................... 19
5. Conclusion ................................................... 21
References ...................................................... 22