
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)


Detailed Record

Author: 宋宏鈞
Author (English): SUNG, HUNG-CHUN
Title (Chinese): 建立一個用於語義分析的自動可視化工具
Title (English): Creating an Automatic Visualization Tool for Semantic Analysis
Advisor: 黃文楨
Advisor (English): HUANG, WEN-CHEN
Committee Members: 殷堂凱, 黃承龍, 黃文楨
Committee Members (English): YIN, TANG-KAI; HUANG, CHENG-LUNG; HUANG, WEN-CHEN
Oral Defense Date: 2022-06-24
Degree: Master's
Institution: 國立高雄科技大學 (National Kaohsiung University of Science and Technology)
Department: 資訊管理系 (Department of Information Management)
Discipline: Computing
Field: General Computing
Thesis Type: Academic thesis
Year of Publication: 2022
Graduation Academic Year: 110 (2021-2022)
Language: Chinese
Number of Pages: 57
Keywords (Chinese): 生成式摘要; spaCy; 英文理解; 視覺化工具; 語意分析
Keywords (English): Generated abstracts; spaCy; English comprehension; visualization tools; semantic analysis
Usage statistics:
  • Cited: 0
  • Views: 165
  • Rating:
  • Downloads: 24
  • Bookmarked: 0
Thanks to the growth of the Internet and advances in science and technology, enormous amounts of information are within reach at any moment: browsing online news reveals recent events large and small, social media shows how distant friends are doing, and a search engine surfaces whatever one wants to learn without a trip to the library. Against this backdrop of easily available information, more and more people look for condensed, at-a-glance digests; material that aids understanding without the time cost of reading large amounts of text has become what readers pursue. This study therefore designs a visualization tool that helps readers understand the content of text-heavy English papers.
Before generating the visualization, the text is first compressed through automatic summarization so that only the most representative content remains. This study uses the T5 (Text-to-Text Transfer Transformer) summarization model to rewrite the abstract paragraphs of a paper into shorter sentences, and ROUGE scores are used to confirm that the generated sentences do not deviate seriously from the original meaning. To visualize a sentence, the study first tokenizes it, then performs part-of-speech tagging and dependency parsing; grammatical rules are applied to the analysis results, and the outcome is rendered as a graph. Finally, the proportion of graph nodes that match keywords in the source paragraph, relative to the total number of nodes, serves as an evaluation score confirming that the graph does not stray from the topic described in the article and is therefore a useful reference. Because this study focuses on English papers, the grammatical rules are all based on the characteristics of English grammar, so that the visualized meaning remains readable and the goal of supporting reading is achieved.
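As a concrete illustration of the summarization and evaluation step, the sketch below compresses a paragraph with a public T5 checkpoint and scores the result with ROUGE. It is a minimal sketch, assuming the Hugging Face transformers library, the t5-base checkpoint, and the rouge-score package; the thesis's actual model weights, generation settings, and evaluation data are not specified here.

```python
# Minimal sketch: compress a paragraph with a public T5 checkpoint and check the
# result with ROUGE. Assumes the Hugging Face "transformers" package and the
# "rouge-score" package; the checkpoint, generation parameters, and any
# fine-tuning used in the thesis may differ.
from transformers import T5ForConditionalGeneration, T5Tokenizer
from rouge_score import rouge_scorer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

def summarize(text: str, max_length: int = 60) -> str:
    # T5 is a text-to-text model; the "summarize:" prefix selects the summarization task.
    inputs = tokenizer("summarize: " + text, return_tensors="pt",
                       truncation=True, max_length=512)
    output_ids = model.generate(inputs["input_ids"], max_length=max_length,
                                num_beams=4, early_stopping=True)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

source = ("Thanks to the Internet, large amounts of information can be accessed "
          "at any time, and many readers prefer short digests over long texts. "
          "This study designs a visualization tool that helps readers understand "
          "text-heavy English papers.")
summary = summarize(source)

# ROUGE measures n-gram overlap between the generated summary and the source text.
scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
print(summary)
print(scorer.score(source, summary))
```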
The contribution of this study is mainly practical: the visualization tool helps users grasp the gist of an article before reading the full text, and it helps users who are searching for material to check more easily whether a retrieved article meets their needs.
Thanks to the development of the Internet and the advancement of science and technology, vast amounts of information can be accessed at any time. Online news reports current events, social media lets you follow the lives of friends who live far away, and search engines let you find the information you need without visiting a library. In a world of abundant information, more and more people look for condensed, easy-to-digest summaries that aid understanding without requiring them to read large amounts of text. In this research, we designed a tool that visualizes English papers containing substantial amounts of text.
Before generating the visualization, the text of the article is compressed by automatically creating a summary that retains only the most representative information. In this study, the T5 (Text-to-Text Transfer Transformer) summarization model rewrites the abstract of a paper into shorter sentences, and ROUGE scores confirm that the generated sentences do not differ significantly in meaning from the original text. To visualize the resulting sentences, the study first segments them into words, then performs part-of-speech tagging and dependency parsing. Grammatical rules are applied to the analysis results, and the outcome is rendered as a graph. Finally, the proportion of graph nodes that match keywords in the source paragraph, relative to the total number of nodes, is used as an evaluation score to confirm that the graph does not deviate from the theme of the article and has reference value. Because the study focuses on English papers, the grammatical rules are based on the characteristics of English grammar, so that the visualized text remains readable and the goal of supporting reading is achieved.
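The word-analysis and graph-evaluation steps described above can be sketched as follows. This is a minimal sketch assuming spaCy with the en_core_web_sm pipeline and the networkx library; the actual grammatical rules and keyword extraction used in the thesis are replaced with placeholder logic.

```python
# Minimal sketch: tokenize a sentence with spaCy, build a graph from the
# dependency parse, and score the graph by the share of nodes that match
# paragraph keywords. The grammar rules and keyword list here are placeholders,
# not the ones used in the thesis.
import spacy
import networkx as nx

nlp = spacy.load("en_core_web_sm")

def sentence_graph(sentence: str) -> nx.DiGraph:
    doc = nlp(sentence)
    graph = nx.DiGraph()
    for token in doc:
        # One node per token, labelled with its part of speech; one directed
        # edge per dependency arc, labelled with the relation (nsubj, dobj, ...).
        graph.add_node(token.text, pos=token.pos_)
        if token.head is not token:  # the root token's head is itself
            graph.add_edge(token.head.text, token.text, dep=token.dep_)
    return graph

def keyword_coverage(graph: nx.DiGraph, keywords: set) -> float:
    # Evaluation score: proportion of graph nodes matching keywords from the
    # source paragraph, relative to the total number of nodes.
    matched = sum(1 for node in graph.nodes if node.lower() in keywords)
    return matched / graph.number_of_nodes()

g = sentence_graph("The tool visualizes the generated summary as a graph.")
print(keyword_coverage(g, {"tool", "summary", "graph"}))
```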
The contribution of this research is mainly practical. The visualization tool helps readers grasp the gist of an article before reading the full text, and it helps users who are searching for information to confirm more easily whether a retrieved article meets their needs.
Abstract (Chinese)
Abstract (English)
Acknowledgements
I. Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Research Contributions
II. Literature Review
2.1 Machine Reading Comprehension
2.2 Automatic Text Summarization
2.3 spaCy
2.4 ROUGE
2.5 Dependency Parsing
III. Research Methods
3.1 Abstract Generation
3.2 Word Analysis
3.3 Word Visualization
IV. Results and Discussion
4.1 Abstract Generation Results
4.2 Word Analysis Results
4.3 Word Visualization Results
4.4 Discussion of Results and Limitations
V. Conclusions and Future Work
VI. References