National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

Graduate Student: 簡延銜
Graduate Student (English): JIAN, YAN-XIAN
Thesis Title (Chinese): 基於 Transformer 之語言模型應用於新聞分類與財經新聞生成
Thesis Title (English): Transformer-based Language Model Applied to News Classification and Financial News Generation
Advisor: 黃承龍
Advisor (English): HUANG, CHENG-LUNG
Oral Defense Committee: 黃文楨、張弘毅、張育仁、黃承龍
Oral Defense Committee (English): HUANG, WEN-CHEN; CHANG, HUNG-YI; CHANG, YU-JEN; HUANG, CHENG-LUNG
Oral Defense Date: 2021-06-16
Degree: Master's
Institution: 國立高雄科技大學 (National Kaohsiung University of Science and Technology)
Department: 資訊管理系 (Department of Information Management)
Discipline: Computer Science
Field: General Computer Science
Thesis Type: Academic thesis
Publication Year: 2021
Graduation Academic Year: 109 (AY 2020-2021)
Language: Chinese
Pages: 66
Chinese Keywords: 新聞類別分類、新聞文章生成、遷移學習、預訓練語言模型
English Keywords: Classification of News; Text Generation of Financial News; Transfer Learning; Pre-trained Language Model
Usage statistics:
  • Cited: 3
  • Views: 915
  • Downloads: 127
  • Bookmarked: 0
Information now circulates faster than ever, and a large volume of news is produced every day; a news department may have to process hundreds or even thousands of articles daily, which is very time-consuming. This study therefore conducts three experiments: a news category classification experiment, a financial news article generation experiment, and a news headline generation experiment, in the hope of assisting journalists when drafting initial copy.
The first experiment addresses text classification. It combines transfer learning with traditional classifiers, convolutional neural networks, and the Transformer-based language models BERT, DistilBERT, RoBERTa, and XLNet, and compares which language model performs best on news text classification.
The second experiment addresses financial news article generation, using the Transformer-based GPT-2 Chinese model. Its purpose is to help news editors draft a first version of an article, providing a preliminary reference that covers the people, events, time, place, and objects involved, with the news text generated by deep learning.
The third experiment addresses financial news headline generation, again using GPT-2 Chinese, and aims to give news editors an assistive scheme for naming news article headlines.
The experimental results show that, on the text classification task, the language models outperform the traditional models. Among the language models, DistilBERT requires the least training time, but BERT achieves the highest accuracy, outperforming DistilBERT, RoBERTa, and XLNet, which indicates that BERT benefits most from transfer learning in raising text classification accuracy. In the financial news article generation experiment, generation quality is better when a pre-trained general-purpose Chinese model is used; the generated article text is reasonably fluent and can serve as a reference first draft. In the financial news headline generation experiment, some of the generated headlines correspond well to their articles.
Nowadays, information is exchanged faster and faster, and a large amount of news is produced every day. A news department needs to process hundreds or even thousands of news items daily, which is quite time-consuming.
Therefore, this study conducted three experiments: a news category classification experiment, a financial news article generation experiment, and a news headline generation experiment, in the hope of assisting reporters in drafting initial articles.
The first experiment is text classification. It applies transfer learning to traditional classifiers, convolutional neural networks, and the Transformer-based language models BERT, DistilBERT, RoBERTa, and XLNet, and compares which language model performs better on news text classification.
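As a rough illustration of the transfer-learning setup described above, the following sketch fine-tunes a publicly available Chinese BERT checkpoint for news category classification with the Hugging Face transformers library. The checkpoint name, label set, and toy data are illustrative assumptions, not the thesis's actual configuration.

```python
# Minimal sketch: fine-tune a pre-trained Chinese BERT for news category
# classification (transfer learning). "bert-base-chinese", the label set,
# and the toy training examples are assumptions for illustration only.
import torch
from transformers import BertTokenizerFast, BertForSequenceClassification

labels = ["politics", "finance", "sports", "entertainment"]   # assumed categories
texts = ["央行宣布調整利率,市場反應熱烈", "主場球隊在昨晚比賽中逆轉獲勝"]
targets = [1, 2]                                              # finance, sports

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-chinese", num_labels=len(labels)
)

# Tokenize the news texts and attach their category labels as one batch.
batch = tokenizer(texts, padding=True, truncation=True, max_length=128,
                  return_tensors="pt")
batch["labels"] = torch.tensor(targets)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):                 # a few fine-tuning steps on the tiny batch
    optimizer.zero_grad()
    loss = model(**batch).loss     # classification head + cross-entropy loss
    loss.backward()
    optimizer.step()

# Classify a new article with the fine-tuned model.
model.eval()
with torch.no_grad():
    probe = tokenizer("股市今日開高走低,成交量放大", return_tensors="pt")
    print(labels[model(**probe).logits.argmax(dim=-1).item()])
```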
The second experiment is the generation of financial news articles, using the Transformer-based GPT-2 Chinese model. The purpose is to help news editors prepare a first draft and to provide a preliminary reference and suggestions covering the people, events, time, place, and objects involved, with the news text generated by deep learning.
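The generation step can be sketched with the same library. The checkpoint name below points to a publicly available general-purpose Chinese GPT-2 model and is an assumption for illustration; the thesis fine-tuned its own GPT-2 Chinese model on financial news.

```python
# Minimal sketch: generate a draft financial news passage from a short prompt
# with a GPT-2 style Chinese language model (assumed public checkpoint).
from transformers import BertTokenizerFast, GPT2LMHeadModel

model_name = "uer/gpt2-chinese-cluecorpussmall"   # assumed general Chinese GPT-2
tokenizer = BertTokenizerFast.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

prompt = "台積電今日公布最新一季財報,"            # seed text with who/what/when
input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]

# Top-k / top-p sampling keeps the generated draft varied but readable.
output_ids = model.generate(
    input_ids,
    max_length=200,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    repetition_penalty=1.2,
    pad_token_id=tokenizer.pad_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```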
The third experiment is the generation of financial news headlines, also using GPT-2 Chinese. The purpose is to provide news editors with an assistive scheme for naming news article headlines.
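Headline generation can be framed as conditional continuation with the same model family. The sketch below assumes a model fine-tuned on article-headline pairs joined by a separator token and a locally saved checkpoint; both the separator convention and the checkpoint path are hypothetical, not necessarily the thesis's exact scheme.

```python
# Minimal sketch: produce a headline by prompting a fine-tuned GPT-2 Chinese
# model with the article text plus a separator and reading the continuation.
# The checkpoint path and the "[SEP]" convention are illustrative assumptions.
from transformers import BertTokenizerFast, GPT2LMHeadModel

model_dir = "./gpt2-chinese-headline"              # hypothetical fine-tuned model
tokenizer = BertTokenizerFast.from_pretrained(model_dir)
model = GPT2LMHeadModel.from_pretrained(model_dir)

article = "央行今日宣布維持利率不變,並下修全年經濟成長率預估。"
input_ids = tokenizer(article + "[SEP]", return_tensors="pt")["input_ids"]

headline_ids = model.generate(
    input_ids,
    max_new_tokens=30,                             # headlines are short
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.pad_token_id,
)
# Keep only the tokens generated after the prompt and decode them as the title.
headline = tokenizer.decode(headline_ids[0][input_ids.shape[1]:],
                            skip_special_tokens=True)
print(headline)
```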
The experimental results show that, on the text classification task, the language models outperform the traditional models. Among the language models, DistilBERT requires the shortest training time, but BERT achieves the highest accuracy, outperforming DistilBERT, RoBERTa, and XLNet, which shows that BERT benefits most from transfer learning in improving text classification accuracy. In the financial news article generation experiment, using a pre-trained general-purpose Chinese model yields better generation quality; the generated article text is reasonably fluent and can serve as a reference first draft. In the financial news headline generation experiment, some of the generated headlines correspond well to their articles.

Abstract (Chinese) i
Abstract (English) ii
Acknowledgments iv
Table of Contents v
List of Tables vii
List of Figures viii
Chapter 1: Introduction 1
1.1 Research Background and Motivation 1
1.2 Research Objectives 2
1.3 Thesis Structure 3
Chapter 2: Literature Review 5
2.1 BERT (Bidirectional Encoder Representations from Transformers) 5
2.1.1 MLM (Masked Language Model) 5
2.1.2 NSP (Next Sentence Prediction) 5
2.1.3 Embedding Layer 6
2.2 Transfer Learning 6
2.3 Transformer 7
2.3.1 Encoder 8
2.3.2 Decoder 8
2.4 Attention Mechanism 9
2.4.1 Self-Attention Mechanism 9
2.4.2 Multi-Head Attention Mechanism 10
2.5 DistilBERT (Distilled version of BERT) 10
2.6 RoBERTa (Robustly optimized BERT approach) 12
2.7 XLNet 12
2.7.1 Autoregressive (AR) Modeling 13
2.7.2 Transformer-XL (XL, eXtra Long) 14
2.8 GPT-2 (Generative Pre-Training 2) 14
Chapter 3: Experimental Plan and Datasets 16
3.1 Data Sources 16
3.1.1 News Classification Experiment 16
3.1.2 News Generation Experiment 17
3.2 Experimental Design and Models 18
3.2.1 Data Preprocessing for the News Classification Task 18
3.3 Experimental Environment 20
Chapter 4: News Category Classification Experiment 23
4.1 Experimental Parameters 23
4.2 Analysis of Experimental Results 24
Chapter 5: Financial News Article Generation Experiment 27
5.1 Experimental Parameters 28
5.2 Analysis of Experimental Results 29
Chapter 6: Financial News Headline Generation Experiment 44
6.1 Experimental Parameters 44
6.2 Analysis of Experimental Results 45
Chapter 7: Conclusions 51
7.1 Conclusions 51
7.2 Future Research Directions 52
References 54
1. Devlin, J., Chang, M.W., Lee, K. and Toutanova, K. (2019) "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding", arXiv:1810.04805v2.
2. Sanh, V., Debut, L., Chaumond, J. and Wolf, T. (2020) "DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter", arXiv:1910.01108v4.
3. Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L. and Stoyanov, V. (2019) "RoBERTa: A Robustly Optimized BERT Pretraining Approach", arXiv:1907.11692v1.
4. Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R. and Le, Q.V. (2020) "XLNet: Generalized Autoregressive Pretraining for Language Understanding", arXiv:1906.08237v2.
5. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L. and Polosukhin, I. (2017) "Attention Is All You Need", arXiv:1706.03762.
6. Hinton, G., Vinyals, O. and Dean, J. (2015) "Distilling the Knowledge in a Neural Network", arXiv:1503.02531v1.
7. Dai, Z., Yang, Z., Yang, Y., Carbonell, J., Le, Q.V. and Salakhutdinov, R. (2019) "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context", arXiv:1901.02860v3.
8. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D. and Sutskever, I. (2019) "Language Models are Unsupervised Multitask Learners", https://github.com/openai/gpt-2.
9. Katharopoulos, A., Vyas, A., Pappas, N. and Fleuret, F. (2020) "Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention", arXiv:2006.16236v3.
10. Howard, J. and Ruder, S. (2018) "Universal Language Model Fine-tuning for Text Classification", arXiv:1801.06146v5.
11. Xu, L., Zhang, X.W. and Dong, Q.Q. (2020) "CLUECorpus2020: A Large-scale Chinese Corpus for Pre-training Language Model", arXiv:2003.01355v2.
12. 吳晨皓 and 黃承龍 (2020) "BERT and GPT-2 Applied Respectively to Charge Classification and Judgment Document Generation for Criminal Cases" (in Chinese).
13. 陳世榮 (2012) "Text Mining Applications in Social Science Research: Meaning-based Document Classification and Its Problems" (in Chinese).
14. 吳孟瑾, 傅詞源, 李佳衛 and 張耀中 (2019) "A Study on Automatic News Classification Using Natural Language Processing" (in Chinese).
15. Lee, M. (2019) "Attack on BERT: The Power of Giants and Transfer Learning in NLP" (in Chinese), https://leemeng.tw/attack_on_bert_transfer_learning_in_nlp.html.
16. Cui, Y.M. (2019) "Chinese XLNet", https://github.com/ymcui/Chinese-XLNet.
17. Cui, Y.M. (2019) "Chinese BERT-wwm", https://github.com/ymcui/Chinese-BERT-wwm.
18. Xu, L. (2019) "Chinese Pre-trained RoBERTa Model", https://github.com/brightmart/roberta_zh.
19. Du, Z.Y. (2019) "GPT2-Chinese", https://github.com/Morizeyao/GPT2-Chinese.
20. Yang, J.X. (2019) "GPT2-chitchat", https://github.com/yangjianxin1/GPT2-chitchat.
