
National Digital Library of Theses and Dissertations in Taiwan


Detailed Record

Author: 張易筠
Author (English): CHANG, YI-YUN
Title: 應用BERT語言模型於顧客評論之多面向情緒分析
Title (English): Applying BERT Language Model to Multi-aspect Sentiment Analysis of Customer Reviews
Advisor: 黃承龍
Advisor (English): HUANG, CHENG-LUNG
Committee members: 徐煥智, 黃文楨, 黃承龍
Committee members (English): SHYUR, HUAN-JYH; HUANG, WEN-CHEN; HUANG, CHENG-LUNG
Oral defense date: 2022-06-23
Degree: Master's
Institution: 國立高雄科技大學 (National Kaohsiung University of Science and Technology)
Department: 資訊管理系 (Department of Information Management)
Discipline: Computer Science
Field: General Computer Science
Document type: Academic thesis
Year of publication: 2022
Graduation academic year: 110 (ROC calendar)
Language: Chinese
Pages: 78
Keywords (Chinese): 遷移學習, 語言模型, 多類別分類, 多標籤分類, 多輸出分類
Keywords (English): transfer learning, language model, multi-class classification, multi-label classification, multi-output classification
Usage statistics:
  • Cited by: 8
  • Views: 841
  • Ratings:
  • Downloads: 186
  • Bookmarked: 0
With the growth of the Internet, the online world is filled with traces of user activity, including reviews, user profiles, and browsing records. Leveraging these traces, many enterprises now apply artificial intelligence and big data analytics to analyze user behavior and language, in order to understand how customers evaluate their offerings and how satisfied they are.
Taking tourist hotels as a case study, this study uses web crawling and natural language processing to collect guest reviews of booked hotels, analyzes the semantics and sentiment of those reviews, and builds three classification systems: overall sentiment, aspect, and aspect-level sentiment. With these classifiers, hotel operators can better understand the feedback in guest reviews and use it to improve service quality.
Three tasks are carried out in this study: (1) multi-class classification of a review's overall sentiment, (2) multi-label classification of the aspects a review mentions, and (3) multi-output classification of the sentiment toward each aspect. The experimental systems combine transfer learning with four pre-trained language models, namely BERT, RoBERTa, DistilBERT, and ALBERT, and the accuracy of each pre-trained model is evaluated on the multi-class, multi-label, and multi-output classification tasks.
This study adopts strict accuracy as the evaluation metric, computed as the number of correctly predicted samples divided by the total number of samples. The experimental results show that the BERT pre-trained model reaches 96% accuracy on multi-class classification of review sentiment; on multi-label classification, BERT and ALBERT reach 91% strict accuracy; and on multi-output classification of aspect-level sentiment, BERT and RoBERTa perform best with 85% strict accuracy. DistilBERT takes the least time on all three tasks, but its accuracy does not surpass the other models on any of them.
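The strict accuracy used above is an exact-match metric: a sample counts as correct only when every one of its predicted labels matches the ground truth. The following is a minimal sketch of how it could be computed for the multi-label aspect task; the arrays and label layout are purely illustrative assumptions, not the thesis's actual data.

import numpy as np

# Each row holds the 6 aspect labels for one review (illustrative values only).
y_true = np.array([[1, 0, 1, 0, 0, 1],
                   [0, 1, 0, 0, 1, 0]])
y_pred = np.array([[1, 0, 1, 0, 0, 1],
                   [0, 1, 0, 1, 1, 0]])

# Strict accuracy: number of exactly correct samples divided by the total number of samples.
strict_accuracy = np.mean(np.all(y_true == y_pred, axis=1))
print(strict_accuracy)  # 0.5 for this toy example: only the first sample matches exactly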
Nowadays, with the development of the Internet, the online world is filled with users' information footprints, including comments, user profiles, and browsing records. Because of these footprints, many enterprises have turned to artificial intelligence and big data analytics to analyze user behavior and semantics, using the results as indicators for improvement and to understand how customers evaluate them.
This study therefore takes tourist hotels as an example. It uses crawler technology and natural language processing to collect travelers' reviews of their stays, analyzes the semantics and sentiment of that feedback, and builds three classification systems: overall sentiment, aspect, and aspect-level sentiment. Through these classifiers, enterprises can better understand travelers' needs from their comments and improve their future service.
Three tasks are carried out in this study: (1) multi-class classification of a review's overall sentiment, (2) multi-label classification of the aspects a review covers, and (3) multi-output classification of the sentiment toward each aspect. Four pre-trained language models are combined with transfer learning, namely BERT, RoBERTa, DistilBERT, and ALBERT. The accuracy of each pre-trained model is examined on the multi-class, multi-label, and multi-output classification experiments, to check whether travelers' feedback can be judged reliably along these aspects.
This study adopts strict accuracy as the evaluation metric, calculated as the number of correctly predicted samples divided by the total number of samples. The experimental results show that the BERT pre-trained language model reaches 96% accuracy on multi-class classification of review sentiment; on multi-label classification, BERT and ALBERT achieve a strict accuracy of 91%; and on multi-output classification of aspect-level sentiment, BERT and RoBERTa perform best, with a strict accuracy of 85%. Although DistilBERT spends the least time on the three classification tasks, its accuracy is not better than the other models on any of them.
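As a concrete illustration of the transfer-learning setup described above, the following is a minimal sketch of fine-tuning the bert-base-chinese checkpoint (cited in the references) for the 4-class overall-sentiment task with the HuggingFace transformers library. The example reviews, label values, and hyperparameters are assumptions for illustration only, not the thesis's actual data or settings.

import torch
from transformers import BertTokenizerFast, BertForSequenceClassification

# Pre-trained Chinese BERT with a freshly initialized 4-class classification head.
tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
model = BertForSequenceClassification.from_pretrained("bert-base-chinese", num_labels=4)

# Hypothetical hotel reviews and 4-class sentiment labels (illustrative only).
texts = ["房間很乾淨，服務人員也很親切", "隔音很差，晚上非常吵"]
labels = torch.tensor([0, 2])

enc = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")

# One fine-tuning step: cross-entropy loss over the 4 sentiment classes.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
out = model(**enc, labels=labels)
out.loss.backward()
optimizer.step()

For the multi-label and multi-output tasks, the same encoder would presumably be paired with a different head (for example, 6 sigmoid outputs for the aspects, or 6 groups of 4-way softmax outputs for aspect-level sentiment); those heads are not shown here.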
Abstract (Chinese) i
Abstract (English) ii
Acknowledgments iv
Table of Contents v
List of Tables viii
List of Figures x
Chapter 1: Introduction 1
1.1 Research Background and Motivation 1
1.2 Research Objectives 3
1.3 Research Process 3
1.4 Thesis Structure 4
Chapter 2: Literature Review 6
2.1 Transfer Learning 6
2.1.1 Multi-class Classification 6
2.1.2 Multi-label Classification 7
2.1.3 Multi-output Classification 8
2.2 Transformer 9
2.2.1 Encoder 10
2.2.2 Decoder 10
2.2.3 Self-attention Mechanism 10
2.2.4 Multi-Head Attention 12
2.3 BERT (Bidirectional Encoder Representations from Transformers) 12
2.3.1 Masked Language Model (Masked LM) 13
2.3.2 Next Sentence Prediction (NSP) 13
2.3.3 Downstream Tasks (Fine-Tuning) 13
2.3.4 Embedding Layer 14
2.4 RoBERTa (Robustly Optimized BERT Approach) 15
2.5 DistilBERT (Distilled Version of BERT) 16
2.6 ALBERT (A Lite BERT for Self-supervised Learning of Language Representations) 16
2.6.1 Parameter Reduction Techniques 17
2.6.2 Inter-sentence Coherence Loss 18
Chapter 3: Experimental Design and Datasets 19
3.1 Classification Tasks 19
3.1.1 Multi-class Classification of Review Sentiment 19
3.1.2 Multi-label Classification of Review Aspects 20
3.1.3 Multi-output Classification of Aspect-level Sentiment 20
3.1.4 The Four Pre-trained Models 21
3.1.5 Evaluation Method 22
3.1.6 Formulas for the Evaluation Metrics 23
3.2 Data Sources 25
3.2.1 Review Sentiment Classification Dataset 25
3.2.2 Review Aspect Classification Dataset 26
3.2.3 Aspect-level Sentiment Dataset 26
3.3 Model Information 28
3.4 Experimental Environment and Versions 29
Chapter 4: Multi-class Classification of Overall Review Sentiment (4 Classes) 30
4.1 Data Preprocessing 30
4.2 Experimental Parameters 32
4.3 Analysis of Experimental Results 35
Chapter 5: Multi-label Classification of Reviews over 6 Aspects 37
5.1 Data Preprocessing 37
5.2 Experimental Parameters 39
5.3 Analysis of Experimental Results 42
Chapter 6: Multi-output Classification of 4-Class Sentiment over 6 Review Aspects 45
6.1 Data Preprocessing 45
6.2 Selection of Model Outputs 47
6.3 Sentiment and Aspect Weight Tuning Experiments 49
6.3.1 Experimental Parameters 50
6.4 Analysis of Experimental Results 53
Chapter 7: Conclusion 57
7.1 Conclusions 57
7.1.1 Scenario Demonstration 58
7.2 Future Work 60
References 61
Appendix 1 63
1. Bu, J., Ren, L., Zheng, S., Yang, Y., Wang, J., Zhang, F., Wu, W. (2021) "ASAP: A Chinese Review Dataset Towards Aspect Category Sentiment Analysis and Rating Prediction", arXiv:2103.06605.

2. Tao, J., Fang, X. (2020) "Toward Multi-label Sentiment Analysis: A Transfer Learning Based Approach", https://doi.org/10.1186/s40537-019-0278-0.

3. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., Polosukhin, I. (2017) "Attention Is All You Need", arXiv:1706.03762.

4. Devlin, J., Chang, M.-W., Lee, K., Toutanova, K. (2018) "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding", arXiv:1810.04805.

5. Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V. (2019) "RoBERTa: A Robustly Optimized BERT Pretraining Approach", arXiv:1907.11692.

6. Sanh, V., Debut, L., Chaumond, J., Wolf, T. (2019) "DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter", arXiv:1910.01108.

7. Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R. (2019) "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations", arXiv:1909.11942.

8. Lee, M. (2019) "進擊的 BERT：NLP 界的巨人之力與遷移學習", https://leemeng.tw/attack_on_bert_transfer_learning_in_nlp.html.

9. Alammar, J. (2018) "A Visual Guide to Using BERT for the First Time", https://jalammar.github.io/a-visual-guide-to-using-bert-for-the-first-time/.

10. Google AI (2018) "bert-base-chinese", https://huggingface.co/bert-base-chinese.

11. 哈工大訊飛聯合實驗室 (HFL) (2019) "chinese-roberta-wwm-ext", https://huggingface.co/hfl/chinese-roberta-wwm-ext.

12. Abdaoui, A., Pradel, C., Sigel, G. (2020) "distilbert-base-zh-cased", https://huggingface.co/Geotrend/distilbert-base-zh-cased.

13. Mu Yang at CKIP (2020) "albert-base-chinese", https://huggingface.co/ckiplab/albert-base-chinese.