National Digital Library of Theses and Dissertations in Taiwan

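Graduate student: 周均霖
Thesis title: 結合預訓練之影像與YouTube特徵應用於留言之多維度情感分析
Thesis title (foreign language): Combining Pretrained Image and YouTube Features Embedding on Dimensional Sentiment Classification of YouTube Comments
Advisor: 吳政隆
Advisor (foreign language): WU, JHENG-LONG
Oral examination committee: 林俊杰, 蘇明祥
Oral examination committee (foreign language): LIN, JYUN-JIE; SU, MING-HSIANG
Oral defense date: 2022-07-18
Degree: Master's
Institution: Soochow University (東吳大學)
Department: Master's Program, School of Big Data Management (巨量資料管理學院碩士學位學程)
Discipline: Computer Science
Sub-discipline: Computer Applications
Thesis type: Academic thesis
Year of publication: 2022
Academic year of graduation: 110
Language: Chinese
Number of pages: 103
Chinese keywords: 情感分析, 詞嵌入, 預訓練模型, 深度學習, 自然語言處理
Foreign-language keywords: Sentiment Analysis, Word Embedding, Pretrained Model, Deep Learning, Natural Language Processing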
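Abstract: YouTube is one of today's most popular social platforms. To help YouTubers better understand what users think, so that they can create higher-quality video content while also maintaining a free and civil atmosphere of discussion in the comment section, this study designs five multidimensional sentiment analysis indicators: a video content preference indicator, a YouTuber preference indicator, an excitement level indicator, a sexual content indicator, and an irony indicator. It collects 12,500 comments, annotates and organizes them, and builds a multidimensional Chinese sentiment dataset of YouTube video comments as the dataset for this study. To improve classification performance on the indicators, the study proposes methods for extracting and using pretrained feature vectors: ResNet-152 is trained on the videos' cover images and Word2Vec on the channels' video titles, yielding three pretrained feature vectors: an image vector, a YouTuber-Sociality Embedding, and a YouTube-Channel Embedding. These three feature vectors are combined with 13 models to predict the indicators across 7 classification tasks. The results show that the proposed pretrained additional features give the models more features to learn from and thereby effectively improve the classification performance of every model. Pretrained vectors combined with the MMBT model perform best, with classification performance higher than that of the other 12 models.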
Abstract (English): YouTube is one of the most widely used social media platforms worldwide, with more than 2.6 billion monthly active users. Although YouTube provides some statistical tools that let video content creators (usually called YouTubers) measure how their videos perform, it lacks a textual analysis tool for user comments, which are an important reference for understanding audience preferences. Previous studies have used many methods to identify semantic polarity (positive, negative, neutral) or to classify audience emotions, but none of them distinguishes whether an audience preference relates to the video content or to the YouTuber. To help YouTubers understand their audience's thoughts clearly and improve the quality of their video content, this research collects 12,500 comments and annotates them as a Chinese corpus with five indicators: video preference, YouTuber preference, excitement level, sexual content, and irony. To improve classifier performance, this research introduces three feature vectors: an image vector, a YouTuber-Sociality Embedding, and a YouTube-Channel Embedding, which are produced with the pretrained models ResNet-152 and Word2Vec. When these pretrained feature vectors are added to 13 kinds of models, the results show that performance on the sentiment detection tasks improves significantly.
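The record does not say how the three pretrained feature vectors are attached to each model. As a minimal sketch of one common option, plain concatenation of a comment's text vector with whichever pretrained vectors are selected, the snippet below uses toy numbers. The function name `fuse_features`, the keys `image`, `sociality` and `channel`, and all vector values are assumptions made for illustration and do not come from the thesis.

```python
from typing import Dict, List, Sequence

# Hypothetical short keys for the three pretrained feature vectors
# named in the abstract (image vector, YouTuber-Sociality Embedding,
# YouTube-Channel Embedding).
FEATURE_KEYS = ("image", "sociality", "channel")


def fuse_features(
    text_vec: Sequence[float],
    extras: Dict[str, Sequence[float]],
    use: Sequence[str] = FEATURE_KEYS,
) -> List[float]:
    """Concatenate a comment's text vector with the selected pretrained vectors."""
    fused = list(text_vec)
    for key in use:
        if key not in FEATURE_KEYS:
            raise ValueError(f"unknown feature: {key}")
        fused.extend(extras[key])
    return fused


# Toy values only; real vectors would come from ResNet-152 and Word2Vec.
text_vec = [0.1, 0.2, 0.3]
extras = {
    "image": [1.0, 0.0],
    "sociality": [0.5],
    "channel": [0.25, 0.75],
}

full = fuse_features(text_vec, extras)                       # text + all three vectors
image_only = fuse_features(text_vec, extras, use=["image"])  # text + image vector
text_only = fuse_features(text_vec, extras, use=[])          # text vector alone
print(len(full), len(image_only), len(text_only))
```

Selecting different subsets through `use` gives a simple way to compare a text-only baseline against runs with one or more added vectors, with the fused vector then passed to whatever classifier is being evaluated.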
Acknowledgements
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables

Chapter 1 Introduction
Section 1 Research Background
Section 2 Research Motivation and Objectives
Section 3 Research Contributions
Section 4 Research Process

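Chapter 2 Literature Review
Section 1 Sentiment Analysis
Section 2 Text Classification Techniques
Section 3 Studies Applying Sentiment Analysis to YouTube
Section 4 Multidimensional Sentiment Analysis Indicators
1. Video Content Preference Indicator and YouTuber Preference Indicator
2. Excitement Level Indicator
3. Sexual Content Indicator and Irony Indicator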

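Chapter 3 Research Methods
Section 1 Data Collection and Filtering
Section 2 Data Annotation
1. Multidimensional Sentiment Analysis Indicators
2. Annotation Consistency Evaluation
3. Organizing the Annotated Data
Section 3 Data Preprocessing
1. Data Cleaning
2. Word Segmentation and Named Entity Recognition
Section 4 Pretrained Feature Vectors
1. Image Vector Extraction
2. YouTube Feature Vector Extraction
3. Using the Pretrained Features
4. Combining Pretrained Feature Vectors in Model Training
5. Multidimensional Sentiment Classification Tasks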

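Chapter 4 Experimental Results and Analysis
Section 1 Experimental Data
1. Data Sources: 25 Channels
2. Channel Distribution of the YouTube-Channel Embedding
3. Channel Distribution of the YouTuber-Sociality Embedding
4. Annotation Consistency Scores
5. Dataset
6. Descriptive Statistics of the Segmented Data
Section 2 Experimental Design
1. Classification Models Used in the Experiments
2. Model Parameter Settings
Section 3 Experimental Results and Analysis
1. Analysis of Model Classification Results
2. Analysis of Overall Task Results
3. Analysis of Pretrained Vector Usage
Section 4 Managerial Implications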

Chapter 5 Conclusion
Section 1 Summary
Section 2 Research Limitations
Section 3 Future Work

References
Appendix 1 Evaluation Scores for Mn-org, Mn-iyc, Mn-iy’c, Mn-iy, Mn-iy’
Appendix 2 Evaluation Scores for Mn-ic, Mn-i, Mn-y, Mn-y’, Mn-c


Electronic full text (Internet public release date: 2027-09-26)