National Digital Library of Theses and Dissertations in Taiwan

Author: 蔡承宏
Author (English): TSAI, CHENG-HONG
Title: 具人格特質之有個性的聊天機器人
Title (English): A chatbot with personality traits
Advisor: 陳自強
Advisor (English): CHEN, TZU-CHIANG
Committee members: 陳自強, 薛幼苓, 賴文能
Committee members (English): CHEN, TZU-CHIANG; HSUEH, YU-LING; LIE, WEN-NUNG
Oral defense date: 2022-08-17
Degree: Master's
Institution: National Chung Cheng University (國立中正大學)
Department: Graduate Institute of Electrical Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2022
Graduation academic year: 110
Language: Chinese
Pages: 42
Chinese keywords: 深度神經網路, 自然語言處理, 膠囊網路, 自注意力機制, XLNet, 精緻高速公路, 聊天機器人, 混合專家
English keywords: deep neural network; natural language processing; XLNet; capsule network; self-attention mechanism; refined highway; chatbot; mixture of experts
Statistics:
  • Cited by: 0
  • Views: 249
  • Downloads: 0
  • Bookmarked: 1
In recent years, deep learning and natural language processing have produced many deep neural network models for prediction and classification, and chatbots have adopted many of these techniques. One remaining difficulty is giving a chatbot a personality. Several approaches have been developed toward this goal: a generative chatbot can inject a personality vector during text generation to alter the syntax and semantics of its output, or a model can be trained directly on a personality-labeled dataset. This thesis builds a personalized chatbot on top of a retrieval-based chatbot and automatic personality trait recognition. It proposes a personality-matching calculation with three variants (absolute difference, root mean square, and cosine similarity), trains the recognition model with stop-word, punctuation, and modal-particle features, and introduces three model architectures for automatic personality trait recognition. Experiments demonstrate that training with stop-word, punctuation, and modal-particle features is effective. All three architectures use a pre-trained XLNet to extract word-semantic features, which are then strengthened with self-attention, Refined Highway, mixture of experts, and capsule networks, respectively. The resulting average accuracies are 63.01% on the English MyPersonality dataset and 62.64% on its Chinese version, 70.16% on the English Essays dataset and 70.08% on its Chinese version, 65.46% on English FriendsPersona and 62.57% on Chinese FriendsPersona, and 64.79% on the English Fusion dataset and 65.46% on its Chinese version.
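The three personality-matching measures named in the abstract can be sketched as follows. This is an illustrative guess, not the thesis's actual implementation: it assumes each party is represented by a numeric Big Five trait vector, and the function names are hypothetical.

```python
import math

def match_absolute(a, b):
    """Mean absolute difference between two trait vectors (lower = closer match)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def match_rms(a, b):
    """Root-mean-square difference between two trait vectors (lower = closer match)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def match_cosine(a, b):
    """Cosine similarity between two trait vectors (higher = closer match)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical Big Five vectors:
# (openness, conscientiousness, extraversion, agreeableness, neuroticism)
user = (0.8, 0.6, 0.4, 0.7, 0.2)
candidate = (0.7, 0.5, 0.5, 0.6, 0.3)
print(match_absolute(user, candidate))
print(match_rms(user, candidate))
print(match_cosine(user, candidate))
```

Note that the two difference-based scores and the cosine score point in opposite directions, so a retrieval system would rank candidate responses ascending by the first two but descending by the third.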
Acknowledgments
Chinese Abstract
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Preface
1.2 Background and Motivation
1.3 Objectives
1.4 Thesis Organization
Chapter 2 Literature Review
2.1 Chatbot Technology
2.1.1 Retrieval Chatbot Technology
2.1.2 Chatbots with Personality Traits
2.1.3 Evaluation of Chatbots
2.1.4 Chatbot Evaluation Datasets
2.2 Attention Mechanism
2.2.1 Self-Attention
2.2.2 Transformer-XL
2.2.3 Switch Transformer
2.2.4 Refined Highway
2.3 XLNet
2.4 Capsule Network
2.5 Automatic Personality Trait Recognition
2.5.1 Personality Trait Indicators
2.5.2 Automatic Personality Trait Recognition Datasets
2.5.3 Automatic Personality Trait Recognition
Chapter 3 Methodology
3.1 Performance Differences Between Chinese and English
3.2 Adding Stop-Word, Punctuation, and Modal-Particle Features for Recognition
3.3 Architecture 1 - XLNet Self-Attention Caps
3.4 Architecture 2 - XLNet Refined Highway Caps
3.5 Architecture 3 - XLNet Switch Refined Highway Caps
3.6 Chatbot with Personality
Chapter 4 Experimental Results
4.1 Experimental Environment
4.2 Data Configuration for Word Similarity, Word Analogy, and Document Classification
4.3 Training Data Configuration and Validation Criteria for Automatic Personality Trait Recognition
4.4 Subjective Evaluation of the Chatbot
4.5 Hyperparameter Settings for Training the Automatic Personality Trait Recognition Model
4.6 Experimental Validation and Comparison of the Proposed Methods
4.6.1 Chinese vs. English Performance
4.6.2 Effect of Adding Stop-Word, Punctuation, and Modal-Particle Features
4.6.3 Automatic Personality Trait Recognition Compared with Related Work
4.6.4 Chatbot Retrieval System Compared with Related Work
4.6.5 Chatbot Personality-Matching Comparison
4.7 Model Improvement Analysis and Experiments
4.7.1 Refined Highway Layer-Count Experiment
4.7.2 Refined Highway and Switching-Layer Connection Experiment
4.7.3 Removing the Capsule Network
Chapter 5 Conclusions and Future Work
References



Electronic full text (publicly available online from 2025-09-01)