
National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

Author: Tsu-An Lin (林祖安)
Title: A Deep Learning Based Natural Language Conversation System: Studies on Financial Datasets
Advisor: Wei-Po Lee (李偉柏)
Degree: Master's
Institution: National Sun Yat-sen University
Department: Department of Information Management
Discipline: Computing
Field: General Computing
Thesis Type: Academic thesis
Year of Publication: 2017
Academic Year: 105 (2016–17)
Language: Chinese
Pages: 64
Keywords: Long Short-Term Memory (LSTM); Recurrent Neural Network (RNN); Question answering system; Conversation system; Natural Language Processing (NLP); Deep learning
Usage statistics:
  • Cited by: 1
  • Views: 1569
  • Downloads: 18
  • Bookmarked: 1
In recent years, advances in deep learning have improved data prediction, classification, and recognition, and long short-term memory (LSTM) networks, a form of recurrent neural network, have been shown to perform well on natural language processing tasks. This study builds a natural language conversation system on deep learning models: the models are trained on question-answer data so that the system can converse with users and solve their problems, and the system grows through reasoning and knowledge-base expansion.
The study combines word2vec, deep learning models, and an external knowledge base into a complete natural language conversation system. Financial question-answer pairs are collected from StackExchange, the text is converted into numeric vectors, and word2vec is used to learn word embeddings that are injected into the deep learning model, which improves the training of the prediction model. Using DBpedia as an external knowledge base lets the system answer questions that are not originally in its own knowledge base and then add them to it, so the system acquires more knowledge through conversation.
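The thesis does not publish its code, but the core idea of scoring candidate answers against a question via word embeddings can be sketched as follows. This is a minimal illustration only: the tiny hand-made vectors and vocabulary below are assumptions standing in for the thesis's trained word2vec embeddings, and the sentence embedding is a plain average rather than a learned deep model.

```python
import math

# Toy word vectors standing in for word2vec output (assumed, not the
# thesis's trained embeddings); real vectors would have ~100+ dimensions.
VECTORS = {
    "interest": [0.9, 0.1, 0.0],
    "rate":     [0.8, 0.2, 0.1],
    "loan":     [0.7, 0.3, 0.0],
    "weather":  [0.0, 0.1, 0.9],
    "sunny":    [0.1, 0.0, 0.8],
}

def embed(text):
    """Embed a sentence by averaging the vectors of its known words."""
    words = [w for w in text.lower().split() if w in VECTORS]
    dim = len(next(iter(VECTORS.values())))
    if not words:
        return [0.0] * dim
    return [sum(VECTORS[w][i] for w in words) / len(words) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def best_answer(question, candidates):
    """Return the candidate answer whose embedding is closest to the question's."""
    q = embed(question)
    return max(candidates, key=lambda a: cosine(q, embed(a)))

print(best_answer("what is the loan interest rate",
                  ["the rate depends on the loan term",
                   "it will be sunny weather tomorrow"]))
```

In the thesis the scoring function is a trained deep network (Embedding or Convolutional LSTM models, per the table of contents) rather than a raw cosine over averaged vectors; only the retrieve-by-similarity structure is shown here.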
The experiments examine how different strategies for selecting wrong answers during training, different question-answer datasets, different model structures, and different degrees of data coverage affect the model's predictive ability. Exposing the model to more wrong answers during training yields better predictions. The external knowledge base has some usage restrictions, but when these are met the system can answer questions and expand its knowledge base successfully. A question-similarity model implements the reasoning function: given a question, the system first finds a similar question in the knowledge base and then retrieves its answer.
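The abstract does not specify the exact training objective, so as an illustrative stand-in, here is one common way such wrong-answer ("negative") selection strategies are used: a max-margin hinge loss that penalizes the model when a sampled wrong answer scores close to the correct one, combined with a "hardest negative" strategy. The word-overlap scorer below is a toy assumption, not the thesis's model.

```python
def hinge_loss(score_pos, score_neg, margin=0.2):
    """Max-margin loss: non-zero when a wrong answer scores within
    `margin` of (or above) the correct answer."""
    return max(0.0, margin - score_pos + score_neg)

def score(question, answer):
    """Toy scorer standing in for the trained QA model: Jaccard word
    overlap between question and answer (an assumption for illustration)."""
    q, a = set(question.split()), set(answer.split())
    return len(q & a) / len(q | a) if q | a else 0.0

def hardest_negative(question, pool):
    """One possible selection strategy: among candidate wrong answers,
    train on the one the current model scores highest (most confusable)."""
    return max(pool, key=lambda a: score(question, a))

wrong_pool = ["stocks rose sharply today", "a loan is a form of debt"]
q = "what is a loan"
neg = hardest_negative(q, wrong_pool)
loss = hinge_loss(score(q, "a loan is borrowed money"), score(q, neg))
print(neg, round(loss, 3))
```

Sampling more (or harder) negatives gives the loss more informative gradients per question, which is one plausible reading of the finding that seeing more wrong answers during training improves prediction.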
Over the last few years, many studies have shown that the performance of data prediction, classification, and recognition can be largely improved using deep learning models. Recurrent neural networks (RNNs) and long short-term memory (LSTM) models have also been shown to achieve good results in natural language processing. Following these successes, this study uses deep learning models to develop a natural language conversation system. The system is trained on question-answering data to converse with users and solve their problems, and it uses reasoning techniques and knowledge-base expansion to enrich the human-machine dialogue.
In this thesis, I employed several computational methods, including word2vec, deep learning models, and an external knowledge base, to construct a complete natural language conversation system. In addition, I conducted extensive experiments with different strategies for choosing wrong answers during model training, different question-answer datasets, different model structures, and different degrees of data coverage to investigate their influence on the model's predictive ability. The results show that the more wrong answers are used in the training process, the better the model's predictive ability. I also found that an external knowledge base was beneficial to the system, although it imposed some restrictions; with extra effort to meet them, the knowledge base could be expanded successfully. Additionally, by building a question-similarity model, the system can perform reasoning: it looks for a similar stored question and retrieves its corresponding answer from the knowledge base in response to the user's query.
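The reasoning step described above, matching an incoming query to a similar stored question and reusing that question's answer, can be sketched as below. The Jaccard word-overlap similarity and the threshold value are simplified assumptions standing in for the thesis's trained question-similarity model.

```python
def similarity(q1, q2):
    """Word-overlap (Jaccard) similarity; a stand-in for the learned
    question-similarity model described in the thesis."""
    s1, s2 = set(q1.lower().split()), set(q2.lower().split())
    return len(s1 & s2) / len(s1 | s2) if s1 | s2 else 0.0

def answer_by_reasoning(query, knowledge_base, threshold=0.4):
    """Find the most similar stored question; if it is similar enough,
    reuse its answer, otherwise signal that the query is uncovered
    (in the thesis, the fallback is the external knowledge base, DBpedia)."""
    best_q = max(knowledge_base, key=lambda q: similarity(query, q))
    if similarity(query, best_q) >= threshold:
        return knowledge_base[best_q]
    return None

kb = {
    "how do i open a savings account": "Visit a bank branch with ID.",
    "what is compound interest": "Interest computed on principal plus past interest.",
}
print(answer_by_reasoning("what is compound interest exactly", kb))
```

When the lookup returns `None`, the system described in the abstract would query the external knowledge base and store the new question-answer pair, which is how the knowledge base grows through conversation.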
Table of Contents
Thesis Approval Certificate
Acknowledgments
Abstract (Chinese)
Abstract (English)
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
Chapter 2 Literature Review
2.1 Question Answering Systems
2.2 Deep Learning
2.2.1 Recurrent Neural Networks (RNNs)
2.2.2 Long Short-Term Memory (LSTM)
2.3 Deep Learning Based Question Answering Systems
Chapter 3 Research Methods and Procedures
3.1 System Architecture
3.2 Natural Language Processing and Question-Answering Model Construction
3.2.1 Deep Learning Models
3.2.2 Research Process and Steps
3.3 Reasoning and Knowledge-Base Expansion
3.3.1 Reasoning
3.3.2 Knowledge-Base Expansion
3.4 Dialogue Processing
Chapter 4 Results
4.1 Data Preparation and Text Processing
4.1.1 Financial QA Dataset
4.1.2 Insurance QA Dataset
4.1.3 Similar Question Dataset
4.2 Shuffling the Training Data
4.3 Effects of Wrong-Answer Selection Strategies
4.3.1 Financial Data: Embedding Model Results
4.3.2 Financial Data: Convolutional LSTM Model Results
4.3.3 Wrong-Question Coverage of Each Strategy
4.3.4 Insurance Data: Convolutional LSTM Model Results
4.4 Data Coverage Analysis
4.4.1 Embedding Model Results
4.4.2 Convolutional LSTM Model Results
4.5 External Knowledge Base: DBpedia
4.6 Reasoning: Question Similarity Model
4.7 General Discussion
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References
References
Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends® in Machine Learning, 2(1), 1–127.
Bengio, Y., Courville, A., & Vincent, P. (2012). Representation Learning: A Review and New Perspectives. arXiv:1206.5538 [Cs]. Retrieved from http://arxiv.org/abs/1206.5538
Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2), 157–166.
Bordes, A., Chopra, S., & Weston, J. (2014). Question Answering with Subgraph Embeddings. arXiv:1406.3676 [Cs]. Retrieved from http://arxiv.org/abs/1406.3676
Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., & Kuksa, P. (2011). Natural Language Processing (Almost) from Scratch. J. Mach. Learn. Res., 12, 2493–2537.
Deng, L. (2014). Deep Learning: Methods and Applications. Foundations and Trends® in Signal Processing, 7(3–4), 197–387.
Graves, A. (2013). Generating Sequences With Recurrent Neural Networks. arXiv:1308.0850 [Cs]. Retrieved from http://arxiv.org/abs/1308.0850
Graves, A., & Jaitly, N. (2014). Towards End-To-End Speech Recognition with Recurrent Neural Networks. In Proceedings of the 31st International Conference on Machine Learning (ICML-14) (pp. 1764–1772). Retrieved from http://machinelearning.wustl.edu/mlpapers/papers/icml2014c2_graves14
Graves, A., Liwicki, M., Fernández, S., Bertolami, R., Bunke, H., & Schmidhuber, J. (2009). A Novel Connectionist System for Unconstrained Handwriting Recognition. IEEE Trans. Pattern Anal. Mach. Intell., 31(5), 855–868.
Graves, A., Mohamed, A., & Hinton, G. (2013). Speech Recognition with Deep Recurrent Neural Networks. arXiv:1303.5778 [Cs]. Retrieved from http://arxiv.org/abs/1303.5778
Green, B. F., Jr., Wolf, A. K., Chomsky, C., & Laughery, K. (1961). Baseball: An Automatic Question-answerer. In Papers Presented at the May 9-11, 1961, Western Joint IRE-AIEE-ACM Computer Conference (pp. 219–224). New York, NY, USA: ACM.
Green, B., Wolf, A., Chomsky, C., & Laughery, K. (1986). Readings in Natural Language Processing. In B. J. Grosz, K. Sparck-Jones, & B. L. Webber (Eds.) (pp. 545–549). San Francisco, CA, USA: Morgan Kaufmann Publishers Inc. Retrieved from http://dl.acm.org/citation.cfm?id=21922.24354
Han, L., Yu, Z.-T., Qiu, Y.-X., Meng, X.-Y., Guo, J.-Y., & Si, S.-T. (2008). Research on passage retrieval using domain knowledge in Chinese question answering system. In 2008 International Conference on Machine Learning and Cybernetics (Vol. 5, pp. 2603–2606).
Hao, X., Chang, X., & Liu, K. (2007). A Rule-based Chinese Question Answering System for Reading Comprehension Tests. In Third International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2007. IIHMSP 2007 (Vol. 2, pp. 325–329).
Hihi, S. E., & Bengio, Y. (1996). Hierarchical Recurrent Neural Networks for Long-Term Dependencies. In D. S. Touretzky & M. E. Hasselmo (Eds.), Advances in Neural Information Processing Systems 8 (pp. 493–499). MIT Press. Retrieved from http://papers.nips.cc/paper/1102-hierarchical-recurrent-neural-networks-for-long-term-dependencies.pdf
Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation, 9(8), 1735–1780.
Hu, B., Lu, Z., Li, H., & Chen, Q. (2014). Convolutional Neural Network Architectures for Matching Natural Language Sentences. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 27 (pp. 2042–2050). Curran Associates, Inc. Retrieved from http://papers.nips.cc/paper/5550-convolutional-neural-network-architectures-for-matching-natural-language-sentences.pdf
Huang, J., Zhou, M., & Yang, D. (2007). Extracting Chatbot Knowledge from Online Discussion Forums. In Proceedings of the 20th International Joint Conference on Artifical Intelligence (pp. 423–428). San Francisco, CA, USA: Morgan Kaufmann Publishers Inc. Retrieved from http://dl.acm.org/citation.cfm?id=1625275.1625342
Ittycheriah, A., Franz, M., Zhu, W., Ratnaparkhi, A., & Mammone, R. J. (2001). IBM's Statistical Question Answering System. Retrieved from https://www.researchgate.net/publication/2875435_IBM’s_Statistical_Question_Answering_System
Jean, S., Cho, K., Memisevic, R., & Bengio, Y. (2014). On Using Very Large Target Vocabulary for Neural Machine Translation. arXiv:1412.2007 [Cs]. Retrieved from http://arxiv.org/abs/1412.2007
Kiros, R., Salakhutdinov, R., & Zemel, R. S. (2014). Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models. arXiv:1411.2539 [Cs]. Retrieved from http://arxiv.org/abs/1411.2539
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. In F. Pereira, C. J. C. Burges, L. Bottou, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 25 (pp. 1097–1105). Curran Associates, Inc. Retrieved from http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
Meng, Y., Rumshisky, A., & Romanov, A. (2017). Temporal Information Extraction for Question Answering Using Syntactic Dependencies in an LSTM-based Architecture. arXiv:1703.05851 [Cs]. Retrieved from http://arxiv.org/abs/1703.05851
Mikolov, T., Deoras, A., Povey, D., Burget, L., & Černocký, J. (2011). Strategies for training large scale neural network language models. In 2011 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) (pp. 196–201).
Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Distributed Representations of Words and Phrases and their Compositionality. arXiv:1310.4546 [Cs, Stat]. Retrieved from http://arxiv.org/abs/1310.4546
Nguyen, M.-T., Phan, V.-A., Nguyen, T.-S., & Nguyen, M.-L. (2016). Learning to rank questions for community question answering with ranking svm. arXiv Preprint arXiv:1608.04185. Retrieved from https://arxiv.org/abs/1608.04185
Pascanu, R., Mikolov, T., & Bengio, Y. (2012). On the difficulty of training Recurrent Neural Networks. arXiv:1211.5063 [Cs]. Retrieved from http://arxiv.org/abs/1211.5063
Ravichandran, D., & Hovy, E. (2002). Learning Surface Text Patterns for a Question Answering System. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics (pp. 41–47). Stroudsburg, PA, USA: Association for Computational Linguistics.
Riloff, E., & Thelen, M. (2000). A Rule-based Question Answering System for Reading Comprehension Tests. In Proceedings of the 2000 ANLP/NAACL Workshop on Reading Comprehension Tests As Evaluation for Computer-based Language Understanding Sytems - Volume 6 (pp. 13–19). Stroudsburg, PA, USA: Association for Computational Linguistics.
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1. In D. E. Rumelhart, J. L. McClelland, & C. PDP Research Group (Eds.) (pp. 318–362). Cambridge, MA, USA: MIT Press. Retrieved from http://dl.acm.org/citation.cfm?id=104279.104293
Sainath, T. N., Mohamed, A.-R., Kingsbury, B., & Ramabhadran, B. (2013). Deep convolutional neural networks for LVCSR. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 8614–8618).
Schmidhuber, J. (2015). Deep Learning in Neural Networks: An Overview. Neural Networks, 61, 85–117.
Song, H. A., & Lee, S.-Y. (2013). Hierarchical Representation Using NMF. In Neural Information Processing (ICONIP 2013) (pp. 466–473). Springer Berlin Heidelberg.
SPARQL Query Language for RDF. (n.d.). Retrieved November 19, 2016, from https://www.w3.org/TR/rdf-sparql-query/
Sutskever, I. (2013). Training recurrent neural networks. University of Toronto. Retrieved from https://www.cs.utoronto.ca/~ilya/pubs/ilya_sutskever_phd_thesis.pdf
Sutskever, I., Martens, J., & Hinton, G. E. (2011). Generating Text with Recurrent Neural Networks. In Proceedings of the 28th International Conference on Machine Learning (ICML-11) (pp. 1017–1024). Retrieved from https://www.researchgate.net/publication/221345823_Generating_Text_with_Recurrent_Neural_Networks
Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to Sequence Learning with Neural Networks. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 27 (pp. 3104–3112). Curran Associates, Inc. Retrieved from http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., … Rabinovich, A. (2014). Going Deeper with Convolutions. arXiv:1409.4842 [Cs]. Retrieved from http://arxiv.org/abs/1409.4842
Unger, C., Bühmann, L., Lehmann, J., Ngonga Ngomo, A.-C., Gerber, D., & Cimiano, P. (2012). Template-based Question Answering over RDF Data. In Proceedings of the 21st International Conference on World Wide Web (pp. 639–648). New York, NY, USA: ACM.
Wang, B., Liu, K., & Zhao, J. (2016). Inner attention based recurrent neural networks for answer selection. In The Annual Meeting of the Association for Computational Linguistics. Retrieved from http://www.aclweb.org/anthology/P/P16/P16-1122.pdf
Woods, W. A. (1973). Progress in Natural Language Understanding: An Application to Lunar Geology. In Proceedings of the June 4-8, 1973, National Computer Conference and Exposition (pp. 441–450). New York, NY, USA: ACM.
Yih, W., Chang, M.-W., Meek, C., & Pastusiak, A. (2013). Question Answering Using Enhanced Lexical Semantic Models. Microsoft Research. Retrieved from https://www.microsoft.com/en-us/research/publication/question-answering-using-enhanced-lexical-semantic-models/
Yu, L., Hermann, K. M., Blunsom, P., & Pulman, S. (2014). Deep Learning for Answer Sentence Selection. arXiv:1412.1632 [Cs]. Retrieved from http://arxiv.org/abs/1412.1632
Zhang, K., & Zhao, J. (2010). A Chinese question-answering system with question classification and answer clustering. In 2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery (FSKD) (Vol. 6, pp. 2692–2696).