National Digital Library of Theses and Dissertations in Taiwan


Graduate student: Yu Che-Cheng (游哲誠)
Thesis title: A Seq2seq Generation-Based Approach to Short Text Conversation (使用深度學習Seq2seq方法處理短文本對話生成)
Advisor: Wu Shih-Hung (吳世弘)
Oral defense committee: Day Min-Yuh (戴敏育), Cheng Wen-Chang (鄭文昌)
Oral defense date: 2017-07-12
Degree: Master's
Institution: Chaoyang University of Technology (朝陽科技大學)
Department: Department of Computer Science and Information Engineering (資訊工程系)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Publication year: 2017
Graduation academic year: 106
Language: Chinese
Keywords: dialog system, short text conversation generation, Seq2seq, NTCIR, feedback mechanism

Usage statistics:
Cited by: 3
Views: 1675
Downloads: 210
Bookmarked: 1

Abstract:

Automated dialogue systems are increasingly common and are used by many companies. Most current systems, however, produce replies with rule-based or retrieval-based methods. Such replies are drawn from a predefined response pool, so they show little diversity.

In recent years many researchers have studied generation-based approaches to producing replies. This study uses the deep learning sequence-to-sequence (Seq2seq) technique to address short text conversation (STC) generation, and we participated in the generation-based subtask of STC-2 at NTCIR-13. We used the Simplified Chinese dataset provided by the NTCIR organizers. Because most of the data is not organized as one input paired with one output, we first used a retrieval method to build a large set of post-comment sentence pairs as the training set. In addition, a basic Seq2seq model produces one fixed reply for a given input and cannot generate different sentences. We therefore add a feedback mechanism that extracts information from the previous reply and adds it to the input, so that the model can produce different replies. The experiments use LSTM and GRU units in TensorFlow and compare the two in terms of convergence speed and results.
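The feedback mechanism described in the abstract can be pictured as a loop around the generator. The following is only a minimal sketch under stated assumptions, not the thesis's implementation: `generate_with_feedback`, `extract_feedback_words`, and `toy_generate` are hypothetical names, the replies are assumed to be already segmented and tagged with Jieba-style part-of-speech tags (n, v, a), and the feedback words are assumed to be appended to the original post. The record does not say how the words are combined with the input, and the stub generator stands in for a trained Seq2seq model.

```python
from typing import Callable, List, Tuple

# A reply is a list of (word, POS tag) pairs; tags loosely follow the
# Jieba convention: n = noun, v = verb, a = adjective, d = adverb.
TaggedReply = List[Tuple[str, str]]


def extract_feedback_words(reply: TaggedReply, keep_tags: Tuple[str, ...]) -> List[str]:
    """Keep the words whose POS tag starts with one of keep_tags."""
    return [word for word, tag in reply if tag.startswith(keep_tags)]


def generate_with_feedback(
    post: List[str],
    generate: Callable[[List[str]], TaggedReply],
    keep_tags: Tuple[str, ...] = ("n", "v", "a"),
    rounds: int = 3,
) -> List[TaggedReply]:
    """Generate several replies, feeding words from each reply back into the input."""
    replies = []
    current_input = list(post)
    for _ in range(rounds):
        reply = generate(current_input)
        replies.append(reply)
        feedback = extract_feedback_words(reply, keep_tags)
        # Assumption: feedback words are appended to the original post.
        current_input = list(post) + feedback
    return replies


def toy_generate(tokens: List[str]) -> TaggedReply:
    """Hypothetical stand-in for a trained Seq2seq model: picks a canned reply by input length."""
    canned = [
        [("天气", "n"), ("很", "d"), ("好", "a")],
        [("出去", "v"), ("走走", "v")],
        [("真", "d"), ("是", "v"), ("好", "a"), ("主意", "n")],
    ]
    return canned[len(tokens) % len(canned)]


replies = generate_with_feedback(["今天", "天气", "如何"], toy_generate, rounds=2)
for reply in replies:
    print("".join(word for word, _ in reply))
```

Changing `keep_tags` to, for example, `("n", "v")` mirrors the idea of the NV experiment listed in the table of contents, where only nouns and verbs are fed back.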

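Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
Chapter 2 Background
2.1 Chinese Word Segmentation: Jieba
2.2 Recurrent Neural Networks
2.3 Long Short-Term Memory
2.4 Gated Recurrent Unit
2.5 Sequence to Sequence
Chapter 3 Method
3.1 System Architecture
3.2 Training Set
3.3 Preprocessing and Model Training
3.4 Reply Generation
Chapter 4 Experiments and Results
4.1 Experimental Setup
4.2 Evaluation Metrics
4.2.1 nG@1
4.2.2 Mean nG@1
4.3 Experimental Results
4.3.1 Experiment 1: Extracting Nouns, Verbs, and Adjectives (NVA) as Feedback Words
4.3.2 Experiment 2: Extracting Nouns and Verbs (NV) as Feedback Words
4.3.3 Experiment 3: Extracting Nouns and Adjectives (NA) as Feedback Words
4.3.4 Experiment 4: Extracting Verbs and Adjectives (VA) as Feedback Words
4.3.5 Experiment 5: Extracting Verbs (V) as Feedback Words
4.3.6 Experiment 6: Extracting Nouns (N) as Feedback Words
4.3.7 Experiment 7: Extracting Adjectives (A) as Feedback Words
4.3.8 Experiment 8: Comparing GRU, LSTM, and Multiple Layers
4.3.9 Experiment 9: Summary of Part-of-Speech Counts
4.4 NTCIR Competition Evaluation Scores
Chapter 5 Conclusions and Future Work
References
Appendix 1 Part-of-Speech Tags Used by Jieba
Appendix 2 The 30 Posts Evaluated in Experiment 8
Appendix 3 The 100 Input Sentences of NTCIR-13 STC-2
Appendix 4 nG@1 of the 100 Sentences Evaluated in NTCIR-13 STC-2
Appendix 5 Examples with Higher Scores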
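List of Tables
Table 1 Example of a short text conversation
Table 2 Part-of-speech tag reference
Table 3 Training set compiled by retrieval
Table 4 Results of adding extracted part-of-speech words to the input
Table 5 Model settings
Table 6 Results of M1, M2, and M3 using nouns, verbs, and adjectives (NVA)
Table 7 Results of M2 and M3 using nouns and verbs (NV)
Table 8 Results of M2 and M3 using nouns and adjectives (NA)
Table 9 Results of M2 and M3 using verbs and adjectives (VA)
Table 10 Results of M2 and M3 using verbs (V)
Table 11 Results of M2 and M3 using nouns (N)
Table 12 Results of M2 and M3 using adjectives (A)
Table 13 Output generated by each method
Table 14 Part-of-speech counts in each experiment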
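List of Figures
Figure 1 Jieba word segmentation result
Figure 2 Jieba word segmentation result after adding a dictionary
Figure 3 Jieba part-of-speech tagging result
Figure 4 RNN structure
Figure 5 Unrolled RNN structure
Figure 6 Basic RNN structure
Figure 7 LSTM structure
Figure 8 LSTM analysis diagram
Figure 9 Basic GRU structure
Figure 10 GRU structure proposed in [15]
Figure 11 LSTM model; ABC is the input, WXYZ is the output, and each box represents an RNN
Figure 12 Multi-layer Seq2seq with LSTM and an attention mechanism
Figure 13 System architecture
Figure 14 Flowchart of the retrieval system
Figure 15 Flowchart of reply generation
Figure 16 Convergence speed, LSTM, Layer=1
Figure 17 Convergence speed, LSTM, Layer=2
Figure 18 Convergence speed, LSTM, Layer=3
Figure 19 Convergence speed, GRU, Layer=1
Figure 20 Convergence speed, GRU, Layer=2
Figure 21 Convergence speed, GRU, Layer=3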

References
[1] Isbell, C. L., et al. “Cobot in LambdaMOO: A social statistics agent.” AAAI/IAAI, 2000, pp. 36-41.
[2] Wang, H., et al. “A Dataset for Research on Short-Text Conversations.” EMNLP, 2013, pp. 935-945.
[3] Vinyals, O., and Le, Q. “A neural conversational model.” arXiv preprint arXiv:1506.05869, 2015.
[4] Shang, L., Lu, Z., and Li, H. “Neural responding machine for short-text conversation.” arXiv preprint arXiv:1503.02364, 2015.
[5] Serban, I. V., et al. “Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models.” AAAI, 2016, pp. 3776-3784.
[6] Li, J., et al. “A diversity-promoting objective function for neural conversation models.” arXiv preprint arXiv:1510.03055, 2015.
[7] Jieba. [online] Available at: https://github.com/fxsjy/jieba [Accessed 10 Jan. 2017].
[8] Understanding LSTM Networks. [online] Available at: http://colah.github.io/posts/2015-08-Understanding-LSTMs [Accessed 12 Mar. 2017].
[9] Williams, R. J., and Zipser, D. “Gradient-based learning algorithms for recurrent networks and their computational complexity.” Backpropagation: Theory, architectures, and applications, 1995, 1: 433-486.
[10] Werbos, P. J. “Generalization of backpropagation with application to a recurrent gas market model.” Neural Networks, 1988, 1.4: 339-356.
[11] Robinson, A. J., and Fallside, F. “The utility driven dynamic error propagation network.” Cambridge University Engineering Department, 1987.
[12] Hochreiter, S. “Untersuchungen zu dynamischen neuronalen Netzen.” Diploma thesis, Institut für Informatik, Lehrstuhl Prof. Brauer, Technische Universität München, 1991.
[13] Hochreiter, S., and Schmidhuber, J. “Long short-term memory.” Neural Computation, 1997, 9.8: 1735-1780.
[14] Bengio, Y., Simard, P., and Frasconi, P. “Learning long-term dependencies with gradient descent is difficult.” IEEE Transactions on Neural Networks, 1994, 5.2: 157-166.
[15] Cho, K., et al. “Learning phrase representations using RNN encoder-decoder for statistical machine translation.” arXiv preprint arXiv:1406.1078, 2014.
[16] Jozefowicz, R., Zaremba, W., and Sutskever, I. “An empirical exploration of recurrent network architectures.” Proceedings of the 32nd International Conference on Machine Learning (ICML-15), 2015, pp. 2342-2350.
[17] Sutskever, I., Vinyals, O., and Le, Q. V. “Sequence to sequence learning with neural networks.” Advances in Neural Information Processing Systems, 2014, pp. 3104-3112.
[18] Bahdanau, D., Cho, K., and Bengio, Y. “Neural machine translation by jointly learning to align and translate.” arXiv preprint arXiv:1409.0473, 2014.
[19] Wu, S. H., et al. “CYUT Short Text Conversation System for NTCIR-12 STC.” NTCIR, 2016.
[20] Sequence to sequence Models. [online] Available at: https://www.tensorflow.org/tutorials/seq2seq [Accessed 22 Feb. 2017].
[21] Apache Lucene. [online] Available at: http://lucene.apache.org [Accessed 17 Apr. 2017].
