National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)


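Author: 蔣宜衡 (Yi-Heng Chiang)
Title (Chinese): 應用序列到序列生成模型於雙向文本改寫之研究
Title (English): Using the Sequence to Sequence Generative Model for Bidirectional Text Rewriting
Advisor: 魏世杰
Oral defense committee: 蕭漢威, 鄭啟斌, 魏世杰
Oral defense date: 2018-06-02
Degree: Master's
University: Tamkang University (淡江大學)
Department: Department of Information Management, Master's Program (資訊管理學系碩士班)
Discipline: Computer Science (電算機學門)
Subdiscipline: General Computer Science (電算機一般學類)
Thesis type: Academic thesis
Year of publication: 2018
Graduation academic year: 106 (ROC calendar)
Language: Chinese
Keywords: Natural Language Processing, Neural Machine Translation, Natural Language Generation, Deep Learning, Text Rewriting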


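Abstract (translated from Chinese):
The ability to understand and master a language certainly varies from person to person, but it is also shaped by historical change. In particular, Classical Chinese (文言文), as the written language of the past, differs markedly from the Vernacular Chinese (白話文) that modern people use in daily life, so many people today have difficulty understanding Classical Chinese.
To bridge the gap in understanding between the Classical and Vernacular writing styles, this study takes bidirectional text rewriting between Classical and Vernacular Chinese as its topic. The corpus is processed with Natural Language Processing techniques, and a sequence-to-sequence (Seq2Seq) model is trained under a Deep Learning architecture to generate sentences in the corresponding writing style. In addition, this study trains two independent sets of word vectors, one for Classical Chinese and one for Vernacular Chinese, on monolingual corpora in order to extract the semantic relatedness among words within each writing style.
Starting from the correspondence between Classical and Vernacular Chinese, this study extracts word alignment relations from the parallel corpus of the two and uses them to implement a bidirectional Neural Machine Translation system. Finally, the BLEU (Bilingual Evaluation Understudy) metric is used to evaluate the sentences generated by the system. Results on the test set show that, in word-level BLEU scores, rewriting Vernacular into Classical Chinese performs better, whereas in character-level BLEU scores, rewriting Classical into Vernacular Chinese performs better. In both directions, the character-level BLEU scores are clearly higher than the word-level scores.
The bidirectional text rewriting approach adopted in this study thus offers a direction worth exploring for applying natural language technology to research on understanding the Vernacular and Classical Chinese writing styles.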

Abstract (English):
Although the ability to understand and master a language varies from person to person, it is also affected by the evolution of the language itself. In particular, Classical Chinese, as a written language of the past, differs markedly from the Vernacular Chinese used in modern society. As a consequence, many Chinese readers today find Classical Chinese texts hard to understand.
To bridge the gap in understanding between the two writing styles, this work takes bidirectional text rewriting between Classical and Vernacular Chinese as its topic. A parallel corpus is collected and processed with natural language processing techniques, and is then used to train a sequence-to-sequence (Seq2Seq) model under a deep learning architecture. The model can generate sentences in the desired writing style. In addition, this work uses two separate monolingual corpora to train two independent sets of word vectors, one for Classical Chinese and one for Vernacular Chinese, in order to capture the semantic relatedness between words within each writing style.
From the parallel corpus, this work tries to find the correspondences between Classical Chinese (CC) and Vernacular Chinese (VC). A neural machine translation model is applied to extract the relevant word alignments in the parallel corpus. Finally, the BLEU metric is used to evaluate the generated sentences. On the test dataset, the word-level model rewrites VC to CC better than CC to VC. In contrast, the character-level model rewrites CC to VC better than VC to CC. Overall, the character-level model performs better than the word-level model in both rewriting directions.
In this work, natural language technologies are applied to rewriting between the two Chinese writing styles, Vernacular Chinese and Classical Chinese. The bidirectional text rewriting method used here provides a promising direction for research on understanding these writing styles.
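
The abstract compares BLEU scores computed at the word level and at the character level. The following is a minimal sketch of how the tokenization level alone can change a BLEU score; it is not the evaluation code used in the thesis. It assumes a single reference, uniform weights over 1-grams to 4-grams, and no smoothing, and the two sentences and the function names `ngram_counts` and `bleu` are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Single-reference BLEU with uniform weights and no smoothing (illustrative only)."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        total = sum(cand.values())
        # Clipped counts: an n-gram is credited at most as often as it occurs in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, any zero precision gives a zero score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates that are not longer than the reference.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(log_precisions) / max_n)


# Hypothetical, pre-segmented Vernacular Chinese sentences (not from the thesis corpus).
reference_words = ["學習", "並且", "時常", "溫習"]
candidate_words = ["學習", "並且", "經常", "溫習"]

# Word level: each segmented word is one token.
word_score = bleu(candidate_words, reference_words)

# Character level: each Chinese character is one token.
char_score = bleu(list("".join(candidate_words)), list("".join(reference_words)))

print(round(word_score, 3), round(char_score, 3))  # 0.0 0.5
```

In this toy pair a single substituted word removes every matching word-level trigram, so the unsmoothed word-level score falls to zero, while most character n-grams still match. This shows only one mechanism by which character-level scores can come out higher; the thesis results depend on its own corpus, segmentation, and BLEU configuration.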

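Table of Contents
Chapter 1 Introduction
1.1 Research background
1.2 Research motivation
1.3 Research objectives
1.4 Thesis structure
Chapter 2 Literature Review
2.1 Natural language processing and generation
2.2 Traditional machine translation
2.2.1 Statistical machine translation
2.2.2 Rule-based machine translation
2.2.3 Other types of machine translation
2.3 Neural machine translation
2.4 Word embedding
Chapter 3 Methods
3.1 Problem definition
3.2 System architecture
3.3 Preprocessing
3.3.1 Data cleaning
3.3.2 Traditional/Simplified Chinese conversion
3.3.3 Word segmentation
3.4 Word vectors
3.4.1 One-hot vectors
3.4.2 Word2Vec
3.5 LSTM
3.6 Seq2Seq
3.7 BLEU
Chapter 4 Experimental Design and Results
4.1 Experimental environment
4.1.1 Packages used
4.1.2 Corpus sources
4.1.3 Text rewriting corpus
4.1.4 Word vector corpus
4.2 Experimental design
4.2.1 Text rewriting experiment
4.2.2 Word vector experiment
4.3 Experimental results
4.3.1 Text rewriting results
4.3.2 Word vector results
4.4 Discussion of text rewriting results
Chapter 5 Conclusion and Future Work
5.1 Conclusion
5.2 Research limitations
5.3 Future outlook
References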

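List of Tables
Table 1 Example sentences from the training corpus
Table 2 Example sentences showing the sequences at both ends of Seq2Seq during word-level training
Table 3 Execution environment
Table 4 Packages and the functions they implement
Table 5 Corpus list
Table 6 BLEU scores of word-level bidirectional neural machine translation on the training set
Table 7 BLEU scores of word-level bidirectional neural machine translation on the test set
Table 8 BLEU scores of character-level bidirectional neural machine translation on the training set
Table 9 BLEU scores of character-level bidirectional neural machine translation on the test set
Table 10 Example predicted sentences from the test set
Table 11 Example predicted sentences from the training set
Table 12 Ranked similar words from the Classical Chinese word vectors
Table 13 Ranked similar words from the Vernacular Chinese word vectors
Table 14 Cosine similarity of word pairs in the word vectors
Table 15 Word analogy relations in the word vectors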
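List of Figures
Figure 1 System architecture
Figure 2 Illustration of one-hot vectors
Figure 3 The two Word2Vec construction methods, CBOW and Skip-gram
Figure 4 Example sentences showing the sequences at both ends of Seq2Seq during word-level training
Figure 5 Training process of Seq2Seq Classical-to-Vernacular neural machine translation
Figure 6 Training process of Seq2Seq Vernacular-to-Classical neural machine translation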

