Author: 黃晧誠
Author (English): Hao-Cheng Huang
Title: 中文筆順預訓練效能之研究 (A Study on the Effectiveness of Stroke Order in Chinese Pre-training)
Advisor: 林熙禎
Degree: Master's
Institution: National Central University
Department: Department of Information Management
Discipline: Computer Science
Field: General Computer Science
Document type: Academic thesis
Year of publication: 2019
Graduation academic year: 107 (2018–19)
Language: Chinese
Pages: 73
Keywords (Chinese): pre-training; representation; natural language processing; Chinese; stroke order
Keywords (English): Pre-training; Representation; Natural language processing; Chinese; Stroke
Statistics:
  • Citations: 0
  • Views: 124
  • Downloads: 0
Pre-training is extremely important in natural language processing, yet recent transfer-learning research on Chinese NLP remains scarce, and most existing work relies on feature-based models with static embeddings. This study therefore proposes incorporating a deeper feature of Chinese, namely stroke order, into the input dimensions to learn sub-character features. Building on the recently proposed feature-based ELMo and fine-tuning-based BERT pre-training models, we examine the effect of stroke order on Chinese pre-training by introducing ELMo+S and BERT+S, which model stroke (筆順) features with a convolutional neural network. Evaluated on the downstream XNLI and LCQMC datasets, the results show that stroke features provide no clear benefit to either pre-training model.
Pre-training is extremely important in natural language processing. However, transfer-learning studies on Chinese are comparatively scarce, and most of them rely on feature-based methods and static embeddings. This study therefore proposes using a deeper Chinese feature, stroke order, integrated into the input dimensions to learn sub-character characteristics, building on the recently proposed feature-based ELMo and fine-tuning-based BERT pre-training models. We propose the ELMo+S and BERT+S models, which capture stroke features with a convolutional neural network. Finally, the results show that stroke features are not significantly helpful for these two pre-training models on the downstream XNLI and LCQMC datasets.
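The abstract describes modeling stroke features with a convolutional neural network over each character's stroke sequence. A minimal sketch of that idea in NumPy is shown below: embed stroke-type IDs, run a 1-D convolution over the stroke axis, and max-pool into a fixed-size character vector that could be concatenated with the ELMo/BERT input embeddings. The `STROKES` table, the five stroke-type categories, the dimensions, and the random weights are all illustrative assumptions, not the thesis's actual configuration.

```python
import numpy as np

# Hypothetical stroke-order table: each character maps to a sequence of
# stroke-type IDs (e.g. 1=horizontal, 2=vertical, 3=left-falling,
# 4=right-falling/dot, 5=turning), in the style of cw2vec's stroke encoding.
STROKES = {
    "中": [2, 5, 1, 2],
    "文": [4, 1, 3, 4],
}

rng = np.random.default_rng(0)
EMB_DIM, N_FILTERS, KERNEL = 8, 16, 3
stroke_emb = rng.normal(size=(6, EMB_DIM))           # rows 1..5 used, row 0 = padding
conv_w = rng.normal(size=(N_FILTERS, KERNEL, EMB_DIM))

def char_stroke_vector(ch, max_len=10):
    """CNN-over-strokes character vector: embed, convolve, max-pool."""
    ids = (STROKES[ch] + [0] * max_len)[:max_len]    # pad/truncate the stroke sequence
    x = stroke_emb[ids]                              # (max_len, EMB_DIM)
    # 1-D convolution along the stroke axis, one dot product per window position
    feats = np.stack([
        np.tanh(np.einsum("kld,ld->k", conv_w, x[i:i + KERNEL]))
        for i in range(max_len - KERNEL + 1)
    ])                                               # (positions, N_FILTERS)
    return feats.max(axis=0)                         # max-pool -> (N_FILTERS,)

v = char_stroke_vector("中")
print(v.shape)  # (16,)
```

The resulting per-character vector has a fixed size regardless of stroke count, which is what makes it straightforward to append as an extra input dimension to a pre-trained encoder.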
Abstract (Chinese) i
Abstract (English) ii
Acknowledgments iii
Table of Contents iv
List of Figures vii
List of Tables ix
1. Introduction 1
1-1 Research Background 1
1-2 Research Motivation 2
1-3 Research Objectives 4
1-4 Thesis Organization 5
2. Related Work 6
2-1 Feature Extraction Models 6
2-1-1 CNN 6
2-1-2 LSTM 10
2-1-3 Transformer 14
2-2 Pre-training 18
2-2-1 Feature-based Approaches 18
2-2-2 Fine-tuning Approaches 20
2-3 Chinese 23
2-3-1 Feature-based Approaches 23
2-3-2 Representations 24
2-4 Summary 25
3. Methodology 26
3-1 Research Framework 26
3-2 Data Preprocessing 27
3-2-1 Simplified-Traditional Conversion 27
3-2-2 Stroke Order 27
3-3 Pre-training Models 28
3-3-1 ELMo+S 28
3-3-2 BERT+S 30
3-4 Downstream Task Models 30
3-4-1 ELMo Downstream Model 30
3-4-2 BERT Downstream Model 32
3-5 Model Evaluation 33
4. Experiments and Results 34
4-1 Preprocessing and Datasets 34
4-1-1 Stroke-Order Mapping Table 34
4-1-2 Vocabulary 35
4-1-3 External Pre-training Corpora 35
4-1-4 Downstream Task Datasets 36
4-1-5 Stroke-Sequence Lengths of Each Dataset 38
4-2 Experimental Environment 43
4-3 Experimental Design and Results 44
4-3-1 Experiment 1: Effect of Simplified vs. Traditional Script and Stroke-Sequence Length 44
4-3-2 Experiment 2: Effect of the CNN and Kernel Size 47
4-3-3 Experiment 3: Effect of Highway Networks 48
4-3-4 Experiment 4: Effect of Stroke Order on Pre-training Models 49
5. Conclusions and Future Work 52
5-1 Conclusions 52
5-2 Research Limitations 53
5-3 Future Research Directions 54
References 55
English References 55
Ba, J. L., Kiros, J. R., & Hinton, G. E. (2016). Layer Normalization. ArXiv:1607.06450 [Cs, Stat]. Retrieved from http://arxiv.org/abs/1607.06450
Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2017). Enriching Word Vectors with Subword Information. Transactions of the Association for Computational Linguistics, 5, 135–146. https://doi.org/10.1162/tacl_a_00051
Bonaccorso, G., Fandango, A., & Shanmugamani, R. (2018). Python advanced guide to artificial intelligence: Expert machine learning systems and intelligent agents using Python.
Botha, J., & Blunsom, P. (2014). Compositional morphology for word representations and language modelling. International Conference on Machine Learning, 1899–1907.
Cer, D., Yang, Y., Kong, S., Hua, N., Limtiaco, N., John, R. S., … Kurzweil, R. (2018). Universal Sentence Encoder. ArXiv:1803.11175 [Cs]. Retrieved from http://arxiv.org/abs/1803.11175
Chen, X., Xu, L., Liu, Z., Sun, M., & Luan, H. (2015). Joint Learning of Character and Word Embeddings. Twenty-Fourth International Joint Conference on Artificial Intelligence, 1236–1242. IJCAI.
Collobert, R., & Weston, J. (2008). A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning. Proceedings of the 25th International Conference on Machine Learning. ACM, 8.
Conneau, A., & Kiela, D. (2018). SentEval: An Evaluation Toolkit for Universal Sentence Representations. ArXiv:1803.05449 [Cs]. Retrieved from http://arxiv.org/abs/1803.05449
Conneau, A., Lample, G., Rinott, R., Williams, A., Bowman, S. R., Schwenk, H., & Stoyanov, V. (2018). XNLI: Evaluating Cross-lingual Sentence Representations. ArXiv:1809.05053 [Cs]. Retrieved from http://arxiv.org/abs/1809.05053
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. ArXiv:1810.04805 [Cs]. Retrieved from http://arxiv.org/abs/1810.04805
Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14(2), 179–211.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770–778.
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.
Howard, J., & Ruder, S. (2018). Universal Language Model Fine-tuning for Text Classification. ArXiv:1801.06146 [Cs, Stat]. Retrieved from http://arxiv.org/abs/1801.06146
Kim, Y. (2014). Convolutional Neural Networks for Sentence Classification. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1746–1751. https://doi.org/10.3115/v1/D14-1181
Kiros, R., Zhu, Y., Salakhutdinov, R., Zemel, R. S., Torralba, A., Urtasun, R., & Fidler, S. (2015). Skip-Thought Vectors. ArXiv:1506.06726 [Cs]. Retrieved from http://arxiv.org/abs/1506.06726
LeCun, Y. (1989). Generalization and network design strategies. In Connectionism in perspective (Vol. 19). Citeseer.
Li, Y., Li, W., Sun, F., & Li, S. (2015). Component-Enhanced Chinese Character Embeddings. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 829–834. https://doi.org/10.18653/v1/D15-1098
Liu, X., Chen, Q., Deng, C., Zeng, H., Chen, J., Li, D., & Tang, B. (2018). LCQMC: A Large-scale Chinese Question Matching Corpus. Proceedings of the 27th International Conference on Computational Linguistics, 1952–1962. Retrieved from http://www.aclweb.org/anthology/C18-1166
Luong, T., Socher, R., & Manning, C. (2013). Better word representations with recursive neural networks for morphology. Proceedings of the Seventeenth Conference on Computational Natural Language Learning, 104–113.
Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. ArXiv Preprint ArXiv:1301.3781.
Pennington, J., Socher, R., & Manning, C. (2014). GloVe: Global Vectors for Word Representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1532–1543. Retrieved from http://www.aclweb.org/anthology/D14-1162
Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep contextualized word representations. ArXiv:1802.05365 [Cs]. Retrieved from http://arxiv.org/abs/1802.05365
Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving Language Understanding by Generative Pre-Training. 12.
Rücklé, A., Eger, S., Peyrard, M., & Gurevych, I. (2018). Concatenated Power Mean Word Embeddings as Universal Cross-Lingual Sentence Representations. ArXiv:1803.01400 [Cs]. Retrieved from http://arxiv.org/abs/1803.01400
Cao, S., Lu, W., Zhou, J., & Li, X. (2018). cw2vec: Learning Chinese Word Embeddings with Stroke n-gram Information. Thirty-Second AAAI Conference on Artificial Intelligence.
Srivastava, R. K., Greff, K., & Schmidhuber, J. (2015). Highway Networks. ArXiv:1505.00387 [Cs]. Retrieved from http://arxiv.org/abs/1505.00387
Su, T., & Lee, H. (2017). Learning Chinese Word Representations From Glyphs Of Characters. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 264–273. https://doi.org/10.18653/v1/D17-1025
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … Polosukhin, I. (2017). Attention is All you Need. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, & R. Garnett (Eds.), Advances in Neural Information Processing Systems 30 (pp. 5998–6008). Retrieved from http://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf
Williams, A., Nangia, N., & Bowman, S. (2018). A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 1112–1122. https://doi.org/10.18653/v1/N18-1101
Wu, W., Meng, Y., Han, Q., Li, M., Li, X., Mei, J., … Li, J. (2019). Glyce: Glyph-vectors for Chinese Character Representations. ArXiv:1901.10125 [Cs]. Retrieved from http://arxiv.org/abs/1901.10125
Yin, R., Wang, Q., Li, P., Li, R., & Wang, B. (2016). Multi-Granularity Chinese Word Embedding. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 981–986. Retrieved from https://aclweb.org/anthology/D16-1100
Yu, J., Jian, X., Xin, H., & Song, Y. (2017). Joint Embeddings of Chinese Words, Characters, and Fine-grained Subcharacter Components. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 286–291. Retrieved from https://www.aclweb.org/anthology/D17-1027
Yu, S., Kulkarni, N., Lee, H., & Kim, J. (2017). Syllable-level neural language model for agglutinative language. ArXiv Preprint ArXiv:1708.05515.
Zhuang, H., Wang, C., Li, C., Li, Y., Wang, Q., & Zhou, X. (2018). Chinese Language Processing Based on Stroke Representation and Multidimensional Representation. IEEE Access, 6, 41928–41941. https://doi.org/10.1109/ACCESS.2018.2860058