
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)


Detailed Record

Author: 張瀞婷
Author (English): Ching-Ting Chang
Title (Chinese): 以生成對抗網路自動產生中英文語碼轉換文句
Title (English): Chinese-English Code-switching Sentence Generation by Generative Adversarial Networks
Advisor: 李琳山 (Lin-Shan Lee)
Committee members: 李宏毅, 蘇炫榮
Oral defense date: 2019-01-28
Degree: Master's
Institution: National Taiwan University (國立臺灣大學)
Department: Graduate Institute of Communication Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2019
Academic year of graduation: 107 (2018-2019)
Language: Chinese
Pages: 74
Keywords (Chinese): code-switching (語碼轉換), text generation (文本生成), data augmentation (資料增強), language model (語言模型), generative adversarial networks (生成對抗網路)
DOI: 10.6342/NTU201900420
Abstract

Code-switching refers to the alternating use of two or more languages within a single utterance or passage of text; for example, a Mandarin-English bilingual might say 「我們等一下有個 meeting」, switching to English for a single word. Different speakers, conversation topics, and language pairs can each exhibit different code-switching styles and characteristics. Although code-switching occurs frequently in natural language, code-switched corpora are scarce compared with monolingual ones.
The goal of this thesis is to develop an unsupervised technique for automatically generating code-switched text, validated experimentally on two code-switching datasets with Chinese as the matrix language and English as the embedded language. The proposed method uses a generative adversarial network trained with a policy gradient algorithm to predict suitable switching points in monolingual (matrix-language) sentences; the words at those positions are translated into the embedded language to produce intra-sentential code-switched sentences, which then serve as augmented training data for a language model. Results show that the proposed approach modestly improves the language model and slightly reduces the embedded-language error rate of a speech recognition system.
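Below is a minimal illustrative sketch of the mechanism the abstract describes, written in Python with PyTorch. It is not the thesis's released implementation: the model sizes, the word-level translate lookup, and all names are hypothetical, and real training would also pretrain the generator and alternate discriminator updates. The sketch shows only the core loop implied above: a tagger samples a switch/keep decision per token, translating the switched words yields a mixed sentence, and the discriminator's score serves as the reward for a REINFORCE-style policy-gradient update.

import torch
import torch.nn as nn

class SwitchPointGenerator(nn.Module):
    # BiLSTM tagger: P(switch | token, context) for every position.
    def __init__(self, vocab_size, emb_dim=64, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, 1)

    def forward(self, tokens):                          # tokens: (B, T) int64
        h, _ = self.rnn(self.emb(tokens))               # (B, T, 2*hidden)
        return torch.sigmoid(self.out(h)).squeeze(-1)   # (B, T) switch probabilities

class Discriminator(nn.Module):
    # LSTM classifier: real code-switched sentence vs. generated one.
    def __init__(self, vocab_size, emb_dim=64, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, tokens):
        _, (h, _) = self.rnn(self.emb(tokens))
        return torch.sigmoid(self.out(h[-1])).squeeze(-1)  # (B,) realness score

def reinforce_step(gen, disc, zh_tokens, translate, gen_opt):
    # One policy-gradient update for the generator.
    probs = gen(zh_tokens)                         # (B, T)
    actions = torch.bernoulli(probs.detach())      # sample switch (1) / keep (0)
    mixed = translate(zh_tokens, actions)          # hypothetical word-level ZH->EN lookup
    with torch.no_grad():
        reward = disc(mixed)                       # discriminator score as reward
    # Log-probability of the sampled decision sequence.
    logp = (actions * torch.log(probs + 1e-8)
            + (1 - actions) * torch.log(1 - probs + 1e-8)).sum(dim=1)
    loss = -(reward * logp).mean()                 # REINFORCE objective
    gen_opt.zero_grad(); loss.backward(); gen_opt.step()
    return loss.item()

In the full system the abstract describes, the discriminator would alternately be trained to distinguish real code-switched sentences from generated ones, and the generated sentences would then be added to the language model's training data.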
Table of Contents

Acknowledgements ... i
Chinese Abstract ... iii
Chapter 1: Introduction ... 1
  1.1 Motivation ... 1
  1.2 Related Work ... 2
  1.3 Approach ... 4
  1.4 Thesis Organization ... 6
Chapter 2: Background ... 7
  2.1 Code-switching (CS) ... 7
    2.1.1 Origin and Definition ... 7
    2.1.2 Data Characteristics ... 8
  2.2 Machine Learning (ML) ... 9
    2.2.1 Neural Networks (NN) ... 9
    2.2.2 Recurrent Neural Networks (RNN) ... 13
    2.2.3 Generative Adversarial Networks (GAN) ... 16
    2.2.4 Policy Gradient Algorithms ... 17
  2.3 Language Models and Speech Recognition ... 19
    2.3.1 Automatic Speech Recognition (ASR) ... 19
    2.3.2 Language Model (LM) ... 20
    2.3.3 Acoustic Model (AM) ... 23
    2.3.4 Phone Set ... 24
    2.3.5 Lexicon ... 24
    2.3.6 Decoding ... 25
  2.4 Chapter Summary ... 25
Chapter 3: Predicting Chinese-English Code-switching Points ... 26
  3.1 Overview ... 26
  3.2 Corpora ... 26
  3.3 Models ... 29
  3.4 Evaluation Methods ... 35
    3.4.1 Information-Retrieval Metrics ... 36
    3.4.2 BLEU Score ... 37
    3.4.3 Word Error Rate (WER) ... 38
  3.5 Experimental Results and Analysis ... 39
    3.5.1 Baselines ... 39
    3.5.2 Results ... 39
    3.5.3 Example Outputs ... 42
    3.5.4 Analysis ... 43
  3.6 Chapter Summary ... 45
Chapter 4: Applications of Chinese-English Code-switched Text Generation ... 46
  4.1 Overview ... 46
  4.2 Corpora ... 46
  4.3 Quality of Generated Text, Measured by Perplexity ... 46
    4.3.1 Models ... 47
    4.3.2 Evaluation Methods ... 48
    4.3.3 Results and Analysis ... 49
  4.4 Perplexity of Language Models Trained on Generated Text ... 50
    4.4.1 Models ... 51
    4.4.2 Evaluation Methods ... 51
    4.4.3 Results and Analysis ... 51
  4.5 Applying Language Models Trained on Generated Text to Speech Recognition ... 53
    4.5.1 Models ... 53
    4.5.2 Evaluation Methods ... 55
    4.5.3 Results and Analysis ... 56
  4.6 Examples of Generated Sentences ... 58
  4.7 Chapter Summary ... 60
Chapter 5: Conclusion and Future Work ... 61
  5.1 Concluding Discussion ... 61
  5.2 Future Work ... 62
References ... 64