National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)


Detailed record

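Graduate student: 韓天賜
Graduate student (English name): Matthew Harris
Thesis title (Chinese): 應用語彙語法範本與相對位置模型於英中語言遷移修正之研究
Thesis title (English): English-Chinese Language Transfer Correction Incorporating Lexico-Syntactic Templates and Relative Position Modelling
Advisor: 吳宗憲
Advisor (English name): Chung-Hsien Wu
Degree: Master's
Institution: National Cheng Kung University (國立成功大學)
Department: Department of Computer Science and Information Engineering, master's and doctoral program (資訊工程學系碩博士班)
Discipline: Engineering
Field: Electrical engineering and computer science
Thesis type: Academic thesis
Year of publication: 2008
Academic year of graduation: 96 (ROC calendar)
Language: English
Number of pages: 97
Keywords (translated from Chinese): second-language learning; language model; relative position model; lexico-syntactic templates; language transfer correction
Keywords (English): Language Modelling; Second-language learners; Computer-Assisted Language Learning; Language Transfer Correction; Lexico-Syntactic Templates; Relative Position Modelling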
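Abstract (translated from the Chinese):
Language transfer is the phenomenon in which second-language learners, influenced by their native language, produce utterances that deviate from the target language. It leads to various types of errors in learners' sentences, such as incorrect word order, wrong word choice, redundant words, or the omission of necessary words.
Most recent research has concentrated on correcting errors made in English as a second language. Even though Chinese is becoming a popular foreign language, errors made by learners of Chinese have received comparatively little attention.
This study discusses the problem of English-Chinese language transfer and proposes techniques for correcting these errors. The correction techniques use two new language modelling techniques, a relative position language model and a lexico-syntactic template model, to handle the limitations that traditional models face when applied to error correction.
Evaluation of the proposed correction procedure shows that taking language transfer as a premise can be an effective way to correct erroneous sentences produced by learners of Chinese. The results also show that the two proposed models are better than existing methods at selecting the best candidate sentence produced by the correction procedure.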
Language transfer is the phenomenon by which utterances produced by second-language learners are subject to the influence of their first language. This can result in errors being introduced into a sentence, such as incorrect word order, wrong lexical choice, the inclusion of redundant words, or the omission of necessary words.
Much recent work has focussed on the correction of errors in English as a Second Language. However, despite the growth of Mandarin Chinese as a popular foreign language, there has been relatively little research on errors made by learners of Chinese.
This study presents a discussion of the problem of English-Chinese language transfer and proposes techniques for the automatic correction of such errors. Two new language modelling techniques, a Relative Position Language Model and a Lexico-Syntactic Model, are introduced to help overcome some of the limitations of traditional models.
Evaluations of the proposed correction procedure show that using the premise of language transfer can be an effective way to correct erroneous sentences produced by learners of Chinese. It is also demonstrated that the two proposed models can outperform existing approaches in their ability to find optimal candidates produced by the correction procedure.
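The idea of picking the best candidate by a fluency measure can be illustrated with a minimal sketch. This record does not define the Relative Position Language Model or the Lexico-Syntactic Model, so the sketch below uses a plain add-one-smoothed bigram model as a stand-in baseline. The toy corpus, the learner sentence, and every function name are assumptions made for illustration only and are not taken from the thesis.

```python
import math
from collections import Counter
from itertools import permutations

def train_bigram(sentences):
    """Count context words and bigrams over pre-segmented sentences."""
    contexts, bigrams = Counter(), Counter()
    for words in sentences:
        padded = ["<s>"] + list(words) + ["</s>"]
        contexts.update(padded[:-1])             # every token that precedes another
        bigrams.update(zip(padded, padded[1:]))
    vocab = {w for s in sentences for w in s} | {"</s>"}
    return contexts, bigrams, len(vocab)

def log_prob(words, model):
    """Add-one-smoothed bigram log-probability, used here as a fluency score."""
    contexts, bigrams, vocab_size = model
    padded = ["<s>"] + list(words) + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(padded, padded[1:])
    )

def rank_candidates(candidates, model):
    """Sort candidate sentences from most to least fluent."""
    return sorted(candidates, key=lambda c: log_prob(c, model), reverse=True)

# Hypothetical toy corpus of segmented Chinese sentences (invented for this sketch).
corpus = [
    ["我", "昨天", "去", "了", "學校"],
    ["他", "昨天", "去", "了", "商店"],
    ["我", "今天", "去", "學校"],
]
model = train_bigram(corpus)

# Hypothetical learner sentence following English word order
# ("I went to school yesterday").
learner = ["我", "去", "了", "學校", "昨天"]

# Word-order candidates: every reordering of the learner's words.
candidates = [list(p) for p in permutations(learner)]
best = rank_candidates(candidates, model)[0]
print(" ".join(best))
```

Enumerating all permutations is only workable for very short sentences. The table of contents lists a candidate pruning step, which suggests the thesis does not rely on exhaustive enumeration, but its actual procedure is not described in this record.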
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Background
1.1.1 Second-Language Acquisition
1.1.2 Natural Language Processing
1.1.3 The Importance of Accuracy
1.2 Research Motivation and Goal
1.2.1 Chinese Error Detection
1.2.2 Language Transfer
1.2.3 Specific Aims
1.3 Related Works
1.3.1 Error Correction for English-Chinese Language Transfer
1.3.2 Automatic Correction for ESL Errors
1.3.3 Error Correction using Machine Translation
1.4 Methodology
1.4.1 General Approach
1.4.2 System Overview
Chapter 2 Language Transfer Corpora
2.1 Error Corpus Generation
2.1.1 Overview
2.1.2 Source Corpus for Error Sentence Generation
2.1.3 The Simulation Process
2.2 Real Data
2.3 Corpora Annotation
Chapter 3 Sentence Fluency Measures
3.1 Sentence Fluency
3.2 N-Gram Language Model
3.3 Relative Position Language Model (RPLM)
3.3.1 Rationale
3.3.2 Definition
3.3.3 Discussion
3.4 Probabilistic Context-Free Grammar
3.4.1 Limitations of Linear Language Models
3.4.2 Context-Free Grammars
3.4.3 Probabilistic Context-Free Grammars
3.5 Lexico-Syntactic (Parse Template) Model
3.5.1 Rationale
3.5.2 Definition
3.5.3 Discussion
Chapter 4 Error Detection
4.1 Rationale
4.2 Approach
4.3 Implementation
4.3.1 Support Vector Machines
4.3.2 Features
4.3.3 Training
4.3.4 Testing
4.3.5 Evaluation Result
Chapter 5 Error Correction
5.1 Overview
5.1.1 Sentence Classification
5.1.2 Interplay between Error Types
5.1.3 Assumptions
5.1.4 Basic Approach
5.2 Tackling Word Order Errors
5.2.1 A Note on Incidence
5.2.2 Approach
5.3 Candidates and Culprits
5.3.1 Locating Lexical Choice and Redundancy Error Culprits
5.3.2 Correcting Lexical Choice and Redundancy Error Culprits
5.3.3 Detecting and Correcting Omission Errors
5.4 General Correction Procedure
5.4.1 Algorithms
5.4.2 Pruning Candidates
5.5 Ranking Candidates
5.5.1 Combined Fluency Measure (CFM)
Chapter 6 Evaluation
6.1 Overview
6.2 Setup
6.2.1 Model Training
6.2.2 Test Data
6.2.3 Sentence Correction Evaluation Metric
6.3 Model Evaluation for Word Order Problem
6.3.1 Overview
6.3.2 Real Data
6.3.3 Simulated Data
6.4 Evaluation on Lexical Reselection
6.5 Evaluation on Identifying Redundant Words
6.6 Evaluation on Identifying Insertion Candidates
6.7 Evaluation on General Correction Procedure
6.7.1 Real Data
6.7.2 Simulated Data
6.7.3 Combined Fluency Measure
6.7.4 Leave-One-Out Evaluation for Each Error Correction Technique
6.7.5 Machine Translation
6.8 Summary
Chapter 7 Conclusion and Future Work
7.1 Conclusion
7.2 Future Work
Bibliography
Appendix A Error Corpus Generation
A.1 Generation Procedure
A.2.1 Giza++ Training
A.2.2 Direct Lookup
A.2.3 Cross-Lookup
A.2.4 Sentence Alignment
A.2.5 Lexical Variation
A.2.6 Evaluation
A.3 Error Corpus Filtering
Appendix B Stopword List