
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)


Detailed record

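Author: 粘子奕
Author (English): Nien, Tzu-yi
Thesis title: 透過階層式翻譯分類擴充雙語WordNet
Thesis title (English): Extending Bilingual WordNet via Hierarchical Word Translation Classification
Advisors: 張俊盛; 張智星
Advisors (English): Chang, Jason S.; Jang, Jyh-Shing Roger
Degree: Master's
Institution: National Tsing Hua University
Department: Institute of Information Systems and Applications
Discipline: Computer Science
Subfield: Systems Design
Thesis type: Academic thesis
Year of publication: 2009
Academic year of graduation: 97 (ROC calendar)
Language: English
Number of pages: 55
Keywords (Chinese): 翻譯詞意選擇; 字詞歧異辨識; 雙語WordNet; 最大熵值模型
Keywords (English): word translation classification; word sense disambiguation; bilingual WordNet; maximum entropy model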
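This thesis describes an automatic classification method that selects appropriate word senses for word-translation pairs in bilingual resources (e.g., bilingual dictionaries), thereby extending the lexical coverage of an existing bilingual WordNet. Given a word and its translation, the method automatically traverses the hyponym hierarchy in WordNet from general to specific, reducing sense ambiguity by choosing an appropriate hyponym category at each step. We build a classification model for every node in the hyponym hierarchy at which senses may diverge. The models are trained on an existing bilingual WordNet so that each one learns the features shared by the translations of its hyponyms; at run time, the classifier can then select the more suitable hyponym node by matching features. In addition, we build a filtering model that removes less likely senses, improving the system's speed and precision. Experimental results show that the system effectively selects the correct WordNet sense for a given word-translation pair. The classification results can serve as training data for retraining the classification models, or they can be combined with a machine translation system so that it generates translations that reflect the intended meaning more precisely.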
We introduce a method for learning to assign word senses to bilingual translation pairs. In our approach, this problem is transformed into the problem of navigating through a sense network (e.g., WordNet), with the aim of relating the features of translations to the sense nodes in the network. The method involves automatically constructing a classification model for each branching node in the sense network and learning to reject less probable sense categories for the translations based on the translation characteristics of semantically related word groups (e.g., words in a lexical category). At run time, the given translations are expanded with their synonyms, and the sense ambiguity is resolved according to the trained classification models. Evaluation shows that the method significantly outperforms the strong baseline of assigning the most frequent sense to the translation pairs. Our method effectively determines adequate word senses for given word-translation pairs, suggesting that it could be used as a computer-assisted tool for lexicography or to assist machine translation systems in word selection.
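The general-to-specific navigation described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy hierarchy `HYPONYMS`, the training translations in `TRAINING`, the character-overlap score, and the `min_prob` threshold are hypothetical stand-ins, not the thesis's data, features, or trained models (the keywords indicate that the thesis uses maximum entropy models). Synonym expansion of the input translation is omitted.

```python
from collections import Counter

# Toy hyponym hierarchy (hypothetical; not real WordNet data).
HYPONYMS = {
    "entity": ["plant", "animal"],
    "plant": ["tree", "flower"],
    "animal": ["bird", "fish"],
}

# Hypothetical training translations attached to leaf senses.
TRAINING = {
    "tree": ["松樹", "橡樹"],
    "flower": ["玫瑰花", "菊花"],
    "bird": ["麻雀", "鴿子"],
    "fish": ["鯉魚", "鮭魚"],
}


def leaves_under(node):
    """Return all leaf senses below (or equal to) a node."""
    children = HYPONYMS.get(node)
    if not children:
        return [node]
    return [leaf for child in children for leaf in leaves_under(child)]


def char_profile(node):
    """Count the characters of every training translation under a node."""
    profile = Counter()
    for leaf in leaves_under(node):
        for translation in TRAINING.get(leaf, []):
            profile.update(translation)
    return profile


def child_scores(node, translation):
    """Score each child by add-one-smoothed character overlap, normalized to sum to 1.

    This is a simple stand-in for a trained per-node classifier.
    """
    raw = {}
    for child in HYPONYMS[node]:
        profile = char_profile(child)
        raw[child] = 1 + sum(profile[ch] for ch in translation)
    total = sum(raw.values())
    return {child: score / total for child, score in raw.items()}


def classify(translation, root="entity", min_prob=0.2):
    """Walk from general to specific and return the path of chosen nodes.

    Children scoring below min_prob are filtered out; if none survive,
    the walk stops at the current node.
    """
    path = [root]
    node = root
    while node in HYPONYMS:
        scores = child_scores(node, translation)
        kept = {c: p for c, p in scores.items() if p >= min_prob}
        if not kept:
            break
        node = max(kept, key=kept.get)
        path.append(node)
    return path


print(classify("柳樹"))
print(classify("金魚"))
```

Under these toy assumptions, a translation that shares characters with the translations under one branch is routed down that branch, and raising `min_prob` makes the walk stop early when no child is sufficiently probable. A real system would replace `child_scores` with one trained model per branching node.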
Abstract (Chinese)
Abstract
Acknowledgement
Table of Contents
List of Figures
List of Tables
CHAPTER 1 Introduction
CHAPTER 2 Related Work
CHAPTER 3 Hierarchical Word Translation Classification
3.1 Problem Statement
3.2 Learning to Classify Translations
3.2.1 Propagating Translations
3.2.2 Training Hierarchical Word Translation Classification Models
3.2.3 Training Filtering Model
3.3 Run-Time Translation Classification
CHAPTER 4 Experimental Setting
4.1 Data Set
4.2 Methods Compared
4.3 Evaluation Metrics
4.4 Tuning Parameters
CHAPTER 5 Evaluation Results and Discussion
5.1 Experimental Results
5.2 Error Analysis
CHAPTER 6 Future Work and Summary
References
Appendix A - WordNet Lexicographer File Names
Appendix B - Evaluation Data
Appendix C - Expandable Translation Pairs in Dev. Set