National Digital Library of Theses and Dissertations in Taiwan

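Author: 張嘉銘
Author (English): Jia-Ming Chang
Thesis title: 片語翻譯模型為本之雙語名詞片語擷取
Thesis title (English): Bilingual Noun Phrase Extraction With Phrase-Based Translation Model
Advisor: 張俊盛
Advisor (English): Jason S. Chang
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2006
Graduation academic year: 94 (ROC academic year)
Language: English
Pages: 58
Keywords (Chinese): 名詞片語; 統計式機器翻譯; 平行語料庫
Keywords (English): noun phrase; statistical machine translation; parallel corpus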
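In this thesis, we propose a new method for extracting noun phrase translations from a parallel corpus. The method first uses a noun phrase chunker to extract all candidate noun phrases from the source sentence. For each noun phrase, an existing word alignment tool locates its partial translation in the target sentence. Taking the partial translation as a pivot, we then generate the candidate translations that contain it. Finally, a phrase translation model selects the most probable candidate. The model comprises two probabilities: the lexical translation probability, which measures how strongly two words are associated, and the fertility probability, which gives the probability of the number of words a source word translates into. In the training stage, these two sets of parameters are estimated with the EM algorithm and a probabilistic dictionary, respectively. We implemented the method and, using 740,000 sentences of Hong Kong news as the corpus, compared it with IBM Model 4 on noun phrase extraction. Our method achieved 70% precision and 61% recall. The experiments show that it outperforms IBM Model 4, and indicate that the proposed method can improve the efficiency and quality of noun phrase translation extraction and of noun phrase translation in machine translation.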
We propose a new method for automatically extracting noun phrase correspondences from a sentence-aligned bilingual corpus. In our approach, noun phrases extracted from each source-language sentence are aligned to phrases in the corresponding target-language sentence using a phrase translation model and the maximum translation probability. Training involves generating word-level alignments with an existing word alignment technique as the basis for noun phrase alignment, estimating the Lexical Translation Probability (LTP) for noun phrases with the EM algorithm, and estimating the Fertility Probability (FP) from a Most Frequency Translation Equivalent (MFTE). At runtime, for each noun phrase in the source sentence, its partial translation in the target sentence is located. Each n-gram containing the partial translation is then scored with the phrase translation probability, and the n-gram with the maximum probability is chosen as the output. We describe an implementation of the method using a bilingual Hong Kong news corpus. The experimental results show that our model outperforms IBM Model 4 in the precision of noun phrase extraction. The method clearly improves the performance of noun phrase translation, which has been shown to be crucial for statistical machine translation.
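The runtime step described above (generate the n-grams that contain the partial translation, score each one, keep the best) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's actual model: the scoring formula (a product of per-word lexical probabilities multiplied by a length-based fertility term), the function names, the smoothing floor, and all probability values are hypothetical.

```python
from math import prod

def candidates_containing_pivot(target, pivot, max_len):
    """Return (start, end) spans of `target` that include index `pivot`
    and contain at most `max_len` tokens (end is exclusive)."""
    spans = []
    for start in range(max(0, pivot - max_len + 1), pivot + 1):
        for end in range(pivot + 1, min(len(target), start + max_len) + 1):
            spans.append((start, end))
    return spans

def phrase_score(source_np, cand, ltp, fp, floor=1e-6):
    """Hypothetical phrase translation score (an assumption, not the thesis formula):
    each target word takes its best lexical probability over the source words,
    and the product is weighted by a fertility term keyed on phrase lengths."""
    lexical = prod(max(ltp.get((s, t), floor) for s in source_np) for t in cand)
    fertility = fp.get((len(source_np), len(cand)), floor)
    return lexical * fertility

def extract_translation(source_np, target, pivot, ltp, fp, max_len=4):
    """Pick the candidate n-gram around `pivot` with the highest score."""
    spans = candidates_containing_pivot(target, pivot, max_len)
    start, end = max(
        spans, key=lambda se: phrase_score(source_np, target[se[0]:se[1]], ltp, fp)
    )
    return target[start:end]

# Toy data; every number below is made up for illustration.
source_np = ["economic", "growth"]
target = ["香港", "經濟", "增長", "放緩"]
pivot = 2  # index of the partial translation "增長" found by word alignment
ltp = {("economic", "經濟"): 0.8, ("growth", "增長"): 0.7}
fp = {(2, 1): 0.2, (2, 2): 0.6, (2, 3): 0.15, (2, 4): 0.05}

result = extract_translation(source_np, target, pivot, ltp, fp)
print(result)  # ['經濟', '增長']
```

In this toy run the single-word candidate scores 0.7 × 0.2 = 0.14, while the two-word candidate scores 0.8 × 0.7 × 0.6 = 0.336, so the fertility term is what keeps the model from always preferring the shortest span.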
Abstract (Chinese)
ABSTRACT
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
Chapter 2 Related Work
Chapter 3 Phrase Translation Model
3.1 Problem Statement
3.2 Phrase-Based Translation Model
3.3 Training the Phrase Translation Model
3.3.1 Data Handling for Training Data
3.3.2 Estimate Lexical Translation Probability
3.3.3 Estimate Fertility Probability according to a MFTE
3.3.4 Estimate Null Probability
3.4 Runtime Noun Phrase Correspondence Extraction
Chapter 4 Experiments and Analysis
4.1 Training the Phrase Translation Model
4.2 Test data and Evaluation
4.2.1 Evaluation for locating pivot
4.2.2 Evaluation for extracting noun phrase correspondence
Chapter 5 Future Work and Conclusion
References
Appendix A - Test Set for Phrase Correspondence Extraction
Brown, Peter F.; Cocke, John; Della Pietra, Stephen A.; Della Pietra, Vincent J.; Jelinek, Frederick; Lafferty, John D.; Mercer, Robert L. and Roossin, Paul S.: 1990, `A statistical approach to machine translation`, in Computational Linguistics, volume 16(2): 79–85.

Yunbo Cao and Hang Li: 2002, `Base Noun Phrase Translation Using Web Data and the EM Algorithm`, in Proceedings of COLING 2002, pp. 127-133.

Catizone, R., G. Russell, and S. Warwick: 1989, `Deriving translation data from bilingual texts`, in Proceedings of the First International Lexical Acquisition Workshop, Detroit, USA.

David Chiang: 2005, `A Hierarchical Phrase-Based Model for Statistical Machine Translation`, in Proceedings of ACL-2005, pp. 263–270.

A. P. Dempster, N. M. Laird, and D. B. Rubin: 1977, `Maximum likelihood from incomplete data via the EM algorithm`, Journal of the Royal Statistical Society. Series B (Methodological), 39(1):1-38.

Dong-Hui Feng, Ya-Juan Lv, Ming Zhou: 2004, `A New Approach for English-Chinese Named Entity Alignment`, in Proceedings of the Conference on EMNLP.

Gale, W., and K. Church: 1991, `Identifying word correspondences in parallel texts`, in Proceedings of the Speech and Natural Language Workshop, pp. 152–157.

W. John Hutchins: 1995, `Machine translation: A brief history`, in E.F.K. Koerner and R.E. Asher, editors, Concise history of the language sciences: from the Sumerians to the cognitivists, pp. 431–445. Pergamon Press, Oxford.

Fei Huang, Stephan Vogel and A. Waibel: 2003, `Automatic Extraction of Named Entity Translingual Equivalence Based on Multi-feature Cost Minimization`, in Proceedings of ACL2003 Workshop, pp. 9-16.

Kenji Imamura: 2002, `Application of translation knowledge acquired by hierarchical phrase alignment for pattern-based MT`, in Proceedings of TMI-2002, pp. 74–84.

Jian, J.-Y., Chang, Y.-C., and Chang, J.-S.: 2004, `Collocational Translation Memory Extraction Based on Statistical and Linguistic Information`, in ROCLING XV (ROCLING 2004), Taipei, Taiwan.

Hiroyuki Kaji, Y. Kida, and Y. Morimoto: 1992, `Learning Translation Templates from Bilingual Text`, in Proceedings of COLING 1992, volume 2, pp. 672–678.

Koehn, P., and K. Knight: 2003, `Feature-rich Statistical Translation of Noun Phrases`, in Proceedings of ACL-2003, pp. 311–318.

Koehn, P., F. J. Och, and D. Marcu: 2003, `Statistical Phrase-Based Translation`, in Proceedings of HLT/NAACL-2003, pp.127–133.

Kumano, A., and H. Hirakawa: 1994, `Building an MT dictionary from parallel texts based on linguistic and statistical information`, in Proceedings of COLING 1994, pp. 76–81.

Kupiec, J.: 1993, `An Algorithm for Finding Noun Phrase Correspondences in Bilingual Corpora`, in Proceedings of ACL-1993, pp. 23–30.

Chun-Jen Lee, Jason S. Chang, Jyh-Shing Roger Jang: 2005, `Named Entity Alignment: An Approach of Combining Statistical Models and Knowledge Information`, A Thesis Presented to the National Tsing Hua University for the Degree Doctor of Computer Science, pp. 1–128.

Marcu, D., and W. Wong: 2002, `A Phrase-Based, Joint Probability Model for Statistical Machine Translation`, in Proceedings of EMNLP-2002, pp.133–139.

Melamed, I. D.: 1995, `Automatic evaluation and uniform filter cascades for inducing N-best translation lexicons`, in Proceedings of the Third Workshop on Very Large Corpora, pp. 184–198.

Meyers, Adam, Michiko Kosaka, and Ralph Grishman: 2000, `Chart-based translation rule application in machine translation`, in Proceedings of COLING-2000, pp. 537–543.

Moore, R. C.: 2001, `Towards a simple and accurate statistical approach to learning translational relationships among words`, in Proceedings of ACL-2001, pp. 79–86.

Och, F. J., and H. Ney: 2000, `A Comparison of Alignment Models for Statistical Machine Translation`, in Proceedings of COLING 2000, pp. 1086–1090

Och, F. J., C. Tillmann, and H. Ney: 1999, `Improved alignment models for statistical machine translation`, in Proceedings of EMNLP-WVLC 1999, pp. 20–28.

Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu: 2002, `BLEU: a method for automatic evaluation of machine translation`, in Proceedings of ACL-2002, pp. 311–318

Wu, D., and X. Xia: 1994, `Learning an English-Chinese lexicon from a parallel corpus`, in Proceedings of AMTA-94, pp. 206–213

Yamada, K., and K. Knight: 2001, `A syntax-based statistical translation model`, in Proceedings of ACL-2001, pp. 523–530.