National Digital Library of Theses and Dissertations in Taiwan

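Author: 呂慶輝 (Ching-Hui Lu)
Thesis title: 自動翻譯系統之評價及改良
Thesis title (English): Improvements in Machine Translation and Evaluation of Machine Translation Quality
Advisor: 許永真
Oral defense committee: 劉長遠, 張智星, 徐讚昇
Oral defense date: 2016-07-26
Degree: Master's
Institution: National Taiwan University (國立臺灣大學)
Department: Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2016
Graduation academic year: 104 (ROC calendar)
Language: Chinese
Pages: 46
Keywords (Chinese): 機器翻譯; 統計式機器翻譯; 類神經網路; 翻譯品質評測; 領域翻譯; 互動翻譯
Keywords (English): machine translation; statistical machine translation; artificial neural network; evaluation of translation quality; in-domain translation; interactive translation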
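  As global exchange grows, the language translation market is expanding rapidly. Over the decade from 2008 to 2018, global GDP was projected to grow by an average of 2.1% per year, while the translation market was projected to grow by 4.7% per year. The language services market as a whole grew from US$23.9 billion in 2009 to US$38.2 billion in 2015, a steady average of 8% per year. With the spread of the internet, human translation capacity cannot keep up with demand, so machine translation (automatic translation) plays an increasingly important role: by April 2016, Google Translate was translating 100 billion words per day.
  Machine translation evolved from early rule-based machine translation to the statistical machine translation of recent years. In 2007, Google Translate began using its own statistical engine in place of the previous generation's rule-based engine, formally opening the era of statistical translation that continues today.
  Statistical machine translation can (1) learn translation rules automatically from bilingual parallel corpora and (2) tune its parameters automatically to best fit the dataset. However, if the automatic sentence alignment used to prepare the bilingual corpus makes errors, misaligned sentence pairs are fed into training and lower the quality of the engine. Automatic parameter tuning, in turn, still relies on how well the engine fits the test set. The core function shared by (1) and (2) is therefore measuring how well a source text and its translation fit each other, that is, automatic evaluation of translation quality.
  The standard method for evaluating translation quality today is BLEU (Bilingual Evaluation Understudy). BLEU considers only the surface n-gram precision of the text. A correct translation that uses words entirely different from the reference sentence can therefore receive a very low BLEU score, even zero, while a completely wrong, ungrammatical sentence that happens to contain a few keywords can score higher. Scoring of this kind does not faithfully reflect translation quality. The first focus of this thesis is to propose an automatic evaluation method based on neural networks and semantics that can judge translation quality without being bound to surface wording.
  This thesis introduces a novel training approach: for each source sentence, we combine the human translation with several poorer machine-produced translations to build a training dataset, and train an artificial neural network (ANN) to grade quality. Experiments show that the system does evaluate translations on the basis of meaning rather than surface wording.
  Second, because machine translation has not yet reached the quality of human translation, machine output usually needs human post-editing. If the machine translation is very poor, post-editing can take as long as translating from scratch, or longer. This thesis therefore proposes an interactive machine translation workflow in which human and machine produce the translation together, saving or even eliminating post-editing time. Our method first has the machine produce the N best candidate translations for each source sentence and presents them to the user in compact form. As soon as the user changes any word in the translation, the system uses the N-best list to revise the rest of the translation automatically according to the changed word.
  This thesis also proposes a new method for domain-specific translation. The same English word has different meanings in different domains. For example, “movable” means 可移動的 (“able to be moved”) in the general domain but 動產 (“movable property”) in the legal domain. Parallel corpora from each domain could be used to train an engine for that domain, but the corpus collected for any one domain may fall short of three million sentences, which is not enough to train a mature engine on its own. Our method trains the translation model on a large general-domain corpus and extracts a bilingual terminology table from a small domain-specific corpus. Before training, the general corpus is preprocessed so that technical terms are replaced with special tags. At translation time, the system consults the terminology table of the selected domain to translate each term tag into its meaning in that domain. This provides high-quality in-domain machine translation even when in-domain data is scarce. In our experiments, the system translated “movable” into the appropriate meaning for each domain setting.
  Finally, we integrate the methods proposed in this thesis to build a commercial web translation platform.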

As global interaction increases, the translation market is growing rapidly. Over the decade from 2008 to 2018, global GDP is estimated to grow by 2.1% per year, while the translation market is estimated to grow by 4.7% per year. The language services market as a whole rose from $23.9 billion in 2009 to $38.2 billion in 2015, a steady rate of about 8% per year. With the spread of the internet, human translation capacity struggles to meet demand, so machine translation (automatic translation) is playing an increasingly important role: as of April 2016, Google Translate was translating one hundred billion words per day.
Machine translation has evolved from early rule-based machine translation to the more recent statistical machine translation. In 2007, Google Translate replaced its older rule-based engine with its own statistical engine, opening the era of statistical machine translation.
Statistical machine translation can a) automatically learn translation rules from bilingual parallel corpora and b) automatically adjust its parameters to fit the dataset. However, if errors occur during the preparatory sentence alignment step and misaligned sentence pairs are fed into training, the quality of the engine is reduced. Furthermore, automatic parameter adjustment still depends on how well the engine fits the test set. The core function shared by a) and b) is therefore measuring the degree of fit between source text and target text, in other words, “automatic evaluation of translation.”
The standard method for automatic evaluation of translation is currently Bilingual Evaluation Understudy (BLEU). BLEU considers only the n-gram precision of the text, so a correctly translated sentence may receive an extremely low BLEU score, or even zero, if its wording differs substantially from the reference sentence. Conversely, a completely wrong, ungrammatical sentence may achieve a relatively high score if it contains a few matching key words. BLEU therefore cannot accurately assess translation quality. The first aim of this thesis is to propose an automatic evaluation method based on artificial neural networks and semantics that tolerates lexical differences.
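The weakness described above can be reproduced with a minimal sketch of clipped (modified) unigram precision, the building block of BLEU. This is an illustration only: full BLEU also combines higher-order n-grams with a geometric mean and a brevity penalty, and the example sentences and function names here are made up for the demonstration, not taken from the thesis.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against one reference."""
    cand_counts = Counter(ngrams(candidate.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs
    # in the reference (the "clipping" step of BLEU).
    clipped = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / total


# Toy sentences (assumptions for illustration).
reference = "the cat sat on the mat"
paraphrase = "a feline rested upon a rug"   # same meaning, different words
scrambled = "mat the on cat"                # ungrammatical, shares keywords

paraphrase_p1 = modified_precision(paraphrase, reference, 1)
scrambled_p1 = modified_precision(scrambled, reference, 1)
print(paraphrase_p1, scrambled_p1)
```

The paraphrase shares no words with the reference and scores zero, while the scrambled word list scores perfectly on unigrams. Higher-order n-grams penalize the scrambled sentence, but they do nothing to rescue the paraphrase.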
This thesis proposes an original training method: we build a dataset by combining human translations with sentences generated by a better translation engine and sentences generated by a poor engine, and train an artificial neural network (ANN) on it for quality classification. Experiments show that our system can indeed evaluate translations based on semantics rather than lexis.
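One way to picture the dataset construction is the sketch below. The three-level labels (2 for human, 1 for the better engine, 0 for the poor engine), the `Example` type, and the sample row are all assumptions made for illustration; the abstract does not specify the label scheme or the data format the thesis actually uses.

```python
from dataclasses import dataclass


@dataclass
class Example:
    source: str
    translation: str
    quality: int  # hypothetical label: 2 = human, 1 = better engine, 0 = poor engine


def build_dataset(rows):
    """Expand (source, human, better_mt, poor_mt) rows into labeled examples."""
    dataset = []
    for source, human, better_mt, poor_mt in rows:
        dataset.append(Example(source, human, 2))
        dataset.append(Example(source, better_mt, 1))
        dataset.append(Example(source, poor_mt, 0))
    return dataset


# A single made-up row; a real corpus would supply many such rows.
rows = [
    ("他賣了房子", "He sold the house.", "He sold house.", "He house sell."),
]
dataset = build_dataset(rows)
for example in dataset:
    print(example.quality, example.translation)
```

Because every source sentence contributes one translation per quality level, the classes stay balanced, and the classifier sees the same meaning rendered at different quality levels.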
Secondly, since machine translation often fails to reach the quality of human translation, it usually requires human post-editing. If the quality of the machine translation is extremely poor, post-editing may take as long as translating from scratch, or even longer. This thesis therefore proposes an interactive machine translation process in which human and machine co-create the target text, reducing or even removing the time required for post-editing. Our method first has the machine generate the N-best translations of each source sentence and presents them to the user in a succinct form. Once the user changes a word in the chosen target sentence, the system regenerates the rest of the target sentence by looking up the N-best sentences according to that change.
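The N-best lookup can be sketched as follows. The matching rule used here (take the highest-ranked candidate that has the user's word at the edited position, and otherwise apply the edit to the current sentence alone) is an assumption chosen for simplicity; the abstract does not say how the thesis matches an edit against the N-best list. The function name and sample sentences are hypothetical.

```python
def apply_edit(nbest, current, position, new_word):
    """Return the sentence to show after the user sets one token.

    nbest    -- candidate translations, best first
    current  -- the sentence currently displayed
    position -- zero-based index of the token the user changed
    new_word -- the token the user typed
    """
    for candidate in nbest:
        tokens = candidate.split()
        if position < len(tokens) and tokens[position] == new_word:
            # This candidate agrees with the edit, so adopt it whole:
            # the rest of the sentence changes along with the edited word.
            return candidate
    # No candidate agrees with the edit: keep the sentence, change one word.
    tokens = current.split()
    tokens[position] = new_word
    return " ".join(tokens)


# Made-up N-best list for illustration.
nbest = ["he sold the house", "he sells the home", "she sold a house"]
current = nbest[0]

revised = apply_edit(nbest, current, 1, "sells")
fallback = apply_edit(nbest, current, 3, "flat")
print(revised)
print(fallback)
```

In the first call, a single edit to the verb also swaps "house" for "home", because the system switches to the candidate consistent with the edit. The second call shows the fallback when no candidate matches.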
This thesis also proposes a new approach to domain-specific translation. The same English word may have different meanings in different domains. For example, “movable” in the general domain means “capable of being moved,” but in the legal domain it means “property or possessions not including land or buildings.” Although a domain-specific engine can be trained on a domain-specific parallel corpus, the corpus available for an individual domain may contain fewer than three million sentences, which is insufficient to train a mature engine. Our proposed method trains the translation model on a large general-domain corpus and extracts bilingual terminology tables from a smaller domain-specific corpus. During preprocessing, technical terms in the general corpus are replaced with special tags. Thus, even when a specific domain provides insufficient data, the engine can still generate satisfactory domain-specific machine translation. Our experiment demonstrates that the system produces different translations of the word “movable” when the target domain is changed.
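The tag-and-lookup idea can be sketched as below. The tag syntax `<term:...>`, the table layout, and the function names are assumptions for illustration. The translation engine itself is not modeled: the sketch assumes it carries tags through to its output unchanged. The Chinese renderings of “movable” are the two given in the Chinese abstract.

```python
import re

# Hypothetical per-domain terminology tables.
TERM_TABLES = {
    "general": {"movable": "可移動的"},
    "law": {"movable": "動產"},
}


def tag_terms(sentence, terms):
    """Replace known technical terms in a source sentence with tags."""
    if not terms:
        return sentence
    # Try longer terms first so multi-word terms win over their parts.
    alternatives = "|".join(
        re.escape(term) for term in sorted(terms, key=len, reverse=True)
    )
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: "<term:" + m.group(1).lower() + ">", sentence)


def resolve_tags(translated, domain):
    """Fill each tag in the engine output from the chosen domain's table."""
    table = TERM_TABLES[domain]
    return re.sub(
        r"<term:([^>]+)>",
        lambda m: table.get(m.group(1), m.group(1)),
        translated,
    )


tagged = tag_terms("The movable was seized", {"movable"})
print(tagged)

# Stand-in for engine output that kept the tag intact (an assumption).
engine_output = "這是<term:movable>"
print(resolve_tags(engine_output, "law"))
print(resolve_tags(engine_output, "general"))
```

Because the engine only ever sees the tag, a single general-domain model serves every domain, and supporting a new domain requires only a new terminology table.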
Finally, this research builds a working commercial web translation platform using the methods described above.

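Preface
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1 Research Motivation
2 Research Objectives
Chapter 2 Introduction to Statistical Machine Translation
1 History of Machine Translation
2 Theoretical Foundations of Statistical Machine Translation Systems
2.1 Overview
2.2 Word Alignment
2.3 Translation Model
3 Workflow of Statistical Machine Translation Systems
3.1 Corpus Preprocessing
3.2 Training Stage
3.3 Parameter Tuning Stage
3.4 Performance
4 Bottlenecks of Statistical Machine Translation Systems
4.1 Sentence Alignment and Filtering of Bad Data in Corpus Preprocessing
4.2 Post-Training Optimization and Output
Chapter 3 Existing Methods for Evaluating Translation Quality
1 Introduction
2 Human Evaluation
3 Automatic Evaluation
3.1 BLEU (Bilingual Evaluation Understudy)
3.2 METEOR (An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments)
4 Analysis
5 Experiments
6 Conclusion
Chapter 4 Translation Quality Evaluation Based on Deep Learning
1 Basic Principles
2 Corpus Preparation
3 Experiment 1: Sentence to vector
3.1 Analysis of Sentence to vector Results
4 Experiment 2: Direct seq2seq Evaluation
4.1 Results
5 Experiment 3: Adding the Pause Token _NULL
5.1 Results
6 Experiment 4: NULL *2
6.1 Results
7 Experiment 5: Progress-Bar Mode
7.1 Results
Chapter 5 Interactive Translation
1 User Interface
2 Interactive Translation Workflow
3 Recommending the Best 3 Sentences
Chapter 6 Domain-Specific Translation
1 Training Workflow
2 Translation Workflow
Chapter 7 A Full-Featured Translation Platform
1 System Architecture
2 Features
2.1 Main Workflow
2.2 Tagging
2.3 Filtering
3 Analysis and Conclusions
Chapter 8 Conclusions and Recommendations
4 Findings and Contributions
5 Limitations
6 Directions for Future Work
References
Appendix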

1. Weaver, Warren. "Translation." Machine Translation of Languages 14 (1955): 15-23.
2. Brown, Peter F., et al. "The mathematics of statistical machine translation: Parameter estimation." Computational Linguistics 19.2 (1993): 263-311.
3. Papineni, Kishore, et al. "BLEU: A method for automatic evaluation of machine translation." Proceedings of the 40th Annual Meeting on Association for Computational Linguistics. Association for Computational Linguistics, 2002.
4. Banerjee, Satanjeev, and Alon Lavie. "METEOR: An automatic metric for MT evaluation with improved correlation with human judgments." Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization. Vol. 29. 2005.
5. Pouliquen, Bruno, et al. "Large-scale multiple language translation accelerator at the United Nations." Proc. of MT Summit. 2013.
6. Tian, Liang, et al. "UM-Corpus: A large English-Chinese parallel corpus for statistical machine translation." LREC. 2014.
