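Graduate student: 葉智豪 (Chih-Hao Yeh)
Thesis title: Collective Parenthetical Translations Extraction using Markov Logic
Thesis title (Chinese): 應用馬可夫邏輯之集體推論式括號翻譯擷取
Advisors: 張貿翔 (Maw-Shang Chang), 許聞廉 (Wen-Lian Hsu)
Oral defense committee: 張貿翔 (Maw-Shang Chang), 許聞廉 (Wen-Lian Hsu), 蔡志忠 (Chih-Chung Tsai), 蔡宗翰 (Tzong-Han Tsai)
Oral defense date: 2011-07-06
Degree: Master's
Institution: National Chung Cheng University (國立中正大學)
Department: Graduate Institute of Computer Science and Information Engineering (資訊工程研究所)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2011
Graduation academic year: 99 (ROC calendar)
Language: English
Number of pages: 50
Keywords (Chinese): 括號翻譯 (parenthetical translation), 括號翻譯擷取 (parenthetical translations extraction), 一階邏輯 (first-order logic), 馬可夫邏輯網路 (Markov logic network)
Keywords (English): Parenthetical translation, Parenthetical translations extraction, Markov logic network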
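In natural-language text, many terms appear together with a translation into another language enclosed in parentheses; this way of presenting a bilingual translation pair is called a parenthetical translation. Parenthetical translations are often used for newly coined terms that bilingual dictionaries do not yet cover: annotating a new term with its translation in parentheses improves readability. Extracting parenthetical translations from natural-language text is therefore an effective way to collect bilingual translation pairs for new terms. A major difficulty, however, is determining the boundary of the translated term in front of the parenthesis. In many East Asian languages such as Chinese, there is no explicit delimiter between words, so the extent of the translated term before the parenthesis is hard to determine. In this thesis, we propose a collective-inference approach to parenthetical translation extraction based on Markov logic: all observed features of the problem are expressed in first-order logic, and a Markov logic network is built to perform collective inference. In our experiments, the proposed approach improved the average F-measure by up to 31.9% over the current method based on bilingual word alignment, demonstrating its effectiveness.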
A parenthetical translation is a form of bilingual translation pair in which one translation appears inside a pair of parentheses and the other immediately precedes the left parenthesis. Parenthetical translations extraction (PTE) is the task of extracting parenthetical translations from natural-language documents. It is motivated by the many new words that appear accompanied by their translations in parentheses. One of the main difficulties in PTE is detecting the left boundary of the translated term in the pre-parenthesis text. In this paper, we propose a collective approach that employs Markov logic to model multiple constraints used in the PTE task. We show how various constraints can be formulated and combined in a Markov logic network (MLN). Our experimental results show that the proposed collective PTE approach significantly outperforms a current state-of-the-art method, improving the average F-measure by up to 31.9% compared to the previous word alignment approach.
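The left-boundary problem described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's method: it only enumerates every suffix of the pre-parenthesis text as a candidate translated term, which is the search space a model such as an MLN would then have to rank. The regular expression `PAREN`, the function names `left_boundary_candidates` and `f_measure`, and the `max_len` cutoff are all hypothetical choices made for illustration; the F-measure shown is the standard balanced F1 over extracted pairs, which is assumed (not confirmed by this record) to match the metric used in the thesis.

```python
import re

# Hypothetical pattern: a run of non-parenthesis text followed by a
# Latin-script string in half-width or full-width parentheses.
PAREN = re.compile(r"([^()（）]+)[(（]([A-Za-z][A-Za-z .'-]*)[)）]")


def left_boundary_candidates(sentence, max_len=8):
    """Return (candidate_term, translation) pairs for each parenthetical.

    Chinese text has no word delimiters, so every suffix of the
    pre-parenthesis text (up to max_len characters) is a possible
    translated term; picking the right one is the boundary problem.
    """
    results = []
    for match in PAREN.finditer(sentence):
        pre_text = match.group(1)
        translation = match.group(2).strip()
        for n in range(1, min(max_len, len(pre_text)) + 1):
            results.append((pre_text[-n:], translation))
    return results


def f_measure(predicted, gold):
    """Balanced F1 between predicted and gold sets of translation pairs."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)
```

Calling `left_boundary_candidates("我喜歡馬可夫邏輯(Markov logic)")` yields one candidate per possible left boundary, only one of which (`馬可夫邏輯`) is the correct translated term; a boundary-detection model has to select it from that list.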
Acknowledgements
Abstract (Chinese)
Abstract
1. Introduction
1.1 Motivation
1.2 Problem Statement
1.3 Our Work
2. Related Work
2.1 Cross-Language Information Retrieval
2.2 Automatic Extraction of Bilingual Translations
2.3 Parenthetical Translations Extraction
3. Markov Logic
3.1 First-Order Logic
3.2 Markov Networks
3.3 Markov Logic Networks
3.3.1 Probabilistic Inference
3.3.2 Discriminative Weight Learning
3.3.3 The Alchemy System
4. An MLN for Collective Parenthetical Translations Extraction
4.1 The Base MLN
4.2 Word Alignment Features
4.2.1 Basic Word Alignment
4.2.2 Prefix Word Alignment
4.3 Punctuation Marks Features
4.4 Co-occurrence Substring Features
5. Experiments and Analysis
5.1 Dataset
5.2 Experiment Design
5.3 Evaluation Metrics
5.4 Results
6. Conclusion
7. References


