National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

Author: 曹乃龍
Author (English): Nai-Lung Tsao
Title: 在稀少的學習資訊情況下語意辨正之研究
Title (English): On the Study of the Sparseness Problem for Word Sense Disambiguation
Advisor: 郭經華
Advisor (English): Chin-Hwa Kuo
Degree: Doctoral
Institution: Tamkang University
Department: Department of Computer Science and Information Engineering
Discipline: Engineering
Academic field: Electrical and Computer Engineering
Document type: Academic thesis
Year of publication: 2005
Graduation academic year: 93 (2004–2005)
Language: English
Pages: 61
Keywords (Chinese): 語意辨正, 相似度計算, 自助式學習
Keywords (English): Word Sense Disambiguation, Similarity Measurement, Bootstrapping
Statistics:
  • Cited by: 0
  • Views: 232
  • Downloads: 16
  • Bookmarked: 1
Word Sense Disambiguation (WSD) identifies the correct sense of a polysemous word from its context; in essence, WSD can be treated as a classification problem. However, a shortage of training data severely degrades both the accuracy and the coverage of WSD. This thesis proposes two strategies, applied at the training stage and the testing stage respectively, to overcome the sparseness of learning information.
At the training stage, this thesis proposes a new two-stage semi-supervised learning method that uses a thesaurus and a reference corpus as learning material. The first stage rests on two assumptions: (1) words related in the thesaurus are similar, and (2) similar words appear in similar contexts in the corpus. Using these assumptions, an initial sense identifier is built. The second stage applies the bootstrapping method proposed by Yarowsky, using the reference corpus and the first-stage identifier to produce a more accurate sense identifier.
At the testing stage, this thesis also proposes a new smoothing algorithm to handle test features that never appear in the training data. Because the classifier adopted here is a naïve Bayes classifier, such unseen features would otherwise cause classification to fail. Unlike conventional probability-based smoothing algorithms, the proposed method is based on semantic relatedness: an unseen test feature is mapped to its most similar training feature before classification.
The experimental results show that the classification algorithm proposed for the training stage performs well, and bootstrapping further raises the accuracy of the sense identifier; at that point, however, the identifier can disambiguate only about half of the words. After the semantic smoothing algorithm is added, the identifier can disambiguate nearly 90% of polysemous words without seriously harming accuracy, making it practical for use in other applications.
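The naïve Bayes sense classification over context words described above can be sketched as follows. This is a minimal illustration with invented toy senses and context words, not the thesis's actual feature set or data; add-one smoothing stands in for the probability-based smoothing the thesis compares against.

```python
import math
from collections import Counter, defaultdict

def train_nb(labeled):
    """labeled: list of (context_words, sense) pairs."""
    sense_counts = Counter()
    feat_counts = defaultdict(Counter)
    vocab = set()
    for words, sense in labeled:
        sense_counts[sense] += 1
        for w in words:
            feat_counts[sense][w] += 1
            vocab.add(w)
    return sense_counts, feat_counts, vocab

def classify(words, sense_counts, feat_counts, vocab):
    """Pick the sense maximizing log P(sense) + sum log P(word | sense)."""
    total = sum(sense_counts.values())
    best, best_lp = None, float("-inf")
    for sense, n in sense_counts.items():
        lp = math.log(n / total)
        denom = sum(feat_counts[sense].values()) + len(vocab)
        for w in words:
            # Add-one smoothing so unseen words do not zero out the product.
            lp += math.log((feat_counts[sense][w] + 1) / denom)
        if lp > best_lp:
            best, best_lp = sense, lp
    return best

# Toy training data (hypothetical senses of "bank").
data = [(["river", "water"], "bank/river"),
        (["money", "loan"], "bank/finance"),
        (["deposit", "money"], "bank/finance")]
model = train_nb(data)
print(classify(["loan", "deposit"], *model))  # -> bank/finance
```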
Word sense disambiguation (WSD) is the task of identifying the meaning of a word in a specific context. Many previous approaches construct sense identifiers from existing dictionaries, thesauri, and corpora. One of the most serious obstacles in WSD research is the sparseness of training data.
In this thesis, we show that the sparseness problem can be resolved at both stages of sense-identifier construction: training and testing. In the training process, preventive strategies are used to expand the contextual features; in the testing process, post-hoc strategies are used to produce contribution values for contextual features that appear in testing samples but not in training samples.
We present a novel preventive strategy: a two-stage semi-supervised learning approach. This method constructs our sense identifier from an unlabeled corpus and a well-compiled thesaurus, WordNet. The first stage adopts two concepts: (1) semantic relatives are similar, and (2) similar words provide similar contextual clues. Using these concepts, we construct an initial word sense identifier by unsupervised training on unrestricted text. The experimental results show that the first stage achieves significant improvements in precision without sacrificing recall. However, the recall and precision of this initial sense identifier are still unacceptable for real-world applications. The second stage modifies Yarowsky's well-known bootstrapping approach for WSD. By using large amounts of unrestricted text, both recall and precision improve significantly.
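The second-stage bootstrapping can be sketched as a Yarowsky-style self-training loop: label instances that contain a known seed collocation, then promote words that co-occur with only one sense ("one sense per collocation") to new seeds. The senses, contexts, and seeds below are invented toy data, and this simplified loop uses unambiguous-hit counting rather than the thesis's actual classifier.

```python
from collections import defaultdict

def bootstrap(unlabeled, seeds, rounds=3):
    """unlabeled: list of context-word sets; seeds: {collocation: sense}.
    Returns {instance index: sense} for instances the growing seed set decides."""
    seeds = dict(seeds)
    labels = {}
    for _ in range(rounds):
        # Step 1: label instances whose context contains exactly one seeded sense.
        for i, ctx in enumerate(unlabeled):
            if i in labels:
                continue
            hits = {seeds[w] for w in ctx if w in seeds}
            if len(hits) == 1:  # unambiguous evidence only
                labels[i] = hits.pop()
        # Step 2: promote words seen with a single sense to new seeds
        # ("one sense per collocation").
        votes = defaultdict(set)
        for i, sense in labels.items():
            for w in unlabeled[i]:
                votes[w].add(sense)
        for w, senses in votes.items():
            if len(senses) == 1:
                seeds.setdefault(w, senses.pop())
    return labels

# Toy contexts for the classic "plant" example; two seed collocations.
contexts = [{"plant", "life", "leaf"},
            {"plant", "factory", "worker"},
            {"leaf", "green"},
            {"worker", "union"}]
# Seeds decide instances 0 and 1 directly; promoted collocations
# ("leaf", "worker") then cover instances 2 and 3 in the next round.
print(bootstrap(contexts, {"life": "plant/living", "factory": "plant/building"}))
```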
Even though unlabeled text is effectively unlimited in supply (for example, from the WWW), occurrences of the target word in testing data may still lack the contextual features that the training process identified as discriminative for the target word's sense. Post-hoc strategies are required to eliminate this problem. We propose a similarity measurement computed graph-theoretically over the WordNet structure. In the experiments, we evaluate two kinds of post-hoc strategies. The results show that similarity-measurement approaches achieve better precision and recall than smoothing approaches, and that our proposed method outperforms other similarity measurements for WSD. They also show that the preventive and post-hoc strategies together can effectively minimize, or even eliminate, the sparseness problem.
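The graph-theoretic idea behind the post-hoc strategy can be illustrated with a shortest-path distance over a tiny hand-made lexical graph: an unseen test feature is mapped to the closest feature the model was trained on. The graph edges below are invented stand-ins for WordNet relations, and shortest path is only one of several distances one could use; the thesis's actual measurement differs in detail.

```python
from collections import deque

# Toy undirected lexical graph standing in for WordNet relations (illustrative).
GRAPH = {
    "currency": {"money"},
    "money": {"currency", "cash", "loan"},
    "cash": {"money"},
    "loan": {"money", "debt"},
    "debt": {"loan"},
    "river": {"stream"},
    "stream": {"river"},
}

def path_length(a, b):
    """Shortest number of edges between a and b via BFS; None if disconnected."""
    if a == b:
        return 0
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, d = queue.popleft()
        for nb in GRAPH.get(node, ()):
            if nb == b:
                return d + 1
            if nb not in seen:
                seen.add(nb)
                queue.append((nb, d + 1))
    return None

def nearest_training_feature(unseen, training_feats):
    """Map an unseen context word to the closest word seen in training."""
    best, best_d = None, float("inf")
    for f in training_feats:
        d = path_length(unseen, f)
        if d is not None and d < best_d:
            best, best_d = f, d
    return best

# "debt" never occurred in training, but it is 3 edges from "cash"
# and disconnected from "river", so it is mapped to "cash".
print(nearest_training_feature("debt", ["cash", "river"]))  # -> cash
```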
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 WSD overview
1.2 The sparseness problem
1.2.1 Identifying the sparseness problem
1.2.2 Resolving strategies
1.2.2.1 Preventive strategies
1.2.2.2 Post-hoc strategies
1.3 Research contribution
1.4 Chapter summaries
Chapter 2 Related Works
2.1 Word sense identifier construction
2.1.1 Supervised learning
2.1.2 Semi-supervised learning
2.1.3 Unsupervised learning
2.2 Resolving the sparseness problem
2.2.1 Context window
2.2.2 Morphology analysis
2.2.3 Smoothing
2.2.4 Similarity measurement
Chapter 3 Two-stage semi-supervised learning approaches
3.1 The basic concepts
3.1.1 Relatives and similar words providing similar contextual clues
3.1.2 Bootstrapping
3.1.3 Rough approaches
3.2 Our novel approach
3.2.1 WordNet
3.2.2 Naïve Bayes classifier
3.2.3 Details of the proposed approach
Chapter 4 Similarity measurement
4.1 Related works
4.2 Similarity measurement
Chapter 5 Experimental results
5.1 Summary of proposed approaches
5.2 Training and testing data
5.3 Feature extraction
5.4 Stage 1 experiments: extracting contextual features
5.5 Experiments for bootstrapping approaches at the training stage
5.6 Experiments for two similarity measurements and smoothing at the testing stage
Chapter 6 Conclusion
References
References
[1] Adami, Giordano, Paolo Avesani and Diego Sona. 2003. “Bootstrapping for Hierarchical Document Classification,” in Proceedings of the international conference on Information and knowledge management.
[2] Agirre, Eneko and David Martinez. 2004a. “Smoothing and Word Sense Disambiguation,” in Proceedings of ESPANA for Natural Language Processing, Alicante, Spain, pp. 360-371.
[3] Agirre, Eneko and David Martinez. 2004b. “Unsupervised WSD Based on Automatically retrieved examples: The Importance of Bias,” in Proceedings on the Conference on Empirical Methods in Natural Language Processing.
[4] Brants, Thorsten. 2000. “TnT — A Statistical Part-of-Speech Tagger,” in Proceedings of the Sixth Applied Natural Language Processing Conference ANLP-2000, Seattle, WA.
[5] Crouch, Carolyn J.. 1998. “A Cluster-Based Approach to Thesaurus Construction,” in Proceedings of ACM SIGIR, Grenoble, France.
[6] Edmonds, Phil and Scott Cotton. 2001. “Senseval-2: Overview,” in Proceedings of the Second International Workshop on Evaluating Word Sense Disambiguation Systems, Toulouse, France.
[7] Feng, Huamin, Rui Chi and Tat-Seng Chua. 2004. “A Bootstrapping Framework for Annotating and Retrieving WWW images,” in Proceedings of the ACM international conference on Multimedia.
[8] Fernandez-Amoros, David. 2004. “WSD Based on Mutual Information and Syntactic Patterns,” in Proceedings of 3rd International Workshop on the Evaluation of Systems for the Semantic Analysis of Text, Barcelona, Spain.
[9] Gale, William A. and Kenneth W. Church. 1994. “What’s wrong with adding one?,” in N. Oostdijk and P. de Haan, editors, Corpus-Based Research into Language. Rodopi.
[10] Gale, William A., Kenneth W. Church and David Yarowsky. 1993. “A Method for Disambiguating Word Senses in a Large Corpus,” in Computers and the Humanities, 26, pp. 415-439.
[11] Ide, Nancy and Jean Veronis. 1998. “Introduction to the special issue on word sense disambiguation: The state of the art,” in Computational Linguistics 24(1): 1-40.
[12] Karov, Yael and Shimon Edelman. 1998. “Similarity-based Word Sense Disambiguation,” in Computational Linguistics 24(1): 41-59.
[13] Kilgarriff, Adam and Joseph Rosenzweig. 2000a. “English SENSEVAL: Report and Results,” in Proceedings of 2nd International Conference on Language Resources & Evaluation.
[14] Kilgarriff, Adam and Joseph Rosenzweig. 2000b. “Framework and Results for English Senseval,” in Computers and the Humanities, Vol. 34, pp. 15-48.
[15] Kuo, Chin-Hwa, Tzu-Chuan Chou, Nai-Lung Tsao and Yung-Shiao Lan. 2003. “CanFind — A Semantic Image Indexing and Retrieval System,” in Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS).
[16] Landes, Shari, Claudia Leacock and Randee I. Tengi. 1998. “Building Semantic Concordances,” in WordNet: An Electronic Lexical Database, edited by Christiane Fellbaum, The MIT Press, Cambridge, Massachusetts, London, England.
[17] Leacock, Claudia and Martin Chodorow. 1998. “Combining Local Context and WordNet Similarity for Word Sense Identification,” in WordNet: An Electronic Lexical Database, edited by Christiane Fellbaum, The MIT Press, Cambridge, Massachusetts, London, England.
[18] Leacock, Claudia, Martin Chodorow and George A. Miller. 1998. “Using Corpus Statistics and WordNet Relations for Sense Identification,” in Computational Linguistics 24(1): 147-165.
[19] Lee, Yoong Keok and Hwee Tou Ng. 2002. “An Empirical Evaluation of Knowledge Sources and Learning Algorithms for Word Sense Disambiguation,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing, Philadelphia, pp. 41-48.
[20] Lesk, Michael. 1986. “Automatic sense disambiguation: How to tell a pine cone from an ice cream cone,” in Proceedings of the 1986 SIGDOC Conference, pp.24-26, New York, ACM.
[21] Manning, Christopher D. and Hinrich Schütze. 1999. Foundations of Statistical Natural Language Processing, The MIT Press, Cambridge, Massachusetts, London, England.
[22] Martinez, David and Eneko Agirre. 2004. “The effect of bias on the automatically-built word sense corpus,” in Proceedings of 4th International Conference on Language Resources & Evaluation.
[23] McCallum, Andrew, Ronald Rosenfeld, Tom M. Mitchell and Andrew Y. Ng. 1998. “Improving Text Classification by Shrinkage in a Hierarchy of Classes,” in Proceedings of the Fifteenth International Conference on Machine Learning, pp. 359-367.
[24] Melamed, I. Dan and Philip Resnik. 2000. “Tagger Evaluation Given Hierarchical Sets,” in Computers and the Humanities Vol. 34, pp 79-84.
[25] Mihalcea, Rada and Dan I. Moldovan. 1999. “A Method for Word Sense Disambiguation of Unrestricted Text,” in Proceedings of 37th Annual meeting of the Association for Computational Linguistics.
[26] Miller, George A., editor. 1990. “WordNet: An On-line Lexical Database,” in the International Journal of Lexicography, Vol 3(4), Oxford University Press.
[27] Ng, Hwee Tou and Hian Beng Lee. 1997. Defence Science Organization Corpus of Sense-Tagged English. http://ldc.upenn.edu/LDC97T12.htm.
[28] Oh, Jong-Hoon and Key-Sun Choi. 2002. “Word Sense Disambiguation using Static and Dynamic Sense Vectors,” in Proceedings of The 18th International Conference on Computational Linguistics, Taipei, Taiwan.
[29] Pedersen, Ted. 2002. “Assessing System Agreement and Instance Difficulty in the Lexical Sample Tasks of Senseval-2,” in Proceedings of the Workshop on Word Sense Disambiguation: Recent Successes and Future Directions, Philadelphia.
[30] Schütze, Hinrich. 1998. “Automatic Word Sense Discrimination,” in Computational Linguistics 24(1): 97-123.
[31] Schütze, Hinrich and Jan Pedersen. 1995. “Information Retrieval Based on Word Senses,” in Proceedings of SDAIR’95, Las Vegas, Nevada.
[32] Shinnou, Hiroyuki and Minoru Sasaki. 2003. “Unsupervised learning of word sense disambiguation rules by estimating an optimum iteration number in the EM algorithm”, in Proceedings of the Conference on Natural Language Learning.
[33] Stokoe, Christopher and Michael P. Oakes. 2003. “Word Sense Disambiguation in Information Retrieval Revisited,” in Proceedings of the 26th Annual International ACM SIGIR Conference, Toronto, Canada.
[34] Toutanova, Kristina, Francine Chen, Kris Popat and Thomas Hofmann. 2001. “Text classification in a hierarchical mixture model for small training sets,” in Proceedings of the Tenth International Conference on Information and Knowledge Management, Atlanta, Georgia, USA, pp. 105-113.
[35] Tsao, Nai-Lung, David Wible, and Chin-Hwa Kuo. 2003. “Feature Expansion for Word Sense Disambiguation,” IEEE International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE), Beijing, China.
[36] Veronis, Jean; Nancy M. Ide. 1990. “Word Sense Disambiguation with Very Large Neural Networks Extracted from Machine Readable Dictionaries,” in Proceedings of The 13th International Conference on Computational Linguistics.
[37] Wible, David, Chin-Hwa Kuo and Nai-Lung Tsao. 2004. “Contextual Language Learning in the Digital Wild: Tools and a Framework,” IEEE International Conference on Advanced Learning Technologies (ICALT), Joensuu, Finland.
[38] Wilks, Yorick A., Brian M. Slator and Louise M. Guthrie. 1996. Electric Words: Dictionaries, Computers and Meanings. A Bradford Book. MIT Press, Cambridge, MA.
[39] Wu, Dekai, Weifeng Su and Marine Carpuat. 2004. “A Kernel PCA Method for Superior Word Sense Disambiguation,” in Proceedings of 42nd Annual meeting of the Association for Computational Linguistics, Barcelona, Spain.
[40] Yarowsky, David. 1992. “Word Sense Disambiguation using Statistical Models of Roget’s Categories trained on Large Corpora,” in COLING 14, pp. 454-460.
[41] Yarowsky, David. 1993. “One Sense Per Collocation,” in Proceedings of the ARPA Human Language Technology Workshop, Princeton.
[42] Yarowsky, David. 1994. “Decision Lists for Lexical Ambiguity Resolution: Application to Accent Restoration in Spanish and French,” in Proceedings of the 32nd Annual meeting of the Association for Computational Linguistics, Las Cruces, NM.
[43] Yarowsky, David. 1995. “Unsupervised word sense disambiguation rivaling supervised methods,” in ACL 33, pp 189-196.
[44] Yarowsky, David. 2002. “Evaluating Sense Disambiguation across diverse parameter spaces,” Natural Language Engineering, 8(4): 293-310.
[45] Yang, Jun, Liu Wenyin, Hongjiang Zhang and Yueting Zhuang. 2001. “Thesaurus-Aided Approach for Image Browsing and Retrieval,” in Proceedings of the IEEE International Conference on Multimedia and Expo, pp. 313-316.
[46] Yuret, Deniz. 2004. “Some Experiments with a Naïve Bayes WSD System,” in Proceedings of 3rd International Workshop on the Evaluation of Systems for the Semantic Analysis of Text, Barcelona, Spain.