
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

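Graduate student: 李世吉 (Shi-Chi Li)
Thesis title: A Study of Chinese Word Sense Disambiguation Based on Concept Primarily (以概念為主的中文詞義辨識技術之研究)
Advisor: 柯淑津 (Sue J. Ker)
Degree: Master's
Institution: Soochow University (東吳大學)
Department: Department of Computer Science (資訊科學系)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Academic year of graduation: 95 (ROC calendar, 2006-2007)
Language: Chinese
Pages: 49
Keywords: Concept (概念); Feature (特徵); Seme (義原)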
Word sense disambiguation (WSD) aims to determine the most appropriate sense of a polysemous word. Among the many approaches studied, supervised learning has been the most successful. Supervised methods require a hand-tagged corpus, from which information is extracted as features for disambiguating the target word. A common choice of features is the co-occurring context words. Considering only word forms, however, ignores the lexical variety of language and the polysemy of the context words themselves, so the evidence gathered for a sense is scattered across many features or the feature set is contaminated with noise, either of which degrades disambiguation. In this study, each context word is therefore first mapped, through a sense dictionary, to the concepts it represents, and occurrences of the same concept are then pooled into a single feature for identifying the sense of the target word. The experiments use the Senseval-3 Chinese corpus as the experimental corpus and the HowNet knowledge base as the sense dictionary. The results show that the method achieves good disambiguation performance and a very high tagging applicability, and they support the hypothesis that converting context words into concepts before collecting features concentrates the information more effectively.
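The idea in the abstract, mapping context words to concepts before counting them as features, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the tiny `WORD_TO_CONCEPTS` table is invented and merely stands in for a sense dictionary such as HowNet, the training examples are made up, and the simple count-overlap scoring in `tag` is not claimed to be the scoring used in the thesis.

```python
from collections import Counter, defaultdict

# Hypothetical word -> concept table standing in for a sense dictionary
# such as HowNet. These entries are invented for illustration only.
WORD_TO_CONCEPTS = {
    "doctor": ["human", "medical"],
    "nurse": ["human", "medical"],
    "hospital": ["institution", "medical"],
    "bank": ["institution", "finance"],
    "loan": ["finance"],
    "money": ["finance"],
}

# Invented hand-tagged examples: (context words, sense of the target word).
TRAIN = [
    (["doctor", "hospital"], "surgery"),
    (["bank", "loan"], "business"),
]


def to_concepts(context_words):
    """Map each context word to its concepts; unknown words are kept as-is."""
    concepts = []
    for word in context_words:
        concepts.extend(WORD_TO_CONCEPTS.get(word, [word]))
    return concepts


def collect_features(tagged_examples):
    """Count concept occurrences per sense from (context_words, sense) pairs."""
    features = defaultdict(Counter)
    for context_words, sense in tagged_examples:
        features[sense].update(to_concepts(context_words))
    return features


def tag(context_words, features):
    """Return the sense whose concept counts best match the context, or None."""
    concepts = to_concepts(context_words)
    scores = {
        sense: sum(counts[c] for c in concepts)
        for sense, counts in features.items()
    }
    best = max(scores, key=scores.get, default=None)
    if best is None or scores[best] == 0:
        return None  # no shared concept: leave the instance untagged
    return best


features = collect_features(TRAIN)
print(tag(["nurse"], features))  # prints "surgery"
```

The word "nurse" never appears in the training contexts, so a feature set built on surface word forms would give no evidence for either sense. After conversion, "nurse" shares the concepts "human" and "medical" with "doctor" and "hospital", which is the pooling effect the abstract describes. Returning `None` when no concept matches also shows, in a simplified way, why pooling can raise the share of instances a tagger is able to label.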
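Acknowledgments
Chinese Abstract
English Abstract
Table of Contents
List of Tables
List of Figures
1. Introduction
2. Literature Review
2.1 Common Lexical Resources
2.2 Word Sense Disambiguation Techniques
2.2.1 Supervised Learning
2.2.2 Unsupervised Learning
2.2.3 Semi-Supervised Learning
2.3 Chinese Word Sense Disambiguation
3. Research Resources
3.1 Senseval Corpus
3.1.1 The Senseval Workshop
3.1.2 Senseval-3 Chinese Corpus
3.2 HowNet Knowledge Base
4. Methodology
4.1 Observations
4.2 Using the HowNet Sememes of Co-occurring Context Words as Tagging Features
4.2.1 Collecting Sense Features
4.2.2 Tagging the Target Word
5. Experiments
5.1 Experimental Data
5.2 Parameter Settings
5.3 Experimental Design
5.4 Experimental Results
5.5 Analysis and Discussion
6. Conclusion
References
Appendix 1
Appendix 2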