National Digital Library of Theses and Dissertations in Taiwan

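Author: 簡瑋男
Author (English): Wei-Nan Chien
Thesis title (Chinese): 應用獨立成份分析於同義詞替換之研究
Thesis title (English): Chinese Near-Synonym Substitution Using Independent Component Analysis
Advisor: 禹良治
Degree: Master's
Institution: Yuan Ze University
Department: Department of Information Management
Discipline: Computer Science
Field: General Computer Science
Thesis type: Academic thesis
Year of publication: 2011
Academic year of graduation: 99 (ROC calendar)
Language: Chinese
Pages: 30
Keywords (Chinese): 同義詞; 獨立成份分析; 資訊檢索
Keywords (English): Synonym; Independent Component Analysis; information retrieval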
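Abstract (translated from Chinese): Near-synonym substitution is a frequently discussed problem in natural language processing, particularly in research on information retrieval (IR) and computer-assisted language learning. Choosing the correct near-synonym for a given position in a sentence often confuses second-language learners, because words with similar or identical meanings can differ greatly in usage. Earlier studies proposed several methods for near-synonym choice, but they worked mainly on English text, and the same methods may produce different results when applied to Chinese text. This study applies independent component analysis (ICA) to Chinese near-synonym substitution. The results show that ICA achieves higher near-synonym substitution accuracy than the methods proposed in earlier studies: pointwise mutual information (PMI), a 5-gram language model, and the vector space model (VSM).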

Abstract (English): Near-synonym sets represent groups of words with similar meanings and are useful knowledge resources for many natural language applications, such as query expansion for information retrieval (IR) and computer-assisted language learning. However, near-synonyms are not necessarily interchangeable in context because of their specific usage and syntactic constraints. Previous studies have developed various methods for near-synonym choice in English sentences. To the best of our knowledge, no such evaluation exists for Chinese sentences. This paper therefore proposes using independent component analysis (ICA) for Chinese near-synonym choice evaluation. Experimental results show that ICA achieves higher accuracy than pointwise mutual information (PMI), a 5-gram language model, and the vector space model (VSM), all of which have been used in previous studies.
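The near-synonym choice task described in the abstract can be illustrated with a small sketch of the PMI baseline, which is one of the comparison methods and not the thesis's ICA model. This record does not give the thesis's exact formulation, so everything below is an assumption made for illustration: the toy English corpus, sentence-level co-occurrence counting, the function names `build_counts`, `pmi`, and `choose_near_synonym`, and the simplification that unseen word pairs score 0.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus; each sentence is already tokenized.
CORPUS = [
    ["she", "gave", "a", "strong", "argument"],
    ["a", "strong", "argument", "won", "the", "debate"],
    ["he", "drank", "strong", "coffee"],
    ["a", "powerful", "engine", "drives", "the", "car"],
    ["the", "car", "has", "a", "powerful", "engine"],
    ["a", "powerful", "computer", "runs", "the", "model"],
]


def build_counts(corpus):
    """Count, per sentence, which words and which unordered word pairs occur."""
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in corpus:
        types = sorted(set(sentence))
        word_counts.update(types)
        # combinations over a sorted list yields pairs (a, b) with a < b
        pair_counts.update(combinations(types, 2))
    return word_counts, pair_counts, len(corpus)


def pmi(word, context_word, counts):
    """PMI = log2(P(w, c) / (P(w) * P(c))), estimated from sentence counts."""
    word_counts, pair_counts, n = counts
    joint = pair_counts[tuple(sorted((word, context_word)))]
    if joint == 0:
        return 0.0  # assumption: unseen pairs contribute nothing
    return math.log2(joint * n / (word_counts[word] * word_counts[context_word]))


def choose_near_synonym(candidates, context, counts):
    """Score each candidate by summed PMI with the context; return the best."""
    scores = {
        cand: sum(pmi(cand, c, counts) for c in context) for cand in candidates
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    counts = build_counts(CORPUS)
    # Fill the gap in: "the ____ engine ... car"
    best, scores = choose_near_synonym(
        ["strong", "powerful"], ["the", "engine", "car"], counts
    )
    print(best, scores)
```

On this toy data the context words "engine" and "car" co-occur only with "powerful", so that candidate receives the higher score. A realistic setup would need a large segmented Chinese corpus and some smoothing for unseen pairs.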

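Title Page
Thesis Oral Defense Committee Approval
Authorization
Chinese Abstract
English Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Research Background
1.2 Motivation and Objectives
1.3 Chapter Overview
Chapter 2 Literature Review
2.1 N-gram
2.2 Pointwise Mutual Information (PMI)
2.3 Vector Space Model (VSM)
Chapter 3 Research Method
3.1 System Workflow and Architecture
3.2 Data Preprocessing
3.3 ICA Model
3.4 Support Vector Machine
Chapter 4 Experimental Results and Analysis
4.1 Experimental Design
4.1.1 Near-Synonym Sets
4.1.2 Training and Test Data
4.1.3 Implementation of Methods
4.2 ICA Parameter Tuning
4.3 Experimental Results
4.4 ICA Analysis
Chapter 5 Conclusion and Future Work
References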


1. D. Inkpen, “Near-Synonym Choice in an Intelligent Thesaurus,” Proc. Association for Computational Linguistics, 2007, pp. 356-363.
2. D. Inkpen, “A Statistical Model for Near-Synonym Choice,” ACM Transactions on Speech and Language Processing, vol. 4, no. 1, 2007, pp. 1-17.
3. P. Edmonds, “Choosing the Word Most Typical in Context Using a Lexical Co-occurrence Network,” Proc. Association for Computational Linguistics, 1997, pp. 507-509.
4. A. Islam and D. Inkpen, “Near-Synonym Choice Using a 5-gram Language Model,” Proc. Research in Computing Science, 2010, pp. 41-52.
5. L.-C. Yu, et al., “Annotation and Verification of Sense Pools in OntoNotes,” Information Processing & Management, vol. 46, no. 4, 2010, pp. 436-447.
6. M. Gardiner and M. Dras, “Exploring Approaches to Discriminating among Near-Synonyms,” Proc. Australasian Language Technology Workshop, 2007, pp. 31-39.
7. K.W. Church and P. Hanks, “Word Association Norms, Mutual Information, and Lexicography,” Computational Linguistics, vol. 16, no. 1, 1991, pp. 22-29.
8. T. Wang and G. Hirst, “Near-Synonym Lexical Choice in Latent Semantic Space,” Proc. International Conference on Computational Linguistics, 2010, pp. 1182-1190.
9. G. Salton and M.J. McGill, Introduction to Modern Information Retrieval, McGraw-Hill Book Company, 1983, pp. 201-215.
10. R.A. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1999.
11. C.C. Lin, “A Study on Using the Vector Space Model for Computer-Assisted Near-Synonym Learning” (in Chinese), Master's thesis, Graduate Institute of Information Management, Yuan Ze University, 2010.
12. A. Hyvärinen and E. Oja, “Independent Component Analysis: Algorithms and Applications,” Neural Networks, 2000, pp. 411-430.
13. T. Honkela and A. Hyvärinen, “Linguistic Feature Extraction Using Independent Component Analysis,” Proc. Neural Networks, 2004.
14. T. Honkela, et al., “Emergence of Linguistic Representations by Independent Component Analysis,” Proc. Computer and Information Science, 2003.
15. R. Rapp, “Mining Text for Word Senses Using Independent Component Analysis,” Proc. SIAM International Conference, 2004.
16. Q. Pu and G.-W. Yang, “Short-Text Classification Based on ICA and LSA,” Proc. Advances in Neural Networks-ISNN, 2006, pp. 265-270.
17. X. Sevillano, et al., “ICA-Based Hierarchical Text Classification for Multi-Domain Text-to-Speech Synthesis,” Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, 2000, pp. 697-700.
18. X. Sevillano, et al., “Reliability in ICA-Based Text Classification,” Proc. Lecture Notes in Computer Science, 2004.
19. N. Wu and J. Zhang, “Factor-Analysis Based Anomaly Detection and Clustering,” Decision Support Systems, vol. 42, no. 1, 2006, pp. 375-389.


Electronic full text: access is restricted to the on-campus systems and IP range of the author's university.