National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

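Author: 陳軒正
Author (English): Shiuan-Jeng Chen
Title: 以SentiWordNet為基礎建構具領域特性之情感詞彙庫
Title (English): Building a domain-oriented sentiment lexicon based on SentiWordNet
Advisor: 洪智力
Advisor (English): Chih-Li Hung
Degree: Master's
Institution: Chung Yuan Christian University (中原大學)
Department: Graduate Institute of Information Management
Discipline: Computer Science
Category: General Computer Science
Thesis type: Academic thesis
Year of publication: 2013
Academic year of graduation: 101 (ROC calendar)
Language: Chinese
Pages: 62
Keywords (Chinese): 情感分析 (sentiment analysis); SentiWordNet; 字義辨識 (word sense disambiguation)
Keywords (English): Sentiment Analysis; SentiWordNet; Word Sense Disambiguation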
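With the development of Web 2.0, sentiment analysis has become a popular research topic in recent years. Its purpose is to automatically extract the sentiment attitudes implied by the words in an electronic text and thereby identify the sentiment orientation the text is meant to express. SentiWordNet is an important sentiment lexical resource for this task. Built on WordNet, it assigns each WordNet synset three scores representing positive, negative, and neutral sentiment polarity, so a text can be classified by looking up the scores the lexicon assigns to its words. Although SentiWordNet supports sentiment analysis, it has a weakness in word sense disambiguation: because words are commonly polysemous, choosing the correct sense of each word affects the result, yet few previous studies that use SentiWordNet have examined the effect of word sense disambiguation on sentiment analysis. This study therefore takes SentiWordNet as its basis and uses word sense disambiguation methods to build a domain-oriented sentiment sub-lexicon that helps select the senses of polysemous words. By mitigating SentiWordNet's word sense disambiguation problem, it aims to improve sentiment analysis. Experimental results confirm that, compared with using SentiWordNet directly, using the domain-oriented sentiment lexicon built in this study does improve classification accuracy.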
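
The scoring scheme described above can be illustrated with a minimal sketch. It assumes a tiny hand-made lexicon (`TOY_LEXICON`) whose entries and scores are invented for illustration and are not taken from SentiWordNet. The `classify` function and its first-sense heuristic are likewise assumptions, not the method of this thesis. The only property borrowed from SentiWordNet is that the positive, negative, and objective scores of a sense sum to 1.

```python
from typing import Dict, List, Tuple

# A sense is a (positive, negative) score pair; the objective score is implied.
Sense = Tuple[float, float]

# Hypothetical toy lexicon: (lemma, POS) -> senses in listed order.
# All scores here are invented for illustration only.
TOY_LEXICON: Dict[Tuple[str, str], List[Sense]] = {
    ("good", "a"): [(0.75, 0.0), (0.5, 0.0)],
    ("bad", "a"): [(0.0, 0.625), (0.0, 0.75)],
    ("cold", "a"): [(0.0, 0.25), (0.0, 0.75)],
}


def objective(sense: Sense) -> float:
    """Objective score of a sense: whatever is left after positive and negative."""
    pos, neg = sense
    return 1.0 - pos - neg


def classify(tokens: List[Tuple[str, str]]) -> str:
    """Label a document by summing the scores of each token's first sense."""
    pos_total = 0.0
    neg_total = 0.0
    for token in tokens:
        senses = TOY_LEXICON.get(token)
        if not senses:
            continue  # word is not covered by the lexicon
        pos, neg = senses[0]  # first-sense heuristic (an assumption)
        pos_total += pos
        neg_total += neg
    if pos_total > neg_total:
        return "positive"
    if neg_total > pos_total:
        return "negative"
    return "neutral"
```

Note how the result for a polysemous word such as "cold" depends entirely on which sense is read from the lexicon. This is the word sense disambiguation problem the abstract refers to.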


With the development of Web 2.0, sentiment analysis has become a popular research topic in recent years. The goal of sentiment analysis is to use automated methods to extract implicit sentiment attitudes and to assign articles to the correct sentiment orientation. SentiWordNet, which is based on WordNet, is an important lexical resource for sentiment analysis. It gives each synset positive, negative, and objective sentiment scores, and sentiment analysis can classify a digital text using these scores. Although SentiWordNet helps the process of sentiment analysis, it has a weakness in word sense disambiguation: a word can have multiple senses, and the sense chosen affects the result of sentiment analysis. In the literature, few scholars have discussed this problem in the field of sentiment analysis. This thesis builds a domain-oriented sentiment lexicon based on SentiWordNet to choose the sense of each word in sentiment analysis, improve word sense disambiguation with SentiWordNet, and thereby increase the accuracy of sentiment analysis.
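
One simple way to picture domain-oriented sense selection is a Lesk-style gloss overlap: prefer the sense whose dictionary gloss shares the most words with documents from the target domain. The sketch below is a swapped-in simplification for illustration, not the procedure used in this thesis. The names `Sense`, `gloss_overlap`, and `pick_domain_sense`, and all glosses, scores, and example documents, are hypothetical.

```python
from collections import Counter
from typing import List, NamedTuple


class Sense(NamedTuple):
    gloss: str   # short definition of the sense (invented here)
    pos: float   # positive score (invented here)
    neg: float   # negative score (invented here)


def gloss_overlap(gloss: str, domain_counts: Counter) -> int:
    """Count how often the distinct words of a gloss occur in the domain text."""
    return sum(domain_counts[word] for word in set(gloss.lower().split()))


def pick_domain_sense(senses: List[Sense], domain_docs: List[str]) -> Sense:
    """Return the sense whose gloss overlaps most with the domain documents.

    Ties, including the case of no overlap at all, fall back to the
    earliest-listed sense, because max() keeps the first maximum it sees.
    """
    domain_counts = Counter(
        word for doc in domain_docs for word in doc.lower().split()
    )
    return max(senses, key=lambda s: gloss_overlap(s.gloss, domain_counts))


# Hypothetical senses of the adjective "cold".
cold_senses = [
    Sense("having a low temperature", 0.0, 0.25),
    Sense("lacking warmth of feeling unfriendly", 0.0, 0.75),
]

# Hypothetical documents from a service-review domain.
review_docs = [
    "the staff was unfriendly and the service slow",
    "no warmth from the staff",
]

chosen = pick_domain_sense(cold_senses, review_docs)
```

In this toy example the second sense is chosen, because "warmth" and "unfriendly" both occur in the review documents while none of the words in the first gloss do. A domain-specific sub-lexicon could then store only the chosen sense and its scores for each word.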


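Table of Contents
Abstract (Chinese)
Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1. Introduction
1.1 Research Background and Motivation
1.2 Research Questions
1.3 Research Objectives
1.4 Research Scope
1.5 Thesis Structure
Chapter 2. Literature Review
2.1 Sentiment Analysis
2.1.1 Subjectivity Classification
2.1.2 Sentiment Classification
2.1.3 Sentiment Analysis Process
2.2 Word Sense Disambiguation
2.2.1 Supervised Disambiguation
2.2.2 Unsupervised Disambiguation
2.2.3 Knowledge-Based Disambiguation
2.3 The WordNet Lexicon
2.4 The SentiWordNet Sentiment Lexicon
2.5 Summary
Chapter 3. Research Method
3.1 Research Framework
3.2 Document Preprocessing
3.2.1 Electronic Texts
3.2.2 Sentence Segmentation
3.2.3 Part-of-Speech Tagging
3.2.4 Lemmatization
3.2.5 Stop Word Removal
3.3 Sentiment Sub-Lexicon Construction Module
3.3.1 Feature Term Extraction
3.3.2 Computing the Domain Relevance of Word Senses
3.3.3 Ranking Word Senses
3.3.4 Threshold Filtering
3.3.5 Building the Domain-Oriented Sentiment Sub-Lexicon
3.4 Sentiment Classification of Documents
3.5 Evaluation of Document Sentiment Classification
3.6 Summary
Chapter 4. Experimental Results and Evaluation
4.1 Experimental Setup
4.1.1 Experimental and Validation Datasets
4.1.2 Feature Term Extraction
4.1.3 Building the Sentiment Sub-Lexicons
4.1.4 Evaluation Method
4.2 Experimental Results
4.2.1 Experiment 1
4.2.2 Experiment 2
Chapter 5. Conclusions and Future Work
5.1 Conclusions
5.2 Contributions
5.3 Future Research Directions
References
Appendix: List of Stop Words and Noise Words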

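List of Figures
Figure 1-1 Research process flowchart
Figure 2-1 Sentiment analysis flowchart
Figure 2-2 SVM illustration
Figure 3-1 Research process flowchart
Figure 3-2 Document preprocessing flowchart
Figure 3-3 Flowchart of the sentiment sub-lexicon construction module
Figure 3-4 Flowchart for computing the similarity between word senses and documents
Figure 3-5 Illustration of building the sentiment sub-lexicon, using "Suck" as an example
Figure 4-1 Comparison of results for Experiment 1, Part 1
Figure 4-2 Comparison of results for Experiment 1, Part 2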

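List of Tables
Table 3-1 Brill Tagger part-of-speech tag reference table
Table 3-2 Part-of-speech tag conversion table
Table 3-3 Excerpt of senses and example sentences for the adjective "bad" in SentiWordNet
Table 3-4 Document sentiment vector model
Table 4-1 Sentiment sub-lexicon combination reference table
Table 4-2 Data for Experiment 1, Part 1
Table 4-3 Classification results for Experiment 1, Part 1
Table 4-4 Data for Experiment 1, Part 2
Table 4-5 Classification results for Experiment 1, Part 2
Table 4-6 Classification results for Experiment 2, Part 1
Table 4-7 Paired-samples t-test for Experiment 2, Part 1
Table 4-8 Classification results for Experiment 2, Part 2
Table 4-9 Paired-samples t-test for Experiment 2, Part 2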


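Electronic full text (access to the full text of this thesis is restricted to the on-campus systems and IP range of the author's university)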