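Graduate student: 林琮凱 (Cong-Kai Lin)
Thesis title: 以分類學為基礎之跨領域情緒分析方法研究
Thesis title (English): Taxonomy-based Cross Domain Sentiment Classification
Advisor: 陳信希 (Hsin-Hsi Chen)
Oral defense committee: 張俊盛 (Jyun-Sheng Chang), 林川傑 (Chuan-Jie Lin), 古倫維 (Lun-Wei Ku)
Oral defense date: 2013-07-15
Degree: Master's
Institution: National Taiwan University (國立臺灣大學)
Department: Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2013
Academic year of graduation: 101 (ROC calendar)
Language: Chinese
Pages: 71
Keywords: cross-domain sentiment classification; taxonomy; ensemble modeling; regression model; transfer learning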
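  In an era in which the Internet is flourishing, people often share their life experiences and opinions on online platforms, and these serve as a useful reference for others with similar needs. Sentiment classification aims to use this accumulated experience to predict whether a document's sentiment polarity is positive or negative. It has many practical applications and has attracted considerable attention in recent years. However, when the available resources come from a domain that differs greatly from that of the data to be labeled, for example when data from the electrical appliances domain is used to predict sentiment in the books domain, classification performance can drop sharply. Sentiment classification in which the source and target domains differ is called cross-domain sentiment classification.
  Many studies on cross-domain sentiment classification have been proposed in recent years. Previous work treats a domain as a single, indivisible category, which does not reflect real-world conditions. Many online shopping sites, such as Amazon and eBay, present their product categories as a taxonomy-based tree. In this thesis, we depart from the earlier coarse-grained view of domains and use the tree-structured classification as the basis for examining in-domain and cross-domain sentiment classification. We first analyze the tree-structured data in detail and find that diversity in the training data helps cross-domain sentiment prediction. Building on this finding, we propose the taxonomy-based model combination (TBMC) algorithm, which adjusts model weights according to the tree structure and combines multiple models to address cross-domain sentiment classification. For the problem of source selection under the tree-structured classification, we also propose the taxonomy-based regression model (TBRM) to help select the best source node.
  Experimental results show that TBMC gives better performance in cross-domain sentiment prediction, and that TBRM outperforms regression models that do not use tree information when selecting the best source. Finally, we combine the two methods and pair them with transfer learning to achieve better results.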


In the Internet era, people are often willing to share their experiences on many subjects, and these form a good reference for others with similar needs. Sentiment classification aims to use such past experiences to predict the polarity of new documents. It has attracted much attention in recent years because of its many applications. One challenging issue is that when the source and target domains differ, for example when knowledge from the electrical appliances domain is used to predict polarity in the book domain, classification performance may decrease drastically. Sentiment classification across different domains is called cross-domain sentiment classification.
In recent years, many cross-domain sentiment classification methods have been proposed. They treat a domain as a single, undivided set of training instances. However, many online shopping websites, such as Amazon and eBay, organize their data as a taxonomy. In this thesis, we use the taxonomy as a basis for studying the in-domain and cross-domain sentiment classification problems. We first show that diversity in the training data is indeed beneficial for cross-domain prediction. We then propose a taxonomy-based model combination algorithm (TBMC), which combines several models and reweights them using tree-structured information. We also propose a taxonomy-based regression model (TBRM) to help select the best source node.
The experimental results show that TBMC is effective for the cross-domain sentiment classification problem, and that TBRM performs better on the best-source selection problem than regression models that do not consider taxonomy information. Finally, we combine the two methods and integrate a transfer learning method to reach better performance.
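As a rough illustration of combining models with weights taken from a tree structure, the sketch below averages per-source sentiment scores, giving more weight to sources that sit closer to the target node in the taxonomy. The abstract does not state the weighting rule actually used, so the exponential decay over tree distance, the toy taxonomy `PARENT`, and the functions `tree_distance` and `combine` are all assumptions made for this example.

```python
from math import exp

# Hypothetical taxonomy (child -> parent); None marks the root.
PARENT = {
    "root": None,
    "electronics": "root",
    "books": "root",
    "cameras": "electronics",
    "audio": "electronics",
    "fiction": "books",
}

def path_to_root(node):
    """Return the nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path

def tree_distance(a, b):
    """Number of edges on the path between nodes a and b."""
    steps_from_b = {n: i for i, n in enumerate(path_to_root(b))}
    for steps_from_a, n in enumerate(path_to_root(a)):
        if n in steps_from_b:  # first common ancestor
            return steps_from_a + steps_from_b[n]
    raise ValueError("nodes are not in the same tree")

def combine(scores, target, decay=1.0):
    """Weighted average of per-source scores.

    Assumed rule: weight = exp(-decay * tree distance to the target).
    Scores are signed: above 0 means positive sentiment.
    """
    weights = {s: exp(-decay * tree_distance(s, target)) for s in scores}
    total = sum(weights.values())
    return sum(weights[s] * scores[s] for s in scores) / total

# Two source models disagree about a review in the "cameras" node.
scores = {"audio": 0.8, "fiction": -0.6}
combined = combine(scores, "cameras")
```

Here the "audio" model is two edges from "cameras" and the "fiction" model is four, so the nearer model dominates the combined score. Setting `decay=0.0` reduces the rule to a plain unweighted average.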
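One plausible reading of regression-based source selection is to fit a regression that predicts cross-domain accuracy from a taxonomy feature, then choose the candidate source with the highest predicted accuracy. The sketch below uses tree distance as the only feature and ordinary least squares. The feature choice, the functions `fit_line` and `pick_source`, and the numbers in `observed` are invented for illustration and are not taken from the thesis.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Made-up training pairs: (tree distance from source to target, accuracy).
observed = [(0, 0.86), (2, 0.80), (2, 0.78), (4, 0.70), (4, 0.72)]
intercept, slope = fit_line(
    [d for d, _ in observed],
    [a for _, a in observed],
)

def pick_source(candidate_distances):
    """Return the candidate source with the highest predicted accuracy.

    `candidate_distances` maps a source node name to its tree distance
    from the target node.
    """
    predicted = {s: intercept + slope * d for s, d in candidate_distances.items()}
    return max(predicted, key=predicted.get)

best = pick_source({"audio": 2, "fiction": 4})
```

With a single distance feature the fitted slope is negative for this toy data, so the nearest candidate is selected. A fuller model would add features that do not depend on the tree, which is where a comparison against regression without taxonomy information becomes meaningful.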


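Acknowledgements
Chinese Abstract
English Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Opinion Mining and Cross-Domain Sentiment Classification
1.2 Research Motivation and Objectives
1.3 Thesis Organization
Chapter 2 Related Work
2.1 Cross-Domain Sentiment Classification
2.2 Best Source Domain Selection
Chapter 3 Experimental Resources and Sentiment Analysis over the Tree-Structured Classification
3.1 Experimental Resources and Statistics
3.2 In-Domain Sentiment Analysis
3.3 Cross-Domain Sentiment Analysis
3.4 Summary of the Analysis
Chapter 4 Methods
4.1 Taxonomy-Based Model Combination (TBMC)
4.2 Taxonomy-Based Regression Model (TBRM)
Chapter 5 Experimental Results and Discussion
5.1 TBMC Experimental Results
5.1.1 TBMC Results with a Single Source Domain
5.1.2 TBMC Results with Multiple Sources
5.1.3 TBMC Settings
5.2 TBRM Experimental Results and Extended Applications
5.2.1 TBRM Experimental Results
5.2.2 Extended Applications of TBRM
5.3 Comparison and Discussion of TBMC and TBRM
5.4 Combining TBRM-ReweightedSum with Transfer Learning
Chapter 6 Conclusions and Future Work
References
Appendix A Performance Comparison of First-Level Sources and the Best Second- and Third-Level Sources
Appendix B Parameter-Performance Relationship of TBMC with and without Weight Adjustment
Appendix C Performance Comparison of TBMC with and without Weight Adjustment
Appendix D Parameter-Performance Relationship of the Three Extended TBRM Models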




