National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

Detailed Record
Author: 曾議慶 (Yi-Ching Zeng)
Title: 建立線上消費者評論之有用意見模型 (Modeling the Helpful Opinion Mining of Online Consumer Reviews)
Advisor: 吳世弘 (Shih-Hung Wu)
Degree: Master's
Institution: 朝陽科技大學 (Chaoyang University of Technology)
Department: Master's Program, Department of Computer Science and Information Engineering (資訊工程系碩士班)
Discipline: Engineering
Field: Electrical and Computer Engineering
Document type: Academic thesis
Year of publication: 2013
Academic year of graduation: 101 (ROC calendar)
Language: English
Pages: 44
Keywords (Chinese, translated): public opinion, sentiment analysis, reason mining, text mining, opinion mining
Keywords (English): reason mining, text mining, opinion mining, sentiment analysis
Record statistics:
  • Cited: 0
  • Views: 664
  • Downloads: 79
  • Bookmarked: 2
Abstract (translated from the Chinese)

Text mining and opinion mining have emerged as active research directions in recent years, and determining the polarity of an opinion is a popular topic. However, discovering the reason why a user gives a positive or negative opinion is even more interesting. With the rapid growth of the Web, gathering information from other users' comments has become a necessary step in decision making for individuals and organizations.
In mining online consumer opinions, consumers want to read only the helpful reviews before deciding whether to buy a product, and from the reasons why people like or dislike a product, companies can also learn the real causes. To find the reason behind an opinion, the first step is to distinguish whether a sentence, beyond expressing sentiment, is "Helpful" or "Less-Helpful".
If a sentence contains a reason, we consider that the author wrote the review seriously. Our research can filter out noisy reviews, letting users and companies quickly understand why others like or dislike a given item.
The first step of the research is to build an experimental corpus. We manually collected Amazon reviews in eight categories: books, cameras, computers, food, movies, shoes, toys, and cell phones. We define three sentence types: "Positive Helpful", "Negative Helpful", and "Positive/Negative Less-Helpful". Connors's work proposed, through manual analysis, ten features of "Helpful" and "Less-Helpful" reviews; we implemented eight of them in support of our research goal, designing a corresponding feature-extraction program for each: "Pros and Cons", "Unigram of Product Usage Information", "Bigram of Product Usage Information", "Trigram of Product Usage Information", "Detail", "Comparisons", "Lengthy", and "Use of Ratings". With these features we can build a classifier to identify the better reviews.
The experimental results show an average accuracy of 73% over the three defined classes: 74% precision and 64% recall for Helpful negative, 82% precision and 77% recall for Helpful positive, and 87% precision and 73% recall for Less-Helpful. We also tested each of the eight features one by one to find which is most important; the experiments show that "Detail" is the most important: removing the "Detail" feature drops the accuracy to 38.569%.
Abstract (English)

In recent research on text mining and opinion mining, finding the polarity of an opinion is a hot topic. However, the reason why a user gives a positive or a negative opinion is even more interesting. In recent years, with the rapid growth of the Web, gathering information from other users' comments has become a necessary step in decision making for people and organizations.
To mine the reason behind an opinion, we distinguish sentences that, beyond showing emotion, are "Helpful" or "Less-Helpful". If a sentence contains a reason, we consider that the author wrote the review seriously. Our research can help users and companies quickly understand why people like or dislike something, and can remove noisy reviews.
Finding helpful reviews is important. Helpful reviews give readers ideas, while noisy reviews merely waste reading time; by reading only the helpful reviews, a reader both sees the reasons and understands them quickly.
The first step of the research is to create an experimental corpus. We collected Amazon reviews in eight classes: Books, Digital_Camera, Computer, FoodsDrinks, Movies, Shoes, Toys, and Cell-Phone. We manually define the sentence types as "Helpful" and "Less-Helpful". Connors's paper defines the "Helpful" and "Less-Helpful" features; it analyzes reviews with 10 features, but not automatically. We implement 8 of these features: "Pros and Cons", "Product Usage Information" (unigram, bigram, and trigram), "Detail", "Comparisons", "Lengthy", and "Use of Ratings".
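As a concrete illustration (not the thesis code), a few of these surface features can be sketched as simple extractors over a raw review. The function name `extract_features`, the `COMPARISON_WORDS` list, and every threshold below are illustrative assumptions, not values from the thesis:

```python
import re

# Minimal sketch of mapping one review to a small feature vector.
# All keyword lists and thresholds here are assumptions for illustration.
COMPARISON_WORDS = {"than", "compared", "versus", "vs", "better", "worse"}

def extract_features(review_text: str, star_rating: int) -> dict:
    words = re.findall(r"[A-Za-z']+", review_text.lower())
    sentences = [s for s in re.split(r"[.!?]+", review_text) if s.strip()]
    return {
        # "Pros and Cons": does the review explicitly mention pros/cons?
        "pros_and_cons": int("pros" in words or "cons" in words),
        # "Comparisons": presence of comparative wording.
        "comparisons": int(any(w in COMPARISON_WORDS for w in words)),
        # "Lengthy": crude length indicator (the 100-word cutoff is assumed).
        "lengthy": int(len(words) > 100),
        # "Detail": approximated here by the sentence count.
        "detail": len(sentences),
        # "Use of Ratings": the star rating attached to the review.
        "use_of_ratings": star_rating,
    }

example = ("The zoom is better than my old camera. "
           "Pros: light body. Cons: weak battery.")
print(extract_features(example, 4))
```

In the thesis, the extracted feature values for each review are then fed to the classifier described below.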
The overall accuracy on the three-class problem is about 73%. Helpful negative reviews can be found with 82% precision and 77% recall, and helpful positive reviews with 74% precision and 64% recall. Less-Helpful reviews can be filtered out automatically from all the consumer reviews with a high recall of about 87% and 73% precision. A second experiment identifies the most useful feature: "Detail" is the most important of all.
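The evaluation setup described above, a three-class SVM scored with 10-fold cross-validation, can be sketched as follows. The thesis uses LIBSVM; scikit-learn's `SVC` wraps libsvm, so it stands in here. The data is random stand-in data, not the Amazon corpus, so the printed accuracy is meaningless beyond demonstrating the pipeline:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Random stand-in data: 300 "reviews", each with 8 feature values,
# labeled 0/1/2 for the three sentence classes (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
y = rng.integers(0, 3, size=300)

# Three-class SVM (libsvm under the hood) scored by 10-fold cross-validation.
scores = cross_val_score(SVC(kernel="rbf"), X, y, cv=10)
print(f"mean 10-fold accuracy: {scores.mean():.3f}")
```

With the real labeled corpus in place of the random arrays, `scores.mean()` corresponds to the average accuracy the thesis reports.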
Table of Contents

Abstract (in Chinese)
ABSTRACT
Acknowledgements
Table of Contents
List of Tables
List of Figures
CHAPTER 1: INTRODUCTION
1.1 MOTIVATION
1.2 METHOD
1.3 THESIS ORGANIZATION
CHAPTER 2: RELATED WORKS
CHAPTER 3: METHODOLOGY
3.1 DATA COLLECTION
3.2 CLASSIFICATION FEATURES
3.3 SUPPORT VECTOR MACHINE (SVM)
3.4 THREE-CLASS CLASSIFICATION PROBLEM
3.5 10-FOLD CROSS VALIDATION
CHAPTER 4: EXPERIMENT
4.1 EXPERIMENT DESIGN
4.2 EXPERIMENT RESULT
CHAPTER 5: CONCLUSION AND FUTURE WORKS
5.1 CONCLUSION
5.2 FUTURE WORKS
REFERENCES

List of Tables

Table 1: The 15 reasons that people think a customer review is helpful and the 10 reasons for Less-Helpful (Connors et al., 2011) [9]
Table 2: The eight products used to define the LSC threshold in the first experiment
Table 3: Summary of our data collection: 8 products and 8,690 reviews
Table 4: Eight features used in
Table 5: Eight features used in our system
Table 6: Testing dataset, including the three classes
Table 7: First experiment with the LSC threshold over "1.039", for the eight product categories
Table 8: First experiment with the LSC threshold over "1.5", for the eight product categories
Table 9: First experiment with the LSC threshold over "2.0", for the eight product categories
Table 10: The size of the three classes with the threshold "1.039"
Table 11: The size of the three classes with the threshold "1.5"
Table 12: The size of the three classes with the threshold "2.0"
Table 13: The average accuracy of each data set in the ten-fold cross validation
Table 14: When LSC is 1.039: the accuracy of each fold in the ten-fold cross validation and the average accuracy
Table 15: When LSC is 1.5: the accuracy of each fold in the ten-fold cross validation and the average accuracy
Table 16: When LSC is 2.0: the accuracy of each fold in the ten-fold cross validation and the average accuracy
Table 17: The confusion matrix (LSC threshold over 1.039)
Table 18: The confusion matrix (LSC threshold over 1.5)
Table 19: The confusion matrix (LSC threshold over 2.0)
Table 20: Feature analysis

List of Figures

Figure 1: Helpfulness mining and sentiment analysis
Figure 2: An Amazon customer review example
Figure 3: System architecture
Figures 4–11: Stars vs. helpfulness distribution of our data collection, one figure per product category. The x-axis is the number of stars of customer reviews; the y-axis is the helpfulness score LSC.
Figure 12: Example of a review
Figure 13: Stars vs. helpfulness distribution of our data collection. The x-axis is the number of stars of customer reviews; the y-axis is the helpfulness score LSC. The LSC is over 1.039.
Figure 14: Stars vs. helpfulness distribution of our data collection. The x-axis is the number of stars of customer reviews; the y-axis is the helpfulness score LSC. The LSC is over 1.5.
Figure 15: Stars vs. helpfulness distribution of our data collection. The x-axis is the number of stars of customer reviews; the y-axis is the helpfulness score LSC. The LSC is over 2.0.
Figure 16: A false positive example
Figure 17: An error analysis example
References

[1] Meng-Xiang Li, Liqiang Huang, Chuan-Hoo Tan, and Kwok-Kee Wei, "Assessing the Helpfulness of Online Product Review: A Progressive Experimental Approach", In Proceedings of PACIS, 2011.
[2] Minqing Hu and Bing Liu, "Mining opinion features in customer reviews", In Proceedings of the 19th National Conference on Artificial Intelligence (AAAI '04), Anthony G. Cohn (Ed.), AAAI Press, pp. 755–760, 2004.
[3] Soo-Min Kim, Patrick Pantel, Tim Chklovski, and Marco Pennacchiotti, "Automatically Assessing Review Helpfulness", In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2006.
[4] Soo-Min Kim and Eduard Hovy, "Extracting Opinions Expressed in Online News Media Text with Opinion Holders and Topics", In Proceedings of COLING-ACL 2006, 2006.
[5] Susan M. Mudambi and David Schuff, "What Makes a Helpful Online Review? A Study of Customer Reviews on Amazon.com", MIS Quarterly, Vol. 34, No. 1, pp. 185–200, 2010.
[6] Samaneh Moghaddam, Mohsen Jamali, and Martin Ester, "Review Recommendation: Personalized Prediction of the Quality of Online Reviews", In Proceedings of the 20th ACM International Conference on Information and Knowledge Management, pp. 2249–2252, 2010.
[7] Yue Lu, Panayiotis Tsaparas, Alexandros Ntoulas, and Livia Polanyi, "Exploiting Social Context for Review Quality Prediction", In Proceedings of the 19th International Conference on World Wide Web, pp. 691–700, 2010.
[8] Wenting Xiong and Diane Litman, "Automatically Predicting Peer-Review Helpfulness", In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Short Papers, pp. 502–507, 2011.
[9] Laura Connors, Susan M. Mudambi, and David Schuff, "Is it the Review or the Reviewer? A Multi-Method Approach to Determine the Antecedents of Online Review Helpfulness", In Proceedings of the 2011 Hawaii International Conference on System Sciences (HICSS), January 2011.
[10] Changhua Yang, Kevin Hsin-Yih Lin, and Hsin-Hsi Chen, "Emotion Classification Using Web Blog Corpora", IEEE Computer Society, Washington, DC, USA, 2007.
[11] NTUSD (National Taiwan University Semantic Dictionary). Available: http://nlg18.csie.ntu.edu.tw:8080/opinion/pub1.html
[12] Malik Muhammad Saad Missen, Mohand Boughanem, and Guillaume Cabanac, "Opinion Finding in Blogs: A Passage-Based Language Modeling Approach", RIAO '10: Adaptivity, Personalization and Fusion of Heterogeneous Information, 2010.
[13] NTCIR. (Online). Available: http://research.nii.ac.jp/ntcir/index-en.html
[14] Wan-Chi Huang, Meng-Chun Lin, and Shih-Hung Wu, "Opinion Sentences Extraction and Polarity Classification Using Automatically Generated Templates", In Proceedings of the NTCIR-8 Workshop Meeting, June 15–18, 2010.
[15] Kang Liu and Jun Zhao, "NLPR at Multilingual Opinion Analysis Task in NTCIR7", In Proceedings of the NTCIR-7 Workshop Meeting, December 16–19, 2008.
[16] Lun-Wei Ku, I-Chien Liu, Chia-Ying Lee, Kuan-hua Chen, and Hsin-Hsi Chen, "Sentence-Level Opinion Analysis by CopeOpi in NTCIR-7", In Proceedings of the NTCIR-7 Workshop Meeting, December 16–19, 2008.
[17] Yejun Wu and Douglas W. Oard, "NTCIR-6 at Maryland: Chinese Opinion Analysis Pilot Task", In Proceedings of the NTCIR-6 Workshop Meeting, May 15–18, 2007.
[18] Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan, "Thumbs up? Sentiment classification using machine learning techniques", In Proceedings of the EMNLP Conference, 2002.
[19] Bo Pang and Lillian Lee, "A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts", In Proceedings of ACL, 2004.
[20] Craig Macdonald, Iadh Ounis, and Ian Soboroff, "Is spam an issue for opinionated blog post search?", SIGIR '09: Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2009.
[21] Peter Turney, "Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews", In Proceedings of the 40th ACL Conference, 2002.
[22] Jonathon Read, "Using Emoticons to reduce Dependency in Machine Learning Techniques for Sentiment Classification", In Proceedings of the ACL Student Research Workshop, pp. 43–48, 2005.
[23] Hung-Yu Kao and Zi-Yu Lin, "A Categorized Sentiment Analysis of Chinese Reviews by Mining Dependency in Product Features and Opinions from Blogs", IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, 2010.
[24] Ruifeng Xu, Kam-Fai Wong, Qin Lu, Yunqing Xia, and Wenjie Li, "Learning Knowledge from Relevant Webpage for Opinion Analysis", IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, 2008.
[25] Nikos Engonopoulos, Angeliki Lazaridou, and Georgios Paliouras, "ELS: A Word-Level Method for Entity-Level Sentiment Analysis", WIMS '11, 2011.
[26] Yahoo Movie. Available: http://tw.movie.yahoo.com/
[27] Daniel E. O'Leary, "Blog mining-review and extensions: From each according to his opinion", Decision Support Systems, Vol. 51, 2011.
[28] LIBSVM. Available: http://www.csie.ntu.edu.tw/~cjlin/libsvm/
[29] Y. Choi, C. Cardie, E. Riloff, and S. Patwardhan, "Identifying sources of opinions with conditional random fields and extraction patterns", In Proceedings of HLT/EMNLP, 2005.
[30] Yejin Choi, Eric Breck, and Claire Cardie, "Joint extraction of entities and relations for opinion recognition", In Proceedings of EMNLP, 2006.
[31] Rushdi Saleh, Martin Valdivia, Montejo-Raez, and Urena-Lopez, "Experiments with SVM to classify opinions in different domains", Expert Systems with Applications, 2011.