
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

Author: 江彥孟 (CHIANG, YAN-MENG)
Thesis title: 用深度學習於網路評論的分析 (An empirical study on detecting fake reviews using deep learning and machine learning techniques)
Advisor: 鄭麗珍 (Li-Chen Cheng)
Oral defense committee: 沈錳坤 (Shan, Man-Kwan); 李永銘 (Lee, Yung-Ming); 游政憲 (Cheng-Hsien Yu)
Oral defense date: 2018-06-27
Degree: Master's
Institution: 東吳大學 (Soochow University)
Department: 資訊管理學系 (Department of Information Management)
Discipline: Computer science
Field: General computer science
Thesis type: Academic thesis
Publication year: 2018
Graduation academic year: 106
Language: Chinese
Pages: 53
Keywords: fake review; text mining; deep learning
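Abstract (translated from Chinese): Online businesses account for a growing share of the market economy, which gives online reviews considerable influence and importance. Before buying a product, more and more users turn to forums to look up opinions on it or to share their experience of using it. In practice, however, fake reviews are widespread, and customers cannot clearly tell genuine reviews from fake ones; fake reviews online may influence customers' purchase decisions. This study therefore proposes a model that can identify fake reviews. The forum data used in this study come from Mobile01, a well-known Taiwanese forum, and text mining techniques are used to process the unstructured data, including the bag-of-words model, latent semantic analysis, and word2vec. The study then trains classifiers to identify fake reviews, using SVM as a machine learning method and, as deep learning methods, a deep neural network (DNN), a convolutional neural network (CNN), and long short-term memory (LSTM). Finally, the study selects the three best-performing models to find reviews suspected of being written by paid writers, which enlarges the sample of paid-writer posts available to future research on this topic.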
Abstract (English): The growing share of online businesses in the market economy has increased the influence and importance of online reviews. Before making a purchase, users are increasingly inclined to browse online forums where others share post-purchase experiences of products and services. However, many reviews in the real world are fake, and consumers cannot reliably tell authentic reviews from fake ones. Fake online shopping reviews harm consumers, who might buy misrepresented products. We therefore propose a framework for detecting fake reviews. In this study, we focus on data from the web forum Mobile01 and use text mining to process the textual data, with Bag-of-Words, Latent Semantic Analysis (LSA), and word2vec as word representations. Next, we train classifiers to detect fake reviews: SVM, Deep Neural Network (DNN), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM). Finally, we let the three best-performing models vote, and we hope that the fake review samples identified this way can serve as a reference for future research.
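As a rough illustration of two steps the abstract mentions (a bag-of-words representation and a vote among three models), the following minimal sketch may help. It is not the thesis's implementation: the function names, the toy tokens, the 0/1 label encoding, and the simple-majority rule are all assumptions made for illustration, and real Mobile01 posts would first need Chinese word segmentation and stop-word removal.

```python
from collections import Counter


def build_vocabulary(tokenized_docs):
    """Map each distinct token to a column index, in first-seen order."""
    vocab = {}
    for tokens in tokenized_docs:
        for tok in tokens:
            if tok not in vocab:
                vocab[tok] = len(vocab)
    return vocab


def bag_of_words(tokens, vocab):
    """Return a term-count vector for one document; unknown tokens are ignored."""
    counts = Counter(tok for tok in tokens if tok in vocab)
    vec = [0] * len(vocab)
    for tok, n in counts.items():
        vec[vocab[tok]] = n
    return vec


def majority_vote(predictions):
    """Combine 0/1 labels from an odd number of models: 1 if most models say 1."""
    return 1 if sum(predictions) * 2 > len(predictions) else 0


# Hypothetical, already-segmented posts (assumption: tokens are given).
docs = [["great", "phone", "great", "price"], ["battery", "died", "fast"]]
vocab = build_vocabulary(docs)
vectors = [bag_of_words(d, vocab) for d in docs]

# Hypothetical outputs of three models for one post
# (assumption: 1 = suspected fake, 0 = genuine).
votes = [1, 0, 1]
label = majority_vote(votes)
```

In this sketch each post becomes a vector of token counts that a classifier could consume, and a post is flagged only when at least two of the three models agree. The thesis may well combine its models differently.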
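Table of contents:
List of Tables
List of Figures
1. Introduction
2. Literature Review
2.1 Fake Reviews
2.2 Text Mining
2.3 Document Classification
2.4 Deep Learning
2.4.1 Convolutional Neural Network (CNN)
2.4.2 Long Short-Term Memory (LSTM)
3. Research Method
3.1 Research Framework
3.2 Data Source
3.3 Data Preprocessing
3.3.1 Word Segmentation and Stop-Word Removal
3.3.2 Vector Representation
3.4 Model Construction
3.5 Model Evaluation
4. Experimental Design
4.1 Experimental Data
4.2 Data Splitting
4.3 Building Classification Models from Post Content
4.3.1 Experiment 1: Results of BOW with Different Classifiers
4.3.2 Experiment 2: Results of LSA with Different Classifiers
4.3.3 Experiment 3: Results of word2vec with Different Classifiers
4.3.4 Summary
4.4 Using Only Quantitative Indicators
4.4.1 Experiment 4: Results of Quantitative Indicators with Different Classifiers
4.5 Combining Post Content with Quantitative Indicators
4.6 Using Different Classifiers to Pick Out Posts Suspected of Being Written by Paid Writers
5. Conclusion and Future Research
References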


