Student: 陳柏宏 (CHEN, PO-HONG)
Title: 假新聞的文字分析與辨識
Title (English): Text Analysis and Detection on Fake News
Advisor: 黃胤傅 (HUANG, YIN-FU)
Committee members: 賈坤芳 (JEA, KUEN-FANG), 楊東麟 (YANG, DON-LIN)
Oral defense date: 2018-07-02
Degree: Master's
Institution: 國立雲林科技大學 (National Yunlin University of Science and Technology)
Department: 資訊工程系 (Computer Science and Information Engineering)
Discipline: Engineering
Field: Electrical and Computer Engineering
Document type: Academic thesis
Year of publication: 2018
Graduating academic year: 106
Language: English
Number of pages: 36
Keywords (Chinese): 深度學習、假新聞、自然語言處理、文字探勘
Keywords (English): deep learning, fake news, natural language processing, text mining
Cited by: 0
Views: 1342
Downloads: 67
Bookmarked: 1
Abstract: In general, the features of fake news are almost the same as those of real news, so it is not easy to identify them. In this thesis, we propose a fake news detection system based on deep learning models. First, news articles are preprocessed and analyzed according to different training models. Then, an ensemble learning model combining four models (embedding LSTM, depth LSTM, LIWC CNN, and N-gram CNN) is proposed for fake news detection. In addition, to detect fake news with high accuracy, the weights of the ensemble learning model are optimized using the Self-Adaptive Harmony Search (SAHS) algorithm. In the experiments, we verify that the proposed model outperforms state-of-the-art methods, with the highest accuracy of 99.4%. We also investigate the cross-domain intangibility issue and reach an accuracy of at most 72.3%. Finally, we believe there is still room for improving the ensemble learning model on cross-domain prediction.
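As a rough illustration of the weighted ensemble described in the abstract, the sketch below combines per-article fake-news probabilities from four base models into a single weighted vote. It is a minimal sketch, not the thesis's implementation: the function name, the example weights, and the 0.5 decision threshold are illustrative assumptions, and in the thesis the weights would come from the SAHS optimizer rather than being set by hand.

import numpy as np

def ensemble_predict(probabilities, weights, threshold=0.5):
    """Weighted-average ensemble over per-model fake-news probabilities.

    probabilities: shape (n_models, n_articles); one row per base model
        (e.g. embedding LSTM, depth LSTM, LIWC CNN, N-gram CNN).
    weights: shape (n_models,); normalized here so they sum to 1. In the
        thesis these weights are tuned by a harmony-search optimizer;
        here they are simply given by hand for illustration.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    combined = w @ np.asarray(probabilities, dtype=float)
    return (combined >= threshold).astype(int)  # 1 = predicted fake, 0 = real

# Illustrative call with made-up scores for three articles.
probs = [
    [0.90, 0.20, 0.55],  # embedding LSTM
    [0.85, 0.30, 0.60],  # depth LSTM
    [0.70, 0.10, 0.40],  # LIWC CNN
    [0.95, 0.25, 0.50],  # N-gram CNN
]
print(ensemble_predict(probs, weights=[0.3, 0.3, 0.2, 0.2]))  # prints [1 0 1]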
Table of contents:
Abstract (Chinese) i
Abstract (English) ii
Acknowledgements iii
Table of Contents iv
List of Tables v
List of Figures vi
1 Introduction 1
2 System Overviews 3
3 Preprocessing and Feature Analysis 4
3.1 Word Embedding 4
3.2 Grammar Analysis 4
3.2.1 Mean 5
3.2.2 Q25 5
3.2.3 Q50 6
3.2.4 Q75 6
3.2.5 Max 7
3.2.6 Min 7
3.2.7 Standard Deviation 8
3.3 Text Analysis Using LIWC 8
3.4 N-grams 9
4 Ensemble Learning Model and Optimizing Weights 12
4.1 Embedding LSTM 12
4.2 Depth LSTM 13
4.3 LIWC CNN 15
4.4 N-gram CNN 16
4.5 Optimizing Weights 18
5 Experiments 20
5.1 Datasets 20
5.1.1 Datasets Used in the Previous Research 20
5.1.2 Datasets Used to Investigate the Cross-domain Intangibility 21
5.2 Comparison with the Other Existing Methods 21
5.3 Cross-domain Intangibility 22
6 Conclusions 24
References 25

[1]K. Aderghal, J. Benois-Pineau, K. Afdel, and C. Gwenaëlle, “FuseMe: classification of sMRI images by fusion of deep CNNs in 2D+ϵ projections,” Proc. the 15th International Workshop on Content-Based Multimedia Indexing, Article No. 34, 2017.
[2]H. Ahmed, I. Traore, and S. Saad, “Detection of online fake news using N-Gram analysis and machine learning techniques,” Proc. International Conference on Intelligent, Secure, and Dependable Systems in Distributed and Cloud Environments, pp. 127-138, 2017.
[3]N. C. Camgoz, S. Hadfield, O. Koller, and R. Bowden, “Using convolutional 3D neural networks for user-independent continuous gesture recognition,” Proc. the 23rd International Conference on Pattern Recognition, pp. 49-54, 2016.
[4]W. B. Cavnar and J. M. Trenkle, “N-Gram-based text categorization,” Proc. the 3rd Annual Symposium on Document Analysis and Information Retrieval, pp. 161-175, 1994.
[5]H. Ceylan and H. Ceylan, “A hybrid harmony search and TRANSYT hill climbing algorithm for signalized stochastic equilibrium transportation networks,” Transportation Research Part C: Emerging Technologies, Vol. 25, pp. 152-167, 2012.
[6]J. Fourie, S. Mills, and R. Green, “Harmony filter: a robust visual tracking system using the improved harmony search algorithm,” Image and Vision Computing, Vol. 28, No. 12, pp. 1702-1716, 2010.
[7]S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, Vol. 9, No. 8, pp. 1735-1780, 1997.
[8]Y. F. Huang and C. M. Wang, “Self-adaptive harmony search algorithm for optimization,” Expert Systems with Applications, Vol. 37, No. 4, pp. 2826-2837, 2010.
[9]R. Kohavi and G. H. John, “Wrappers for feature subset selection,” Artificial Intelligence, Vol. 97, No. 1-2, pp. 273-324, 1997.
[10]J. Ma, W. Gao, P. Mitra, S. Kwon, B. J. Jansen, K. F. Wong, and M. Cha, “Detecting rumors from microblogs with recurrent neural networks,” Proc. the 25th International Joint Conference on Artificial Intelligence, pp. 3818-3824, 2016.
[11]V. Pérez-Rosas, B. Kleinberg, A. Lefevre, and R. Mihalcea, “Automatic detection of fake news,” arXiv preprint arXiv:1708.07104v1, 2017.
[12]M. Potthast, J. Kiesel, K. Reinartz, J. Bevendorff, and B. Stein, “A stylometric inquiry into hyperpartisan and fake news,” arXiv preprint arXiv:1702.05638, 2017.
[13]R. Pradhan, R. S. Aygun, M. Maskey, R. Ramachandran, and D. J. Cecil, “Tropical cyclone intensity estimation using a deep convolutional neural network,” IEEE Transactions on Image Processing, Vol. 27, No. 2, pp. 692-702, 2018.
[14]H. Rashkin, E. Choi, J. Y. Jang, S. Volkova, and Y. Choi, “Truth of varying shades: analyzing language in fake news and political fact-checking,” Proc. the Conference on Empirical Methods in Natural Language Processing, pp. 2931-2937, 2017.
[15]V. L. Rubin, N. J. Conroy, Y. Chen, and S. Cornwell, “Fake news or truth? using satirical cues to detect potentially misleading news,” Proc. the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 7-17, 2016.
[16]S. Schuster and C. D. Manning, “Enhanced English universal dependencies: an improved representation for natural language understanding tasks,” Proc. the 10th International Conference on Language Resources and Evaluation, pp. 2371-2378, 2016.
[17]L. Shi, X. Ma, L. Xi, Q. Duan, and J. Zhao, “Rough set and ensemble learning based semi-supervised algorithm for text classification,” Expert Systems with Applications, Vol. 38, No. 5, pp. 6300-6306, 2011.
[18]Y. R. Tausczik and J. W. Pennebaker, “The psychological meaning of words: LIWC and computerized text analysis methods,” Journal of Language and Social Psychology, Vol. 29, No. 1, pp. 24-54, 2010.
[19]S. Venugopalan, M. Rohrbach, J. Donahue, R. Mooney, T. Darrell, and K. Saenko, “Sequence to sequence - video to text,” Proc. IEEE International Conference on Computer Vision, pp. 4534-4542, 2015.
[20]S. Vosoughi, M. Mohsenvand, and D. Roy, “Rumor gauge: predicting the veracity of rumors on twitter,” ACM Transactions on Knowledge Discovery from Data, Vol. 11, No. 4, Article No. 50, 2017.
[21]S. Wang, Y. Yin, G. Cao, B. Wei, Y. Zheng, and G. Yang, “Hierarchical retinal blood vessel segmentation based on feature and ensemble learning,” Neurocomputing, Vol. 149, Part B, pp. 708-717, 2015.
[22]B. Zoph and K. Knight, “Multi-source neural translation,” arXiv preprint arXiv:1601.00710v1, 2016.
[23]https://www.1843magazine.com/technology/rewind/the-true-history-of-fake-news
[24]https://www.axios.com/everything-trump-has-called-fake-news-1513303959-6603329e-46b5-44ea-b6be-70d0b3bdb0ca.html
[25]https://www.wired.com/2017/02/veles-macedonia-fake-news/
[26]http://money.cnn.com/2017/11/16/technology/tech-trust-indicators/index.html
[27]https://www.kaggle.com/c/otto-group-product-classification-challenge/discussion/14335
[28]https://www.kaggle.com/c/quora-question-pairs/discussion/34355
[29]https://github.com/GeorgeMcIntire/fake_real_news_dataset
[30]https://www.datasciencecentral.com/profiles/blogs/on-building-a-fake-news-classification-model
[31]https://www.allsides.com/unbiased-balanced-news
[32]https://www.kaggle.com/mrisdal/fake-news
[33]https://code.google.com/archive/p/word2vec/
[34]https://www.kaggle.com/ciotolaaaa/snopes-fake-legit-news
[35]https://www.kaggle.com/jruvika/fake-news-detection
