Graduate student (English): Ping-Hsien Lin
Thesis title (English): A Fraud Detection System for Real-time Messaging Communication on Android Facebook Messenger
Advisor (English): Nai-Wei Lo
Oral defense committee (English): Nai-Wei Lo
Keywords (English): Fraud Detection; Latent Semantic Analysis; Cosine Similarity
In recent years, smartphone usage has risen rapidly, and a wide variety of mobile applications have been developed, such as “Facebook”, “Line”, and “WeChat”. These applications not only make it easier for people to communicate with each other, but also reduce the cost of phone calls and short messages. However, the convenience of the smartphone comes with potential risks. For example, applications that request high-risk permissions may expose a user's private personal information. In Taiwan, fraudsters also use these applications as tools for committing fraud.
In this paper, we develop a fraud detection system for messaging communications to address this problem. The system processes its input data with natural language processing, matrix processing, latent semantic analysis, and cosine similarity, and we use these techniques to verify the feasibility of the system. We collect news reports and case records of fraud events as training data, and intercept real-time chat logs from “Facebook Messenger” as testing data. Finally, we develop a mobile application that warns the user when a real-time chat log appears to be a fraud event.
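The matching step described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: it covers only TF-IDF weighting and cosine similarity, leaves out the latent semantic analysis (rank-reduction) step and CKIP word segmentation, and uses whitespace-tokenized English text. The sample cases, the function names, and the idea of taking the maximum similarity over all cases are assumptions made for illustration.

```python
import math
from collections import Counter


def vectorize(tokens, vocab, idf):
    """Map a token list to a TF-IDF vector over a fixed vocabulary.

    Tokens outside the vocabulary contribute nothing to the vector.
    """
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero for empty messages
    return [counts[t] / total * idf[t] for t in vocab]


def build_model(docs):
    """Build vocabulary, IDF weights, and TF-IDF vectors from tokenized docs."""
    vocab = sorted({t for d in docs for t in d})
    n_docs = len(docs)
    doc_freq = Counter(t for d in docs for t in set(d))
    # log(N / df) + 1 keeps terms that occur in every document above zero.
    idf = {t: math.log(n_docs / doc_freq[t]) + 1.0 for t in vocab}
    vectors = [vectorize(d, vocab, idf) for d in docs]
    return vocab, idf, vectors


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def max_similarity(message, fraud_cases):
    """Highest cosine similarity between a message and any known fraud case."""
    vocab, idf, vectors = build_model(fraud_cases)
    query = vectorize(message.lower().split(), vocab, idf)
    return max(cosine(query, v) for v in vectors)


# Hypothetical training data standing in for the collected fraud news and cases.
FRAUD_CASES = [
    "please transfer money to this account urgently".split(),
    "buy game points card and send me the serial number".split(),
]

if __name__ == "__main__":
    for msg in ("can you transfer money to my account",
                "see you at lunch tomorrow"):
        print(msg, "->", round(max_similarity(msg, FRAUD_CASES), 3))
```

In a setup like this, a warning would be raised when the score exceeds some threshold. The threshold value and the exact decision rule are choices the sketch leaves open.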
Chinese Abstract
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Preliminaries
2.1 Semantic Models
2.1.1 Latent Semantic Analysis
2.1.2 Probabilistic Latent Semantic Analysis
2.1.3 Latent Dirichlet Allocation
2.2 Decision Models
2.2.1 Cosine Similarity
2.2.2 Jaccard Similarity
2.2.3 Dice Similarity
Chapter 3 The Proposed Fraud Detection System
3.1 System Architecture
3.2 Data Flow of the Fraud Detection System
3.3 Data Collection
3.4 Natural Language Processing
3.4.1 CKIP Word Segmentation
3.4.2 Stop Word
3.4.3 Special Symbol
3.5 Matrix Processing
3.5.1 Vector Space Model (VSM)
3.5.2 Term Frequency-Inverse Document Frequency Matrix
3.6 Latent Semantic Analysis
3.7 Classification Rules
Chapter 4 System Implementation, Testing Scenarios and Discussion
4.1 System Implementation
4.2 Testing Scenarios
4.3 Discussion
Chapter 5 Conclusion
References
