National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)


Detailed record

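Graduate student: Lee, Shao-Cheng (李紹誠)
Thesis title: Apply RapidMiner and anomaly detection to credit card fraud identification (RapidMiner 搭配異常偵測於信用卡詐欺檢測)
Advisor: Horng, Shih-Cheng (洪士程)
Oral defense committee: Chen, Cheng-Hung (陳政宏); Lin, Shieh-Shing (林謝興)
Oral defense date: 2024-07-02
Degree: Master's
Institution: Chaoyang University of Technology (朝陽科技大學)
Department: Department of Computer Science and Information Engineering (資訊工程系)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2024
Graduation academic year: 112 (ROC calendar, i.e. 2023-2024)
Language: Chinese
Number of pages: 59
Keywords: credit card fraud; RapidMiner; machine learning; Kaggle; anomaly detection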
Scams and fraud are deceptive acts committed by one or more people with the
intent of unlawfully obtaining a benefit or profit. Such behavior is usually
dynamic and lacks fixed patterns, which makes it difficult to identify.
Fraudsters often exploit the latest technological developments and loopholes in
systems to subvert the transaction process, causing substantial losses. To curb
credit card fraud, many recent studies have applied machine learning techniques
to fraud detection. RapidMiner is an open-source computing environment for data
mining, machine learning, and business predictive analytics. This thesis uses
the data mining tool RapidMiner to perform anomaly detection on a credit card
transaction dataset taken from Kaggle, which contains 284,807 transactions, of
which 492 are fraudulent. The machine learning techniques applied include
logistic regression, decision trees, random forests, support vector machines,
and neural networks, and the performance of the resulting classification models
is compared. The experimental results show that appropriate data preprocessing
and model tuning can effectively improve the performance of a fraud detection
system and thereby help prevent fraudulent activity.
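
The class counts quoted in the abstract (284,807 transactions, 492 of them fraudulent) imply a heavily imbalanced dataset, which is worth keeping in mind when reading any model comparison. The following is a minimal sketch in plain Python, not the thesis's RapidMiner process. It computes the fraud rate and the accuracy of a hypothetical baseline that labels every transaction legitimate. The function names are illustrative assumptions; only the two counts come from the abstract.

```python
# Class counts as reported in the abstract; everything else is illustrative.
TOTAL_TRANSACTIONS = 284_807
FRAUD_CASES = 492


def fraud_rate(total: int, fraud: int) -> float:
    """Fraction of transactions labelled as fraud."""
    return fraud / total


def majority_baseline_accuracy(total: int, fraud: int) -> float:
    """Accuracy of a hypothetical model that labels every transaction legitimate."""
    return (total - fraud) / total


def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision and recall for the fraud class from confusion-matrix counts.

    Returns 0.0 for a ratio whose denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    rate = fraud_rate(TOTAL_TRANSACTIONS, FRAUD_CASES)
    baseline = majority_baseline_accuracy(TOTAL_TRANSACTIONS, FRAUD_CASES)
    print(f"fraud rate: {rate:.4%}")
    print(f"all-legitimate baseline accuracy: {baseline:.4%}")
    # The baseline flags nothing, so it catches no fraud: tp = 0, fn = 492.
    print("baseline precision/recall:", precision_recall(tp=0, fp=0, fn=FRAUD_CASES))
```

Because the all-legitimate baseline is already right on more than 99.8% of transactions while detecting no fraud, accuracy alone says little on this dataset. Precision and recall for the fraud class, read from a confusion matrix, are usually more informative.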
Table of Contents
Chapter 1. Introduction
1.1 Research Motivation and Objectives
1.2 Research Method and Thesis Structure
Chapter 2. Literature Review
2.1 Current State of Credit Card Fraud
2.2 Anomaly Detection
2.3 Common Algorithms
2.3.1 Logistic Regression
2.3.2 Random Forest
2.3.3 Decision Tree
2.3.4 Support Vector Machine
2.3.5 Neural Network
Chapter 3. Data Mining Software
3.1 Data Preprocessing
3.2 Data Visualization
3.3 RapidMiner Workflow
Chapter 4. Experimental Design
4.1 Experimental Data
4.2 Experimental Models
4.2.1 Logistic Regression Model
4.2.2 Decision Tree Model
4.2.3 Random Forest Model
4.2.4 Support Vector Machine Model
4.2.5 Neural Network Model
Chapter 5. Experimental Results
5.1 Model Analysis
5.1.1 Logistic Regression Model Analysis
5.1.2 Decision Tree Model Analysis
5.1.3 Random Forest Model Analysis
5.1.4 Support Vector Machine Model Analysis
5.1.5 Neural Network Model Analysis
5.1.6 SOM Analysis Plots for Each Model
5.2 Model Performance Comparison
Chapter 6. Conclusions and Future Directions
References

List of Figures
Figure 1. Thesis structure
Figure 2. Number and share of credit card e-commerce (EC) transactions
Figure 3. Number of credit card fraud cases by year
Figure 4. RapidMiner login screen
Figure 5. PCA-transformed values of V21-V25 in the dataset
Figure 6. Data features of V26-V28, Amount, and Class in the dataset
Figure 7. RapidMiner main screen
Figure 8. Creating a new repository in RapidMiner
Figure 9. Importing the dataset into RapidMiner
Figure 10. Dataset attribute types
Figure 11. Dataset format
Figure 12. Dataset import result
Figure 13. RapidMiner data processing workflow
Figure 14. RapidMiner model evaluation results
Figure 15. Credit card transaction dataset
Figure 16. Logistic regression model workflow
Figure 17. Logistic regression model data split (Split Data)
Figure 18. Logistic regression model parameter settings
Figure 19. Decision tree model workflow
Figure 20. Decision tree model data split (Split Data)
Figure 21. Decision tree model parameter settings
Figure 22. Resulting tree diagram
Figure 23. Random forest model workflow
Figure 24. Random forest model data split (Split Data)
Figure 25. Random forest model parameter settings
Figure 26. SVM model workflow
Figure 27. SVM model data split (Split Data)
Figure 28. SVM model parameter settings
Figure 29. Neural network model workflow
Figure 30. Neural network model data split (Split Data)
Figure 31. Neural network model parameter settings
Figure 32. Confusion matrix analysis for the logistic regression model
Figure 33. Confusion matrix analysis for the decision tree model
Figure 34. Confusion matrix analysis for the random forest model
Figure 35. Confusion matrix analysis for the SVM model
Figure 36. Confusion matrix analysis for the neural network model
Figure 37. SOM for each model
Figure 38. Outliers in the SOM

List of Tables
Table 1. Statistics on fraudulent transactions made with domestic cardholders' cards through overseas channels
Table 2. Statistics on fraud in non-face-to-face (EC) transactions
Table 3. Performance comparison of the five models

References
[1] A. Singh, A. Singh, A. Aggarwal and A. Chauhan, "Design and
Implementation of Different Machine Learning Algorithms for Credit Card
Fraud Detection," 2022 International Conference on Electrical, Computer,
Communications and Mechatronics Engineering (ICECCME), Maldives,
Maldives, 2022, pp. 1-6, doi: 10.1109/ICECCME55909.2022.9988588.
[2] I. Vejalla, S. P. Battula, K. Kalluri and H. K. Kalluri, "Credit Card Fraud
Detection Using Machine Learning Techniques," 2023 2nd International
Conference on Paradigm Shifts in Communications Embedded Systems,
Machine Learning and Signal Processing (PCEMS), Nagpur, India, 2023,
pp. 1-4, doi: 10.1109/PCEMS58491.2023.10136040.
[3] R. Aggarwal, P. K. Sarangi and A. K. Sahoo, "Credit Card Fraud Detection:
Analyzing the Performance of Four Machine Learning Models," 2023
International Conference on Disruptive Technologies (ICDT), Greater
Noida, India, 2023, pp. 650-654, doi: 10.1109/ICDT57929.2023.10150782.
[4] B. B. Jayasingh and G. B. Sri, "Online Transaction Anomaly Detection
Model for Credit Card Usage Using Machine Learning Classifiers," 2023
International Conference on Emerging Smart Computing and Informatics
(ESCI), Pune, India, 2023, pp. 1-5, doi:
10.1109/ESCI56872.2023.10100152.
[5] S. Dwivedi, P. Kasliwal and S. Soni, "Comprehensive study of data
analytics tools (RapidMiner, Weka, R tool, Knime)," 2016 Symposium on
Colossal Data Analysis and Networking (CDAN), Indore, India, 2016, pp.
1-8, doi: 10.1109/CDAN.2016.7570894.
[6] Credit Card Fraud Detection Using Kaggle Data Set and Anomaly
Detection. (n.d.). RapidMiner Community. Retrieved from
https://community.rapidminer.com/discussion/45727/credit-card-frauddetection-using-kaggle-data-set-and-anomaly-detection
[7] T. Baabdullah, A. Alzahrani and D. B. Rawat, "On the Comparative Study
of Prediction Accuracy for Credit Card Fraud Detection With Imbalanced
Classifications," 2020 Spring Simulation Conference (SpringSim), Fairfax,
VA, USA, 2020, pp. 1-12, doi: 10.22360/SpringSim.2020.CSE.004.
[8] R. Ilieva and M. Angelov, "Template for Building Manageable Data Mining
Autonomous Process with RapidMiner Studio," 2021 XXX International
Scientific Conference Electronics (ET), Sozopol, Bulgaria, 2021, pp. 1-5,
doi: 10.1109/ET52713.2021.9580103.
[9] M. Devika, S. R. Kishan, L. S. Manohar and N. Vijaya, "Credit Card Fraud
Detection Using Logistic Regression," 2022 Second International
Conference on Advanced Technologies in Intelligent Control, Environment,
Computing & Communication Engineering (ICATIECE), Bangalore, India,
2022, pp. 1-6, doi: 10.1109/ICATIECE56365.2022.10046976.
[10] A. A. Khine and H. W. Khin, "Credit Card Fraud Detection Using Online
Boosting with Extremely Fast Decision Tree," 2020 IEEE Conference on
Computer Applications (ICCA), Yangon, Myanmar, 2020, pp. 1-4, doi:
10.1109/ICCA49400.2020.9022843.
[11] S. -I. Mihali and Ș. -L. Niță, "Credit Card Fraud Detection based on
Random Forest Model," 2024 International Conference on Development
and Application Systems (DAS), Suceava, Romania, 2024, pp. 111-114, doi:
10.1109/DAS61944.2024.10541240.
[12] S. K. Saddam Hussain, E. Sai Charan Reddy, K. G. Akshay and T.
Akanksha, "Fraud Detection in Credit Card Transactions Using SVM and
Random Forest Algorithms," 2021 Fifth International Conference on I-SMAC
(IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), Palladam,
India, 2021, pp. 1013-1017, doi: 10.1109/I-SMAC52330.2021.9640631.
[13] S. K. Pradhan, N. V. Krishna Rao, N. M. Deepika, P. Harish, M. P. Kumar
and P. S. Kumar, "Credit Card Fraud Detection Using Artificial Neural
Networks and Random Forest Algorithms," 2021 5th International
Conference on Electronics, Communication and Aerospace Technology
(ICECA), Coimbatore, India, 2021, pp. 1471-1476, doi:
10.1109/ICECA52323.2021.9676142.
[14] S. Khatri, A. Arora and A. P. Agrawal, "Supervised Machine Learning
Algorithms for Credit Card Fraud Detection: A Comparison," 2020 10th
International Conference on Cloud Computing, Data Science &
Engineering (Confluence), Noida, India, 2020, pp. 680-683, doi:
10.1109/Confluence47617.2020.9057851.
[15] S. Mittal and S. Tyagi, "Performance Evaluation of Machine Learning
Algorithms for Credit Card Fraud Detection," 2019 9th International
Conference on Cloud Computing, Data Science & Engineering
(Confluence), Noida, India, 2019, pp. 320-324, doi:
10.1109/CONFLUENCE.2019.8776925.
[16] A. Shah and A. Mehta, "Comparative Study of Machine Learning Based
Classification Techniques for Credit Card Fraud Detection," 2021
International Conference on Data Analytics for Business and Industry
(ICDABI), Sakheer, Bahrain, 2021, pp. 53-59, doi:
10.1109/ICDABI53623.2021.9655848.
