National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

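Graduate student: 曾順光
Graduate student (English name): ZENG, SHUN-GUANG
Thesis title: 應用人工智慧演算法分析 影響網路金融詐騙交易之風險因子
Thesis title (English): Using artificial intelligence algorithms to analyze the risk factors of internet financial fraud transactions
Advisor: 李財福
Advisor (English name): LEE, TSAIR-FWU
Oral defense committee: 蔡政男, 陳文平, 李財福
Oral defense committee (English names): TSAI, CHENG-NAN; CHEN, W.P.; LEE, TSAIR-FWU
Oral defense date: 2021-08-06
Degree: Master's
Institution: National Kaohsiung University of Science and Technology (國立高雄科技大學)
Department: Department of Electronic Engineering
Discipline: Engineering
Field: Electrical and computer engineering
Thesis type: Academic thesis
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar)
Language: Chinese
Number of pages: 63
Keywords (Chinese): 網路詐欺, 機器學習, 金融科技, 詐欺檢測 (internet fraud, machine learning, fintech, fraud detection)
Keywords (English): Anomaly detection, Machine learning, Fintech, Fraud detection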
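Purpose: This study applies artificial intelligence algorithms to a financial transaction dataset to examine how the features of transaction samples relate to fraudulent transaction behavior, and builds predictive models to detect potential fraud among anomalous financial transactions.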
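Materials and methods: Missing values in the transaction samples were first handled and feature data types converted. Exploratory data analysis (EDA) of the transaction features and anomalous transaction behavior was then used to exclude uninformative features and assemble a dataset of effective feature factors, which was standardized as a preprocessing step. Supervised machine learning algorithms were used to rank feature importance and to further examine how each feature factor relates to the transaction samples. From the effective feature dataset, prediction models were built with logistic regression (LR), AdaBoost (adaptive boosting), and random forest (RF). Each model was optimized by grid search and then evaluated by accuracy (ACC), area under the receiver operating characteristic curve (AUC), negative predictive value (NPV), and positive predictive value (PPV).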
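The standardization step named in the methods can be illustrated with a minimal sketch. It assumes a plain z-score (subtract the mean, divide by the standard deviation); the abstract does not say which variant the thesis used, and the `standardize` helper and the sample amounts are hypothetical.

```python
from statistics import mean, pstdev


def standardize(column):
    """Rescale a numeric column to zero mean and unit variance (z-score)."""
    mu = mean(column)
    sigma = pstdev(column)  # population standard deviation
    if sigma == 0:
        # A constant column has no spread; map it to zeros instead of dividing by zero.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


# Hypothetical transaction amounts, for illustration only.
amounts = [100.0, 200.0, 300.0, 400.0, 500.0]
scaled = standardize(amounts)
```

Scaling of this kind mainly matters for models such as logistic regression, whose fitted coefficients depend on feature magnitude; tree-based models are largely insensitive to it.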
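Results: Exploratory data analysis was performed on the original transaction dataset. Based on the correlation between the original features and fraudulent transactions, three features unrelated to fraud were excluded: the paying account, the receiving account, and the systemic-fraud field. A payer balance error feature was generated, and the following features were retained: transaction type, transaction amount, payer balance before and after the transaction, payee balance before and after the transaction, and step. Only samples of the two transaction types that contain fraudulent transactions, cash withdrawal and transfer, were used as the dataset for the supervised machine learning algorithms. The models built from the selected feature factors gave the following results: logistic regression, AUC 0.78, rising to 0.88 after optimization; AdaBoost, AUC 0.88, rising to 0.92 after optimization; random forest, AUC 0.97, rising to 0.99 after optimization. Feature importance rankings from the supervised algorithms served as a supporting basis for the feature selection in this study.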
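Conclusions: This study used hyperparameter optimization to improve the accuracy of the prediction models. The resulting models accurately identify fraud among the financial transaction samples studied here and could serve as a supporting tool for police and financial institutions in detecting anomalous transactions. All three supervised machine learning models improved in accuracy after grid search optimization. The feature analysis indicates that the important features for internet financial fraud include the timing and frequency of transactions and the balance errors of both parties to a transaction, which are offered to the police as a reference.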
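The grid search tuning described above can be sketched with scikit-learn. This is an illustration under stated assumptions, not the thesis's code: the data is synthetic (generated by `make_classification`), the parameter grids are arbitrary small examples, and the AUC values it produces say nothing about the results reported in the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in data; NOT the thesis's transaction dataset.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=4, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The three model families named in the abstract, each with an assumed grid.
candidates = {
    "logistic_regression": (
        make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]},
    ),
    "adaboost": (
        AdaBoostClassifier(random_state=0),
        {"n_estimators": [25, 50], "learning_rate": [0.5, 1.0]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [3, None]},
    ),
}

results = {}
for name, (model, grid) in candidates.items():
    # Try every grid combination with 3-fold cross-validation, scored by AUC.
    search = GridSearchCV(model, grid, scoring="roc_auc", cv=3)
    search.fit(X_train, y_train)
    # The refitted best model scores the held-out split.
    test_auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])
    results[name] = {"best_params": search.best_params_, "test_auc": test_auc}
```

Scoring the refitted model on a split that the search never saw keeps the reported AUC from being inflated by the tuning itself.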

Purpose: This research applies artificial intelligence algorithms to analyze a financial transaction dataset, explores the correlation between the features of transaction samples and fraudulent transactions, and establishes predictive models to detect potential fraud in abnormal financial transactions.
Materials and methods: Pre-processing first excluded missing values and converted variable types. Exploratory data analysis of the transaction features and abnormal transactions in the dataset was then used to establish a dataset of effective risk factors for internet financial fraud transactions. Standardization was applied during pre-processing to improve the predictions of the machine learning algorithms. Classification models based on logistic regression (LR), adaptive boosting (AdaBoost), and random forest (RF) were built from the effective risk factor dataset and optimized by grid search. The models were assessed by accuracy (ACC), area under the receiver operating characteristic curve (AUC), positive predictive value (PPV), and negative predictive value (NPV).
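The four evaluation measures listed above can be computed directly from labels and scores. The sketch below uses the standard definitions; the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 marks fraud."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def auc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and model scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
acc = (tp + tn) / len(y_true)  # share of all predictions that are correct
ppv = tp / (tp + fp)           # share of flagged transactions that are fraud
npv = tn / (tn + fn)           # share of cleared transactions that are legitimate
area = auc(y_true, scores)
```

PPV and NPV depend on the decision threshold, while AUC summarizes the ranking of the scores across all thresholds, which is one reason to report both kinds of measure.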
Results: Exploratory data analysis was conducted on the original transaction dataset. Based on the correlation between the original features of the transaction samples and fraudulent transactions, the features unrelated to fraud (Account, Dest, and SystemFraud) were excluded. An expenditure balance error feature was created, and the following features were retained: transaction method, transaction amount, balance before expenditure, balance after expenditure, balance before payment, balance after payment, and step. Only samples of the transaction methods that contain fraudulent transactions, cash withdrawal and transfer, were used as the dataset for the supervised machine learning algorithms. The models built from the selected feature factors gave the following results: logistic regression, AUC 0.78, improved to 0.88 after optimization; AdaBoost, AUC 0.88, improved to 0.92 after optimization; random forest, AUC 0.97, improved to 0.99 after optimization. The feature importance rankings from the supervised machine learning algorithms were used as an auxiliary basis for selecting factors in this research.
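The feature selection described in the results can be sketched as follows. The field names other than `Account`, `Dest`, and `SystemFraud` are hypothetical, as are the `CASH_OUT` and `TRANSFER` type labels. The balance error formula (balance before, minus amount, minus balance after) is an assumed definition, since the abstract does not state how the feature was computed.

```python
# Features the abstract reports as excluded.
DROPPED = {"Account", "Dest", "SystemFraud"}
# Assumed labels for the two transaction methods that contain fraud samples.
KEPT_TYPES = {"CASH_OUT", "TRANSFER"}


def prepare(rows):
    """Keep fraud-bearing transaction types, drop excluded fields, add a balance error."""
    prepared = []
    for row in rows:
        if row["type"] not in KEPT_TYPES:
            continue
        features = {k: v for k, v in row.items() if k not in DROPPED}
        # Assumed definition: zero when the payer's balances are consistent
        # with the amount moved, non-zero otherwise.
        features["payer_balance_error"] = (
            row["payer_before"] - row["amount"] - row["payer_after"]
        )
        prepared.append(features)
    return prepared


# Toy rows with hypothetical field names, for illustration only.
rows = [
    {"step": 1, "type": "PAYMENT", "amount": 50.0, "Account": "A1", "Dest": "M1",
     "payer_before": 100.0, "payer_after": 50.0, "SystemFraud": 0, "is_fraud": 0},
    {"step": 1, "type": "TRANSFER", "amount": 200.0, "Account": "A2", "Dest": "A3",
     "payer_before": 200.0, "payer_after": 0.0, "SystemFraud": 0, "is_fraud": 1},
    {"step": 2, "type": "CASH_OUT", "amount": 300.0, "Account": "A4", "Dest": "A5",
     "payer_before": 100.0, "payer_after": 0.0, "SystemFraud": 0, "is_fraud": 0},
]
prepared = prepare(rows)
```

In this toy data the third row moves more money than the payer's recorded balance allows, so its error term is non-zero. A derived feature of this kind is meant to expose exactly that sort of inconsistency to the classifier.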
Conclusions: The study uses hyperparameter optimization to improve the accuracy of the prediction models. The established models can accurately identify fraud among the financial transaction samples in this study and can serve as a supporting tool for the police and financial institutions in detecting abnormal transactions. The results show that the accuracy of all three supervised machine learning models improved after optimization by grid search. The feature analysis indicates that the important features for online financial fraud transactions include the time and frequency of transactions and the balance errors of the two parties to a transaction, which are provided to the police as a reference.


Abstract (Chinese)
Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures
Symbols and Abbreviations
Chapter 1 Introduction
1.1 Motivation
1.2 Purpose
1.3 Literature Review
Chapter 2 Materials and Methods
2.1 Preface
2.2 Sample Data
2.3 Fintech Fraud
2.4 Fintech Fraud
2.5 Feature Factors
2.6 Financial Fraud Prediction Algorithms
2.7 Financial Fraud Prediction Models
2.8 Performance Evaluation of the Financial Fraud Prediction Models
Chapter 3 Results
3.1 Preface
3.2 Exploratory Data Analysis Results
3.3 Performance Comparison of All Algorithm Models
Chapter 4 Discussion
4.1 Analysis and Comparison of the Anomalous Transaction Detection Models
4.2 Discussion of Factor Analysis
4.3 Sample Collection
Chapter 5 Conclusion
References
Autobiography



