Graduate student: Yu-Jing Lin
Thesis title: An Evaluation of Bitcoin Address Classification based on Transaction History Summarization
Advisor: Shih-wei Liao
Oral defense committee member: I-Ping Tu

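In this thesis, we propose features that differ from those commonly used in prior literature to build a classification model for detecting Bitcoin addresses with abnormal behavior. We identify several highly effective features, which we call "extra statistics" to distinguish them from "basic statistics". In addition, we propose entirely new features, high-order moments and deciles, which effectively capture the temporal information in an address's transaction history. We extract these features from a labelled dataset of Bitcoin addresses and train classification models with supervised machine learning algorithms. The experimental results show that the proposed features significantly improve the accuracy of Bitcoin address classification. After evaluating eight classification algorithms, we obtain the best result from a gradient boosting decision tree algorithm, which reaches 87% on both Micro-F1 and Macro-F1.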
Bitcoin is a cryptocurrency built on a distributed, decentralized mechanism, which has made it a popular global transaction platform. Over the past decade, the efficiency of cross-border transactions and the privacy afforded by address anonymity have attracted many activities to the Bitcoin network, including payments, investments, gambling, and even money laundering. Unfortunately, some criminal behaviors that took advantage of the platform went unidentified, which has discouraged many governments from supporting cryptocurrency. The capability to identify criminal addresses has therefore become an important issue in the cryptocurrency network.

In this paper, we propose new features, in addition to those commonly used in the literature, to build a classification model for detecting abnormal addresses in the Bitcoin network. We found several useful conventional features, which we call extra statistics. We also introduce new features, including moments of various high orders of transaction time (represented by block height) and deciles of transaction time, which summarize the temporal information of the transaction history efficiently. Supervised machine learning methods are trained on the features extracted from a labelled dataset of Bitcoin addresses. The experimental evaluation shows that these features significantly improve the performance of Bitcoin address classification. We evaluate the results under eight classifiers and achieve the highest Micro-F1 / Macro-F1 of 87% / 87% with a gradient boosting decision tree algorithm.
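The temporal features described above can be illustrated with a minimal sketch. It assumes that an address's transaction times are given as a list of block heights, that "moments" means standardized central moments, and that deciles are the nine cut points splitting the sorted heights into ten equal parts. The thesis may define or normalize these quantities differently, and the function name `summarize_transaction_times` is hypothetical.

```python
import math
from statistics import quantiles


def summarize_transaction_times(block_heights, max_order=4):
    """Summarize the block heights of one address's transactions.

    Returns the mean, the population standard deviation, standardized
    central moments of orders 3..max_order, and the nine decile cut points.
    This is an illustrative assumption, not the thesis's exact definition.
    """
    if not block_heights:
        raise ValueError("an address needs at least one transaction")

    n = len(block_heights)
    mu = sum(block_heights) / n
    sigma = math.sqrt(sum((h - mu) ** 2 for h in block_heights) / n)

    features = {"mean": mu, "std": sigma}
    for k in range(3, max_order + 1):
        if sigma == 0:
            # All transactions share one block height: no spread to describe.
            features[f"moment_{k}"] = 0.0
        else:
            features[f"moment_{k}"] = (
                sum(((h - mu) / sigma) ** k for h in block_heights) / n
            )

    if n < 2:
        # statistics.quantiles may reject a single data point, so repeat it.
        features["deciles"] = [float(block_heights[0])] * 9
    else:
        features["deciles"] = quantiles(block_heights, n=10, method="inclusive")
    return features


# Hypothetical address whose transactions fall in five blocks.
example = summarize_transaction_times([100, 200, 300, 400, 500])
print(example["mean"], example["deciles"])
```

Whatever the number of transactions, the result is a fixed-length summary, which is what allows addresses with very different history lengths to be fed to the same classifier.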
Acknowledgements (in Chinese)
Acknowledgements
Abstract (in Chinese)
Abstract
1 Introduction
2 Bitcoin Network
3 Related Work
4 Proposed Method
4.1 Conventional Bitcoin Address Features
4.2 Basic Statistics
4.2.1 Extra Statistics
4.2.2 Temporal Information
4.2.3 Transaction Moments
4.2.4 Transaction Deciles (10-Quantiles)
4.2.5 An Example of Moments and Deciles
5 Experiments
5.1 Collect Data
5.2 Summarize Transaction Histories
5.3 Train Classifiers
5.4 Implementation Details
6 Evaluation and Discussion
6.1 Supervised Classifiers
6.2 Feature Types
6.3 Confusion Matrix
6.4 Important Features
6.5 Transaction Numbers
6.6 Insights of the Neural Network
7 Conclusion
Bibliography
Appendices
A Feature Importance
B Experiment Results
B.1 Experiments of Different Classification Algorithms
B.2 Experiments of Ablation Study
B.3 Experiments of the Neural Network
B.4 Experiments of Cost-Sensitive Learning