
National Digital Library of Theses and Dissertations in Taiwan


Detailed record

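Graduate student: Dmitrii Vladimirovich Matveichev
Graduate student (English): Dmitrii Matveichev
Thesis title (Chinese): 基於詞彙特徵之算法生成惡意網域名稱偵測
Thesis title (English): Detection of algorithmically generated malicious domain names based on lexical features
Advisor: 曾俊元
Advisor (English): TSENG, CHIN-YANG
Oral defense committee: 曾俊元; 莊東穎; 曹偉駿; 黃俊穎
Oral defense committee (English): TSENG, CHIN-YANG; JUANG, TONG-YING; TSAUR, WOEI-JUINN; HUANG, JUINN-YING
Oral defense date: 2018-07-18
Degree: Master's
Institution: National Taipei University
Department: Department of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2018
Graduation academic year: 106 (ROC calendar)
Language: English
Pages: 43
Keywords (Chinese): 殭屍網絡; dga檢測; 詞法分析; C4.5; 物聯網安全
Keywords (English): botnets; DGA detection; lexical analysis; C4.5; Internet of Things security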
The latest threat reports show a notable increase in detected botnets compared with previous years. In fact, the number of IoT botnet C&C controllers alone more than doubled in 2017. Cybercriminals use botnet C&C controllers to launch attacks from botnet-enslaved devices. As the success of the Mirai botnet showed, many companies use poorly secured IoT devices, which creates an opportunity to use IoT devices as botnet zombies.
To avoid detection, botnets use domain generation algorithms (DGAs) to connect to C&C servers through a large number of domain names. This work proposes a low-cost strategy for detecting DGA-generated domain names. The lexical features of algorithmically generated domain names differ statistically from those of human-generated names. Thus, detecting algorithmically generated domain names can reveal botnet or malware activity in the network.
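As a rough illustration of what lexical features of a domain name can look like, the sketch below computes four of them in Python. The feature set (length, digit ratio, vowel ratio, character entropy), the function name `lexical_features`, and the choice to inspect only the first label are assumptions made for this example; the record does not list the features the thesis actually uses.

```python
import math
from collections import Counter

VOWELS = set("aeiou")

def lexical_features(domain: str) -> dict:
    """Compute a few illustrative lexical features of a domain name.

    Assumption: only the first (leftmost) label is analysed, so
    "google.com" is reduced to "google". This is a simplification.
    """
    label = domain.lower().rstrip(".").split(".")[0]
    n = len(label)
    if n == 0:
        raise ValueError("empty domain label")
    counts = Counter(label)
    # Shannon entropy of the character distribution, in bits.
    entropy = -sum((k / n) * math.log2(k / n) for k in counts.values())
    return {
        "length": n,
        "digit_ratio": sum(c.isdigit() for c in label) / n,
        "vowel_ratio": sum(c in VOWELS for c in label) / n,
        "entropy": entropy,
    }

if __name__ == "__main__":
    # A human-chosen name next to a made-up random-looking one.
    for name in ("google.com", "x1y2z3.net"):
        print(name, lexical_features(name))
```

A random-looking label such as `x1y2z3` typically shows a higher digit ratio, a lower vowel ratio, and a higher entropy than a dictionary-like label, which is the kind of statistical difference a classifier can exploit.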
To justify the choice of lexical features, we gathered domain name statistics for 32 botnets that appeared in the last 8 years. Lexical features were chosen based on the gathered statistics, and new lexical features were proposed. The chosen lexical features were used to generate a decision tree with the C4.5 algorithm.
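C4.5 chooses splits by gain ratio, that is, information gain divided by the split information. The sketch below evaluates that criterion for a single binary threshold on one numeric feature. It is a minimal illustration only: the function names, the toy data, and the labels "benign" and "dga" are assumptions, and the full algorithm (candidate threshold search, recursion, pruning) is omitted.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())

def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` on one numeric feature."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    n = len(labels)
    weights = [len(left) / n, len(right) / n]
    info_gain = entropy(labels) - (
        weights[0] * entropy(left) + weights[1] * entropy(right)
    )
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info

if __name__ == "__main__":
    # Hypothetical digit ratios for four domains and their classes.
    digit_ratio = [0.1, 0.2, 0.8, 0.9]
    classes = ["benign", "benign", "dga", "dga"]
    for t in (0.15, 0.5):
        print(t, gain_ratio(digit_ratio, classes, t))
```

In this toy data the threshold 0.5 separates the two classes perfectly and therefore scores higher than 0.15, so a tree builder using this criterion would prefer it.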
Experimental results show that the proposed new lexical features improve detection accuracy; a detection accuracy of 93.7% was achieved. A detection algorithm based on the generated decision tree can be used for fast, real-time detection of botnet domain names.


Contents
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
Chapter 1 Introduction
1.1 Introduction
1.2 Problem statement
1.3 Objective
1.4 Organization
Chapter 2 Preliminaries
2.1 Related work
2.2 DGA and botnets theory
2.2.1 DGA
2.3 C4.5
Chapter 3 Methodology
3.1 Domain names lists
3.1.1 White list
3.1.2 Black list
3.2 Lexical features
3.2.1 Symbolic lexical features
3.2.2 Non-symbolic lexical features
3.2.3 Non-vowel lexical features
3.3 Lexical features statistics
3.3.1 Histograms calculation
3.3.2 Histograms comparison
3.4 Detection model
3.4.1 Data preprocessing
3.4.2 Lexical feature extraction
3.4.3 Decision tree generation
Chapter 4 Results
4.1 Lexical features statistics
4.1.1 Symbolic lexical features
4.1.2 Non-symbolic lexical features
4.1.3 Non-vowel lexical features
4.2 Statistics discussions
4.3 Detection accuracy with 9 good lexical features
4.4 Detection accuracy with all features
4.5 Detection accuracy for all data
4.6 Results discussion
Chapter 5 Conclusion
References



