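Author: 陳柏安
Author (romanized): Chen, Bor-An
Thesis title: 利用自動編碼器自動萃取殭屍網路特徵以偵測殭屍網路
Thesis title (English): Automatically Extract Botnet Features Using an Autoencoder to Detect Botnets
Advisor: 曾文貴
Advisor (romanized): Tzeng, Wen-Guey
Oral defense committee: 謝續平; 蔡錫鈞; 曾文貴; 孫宏民
Oral defense committee (romanized): Shieh, Shiuh-pyng; Tsai, Shi-Chun; Tzeng, Wen-Guey; Sun, Hung-Min
Oral defense date: 2017-08-27
Degree: Master's
Institution: National Chiao Tung University (國立交通大學)
Department: Institute of Network Engineering (網路工程研究所)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2018
Graduation academic year: 107 (ROC calendar)
Language: Chinese
Pages: 51
Keywords (Chinese): 殭屍網路 (botnet); 機器學習 (machine learning); 類神經網路 (neural network); 特徵萃取 (feature extraction); 監督式學習 (supervised learning); 異常偵測 (anomaly detection)
Keywords (English): Botnet; Machine learning; Neural network; Feature extraction; Supervised learning; Anomaly detection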
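Abstract (translated from the Chinese):
Botnets have long played a major role in malicious network attacks, for example by using compromised computers to send spam or to launch distributed denial-of-service (DDoS) attacks. In a well-known incident in recent years, attackers used a botnet to take control of a large number of Internet of Things devices and launched a DDoS attack that disrupted the services of well-known websites. Botnets have been studied extensively. Early work relied mainly on signature matching to determine whether a host had been infected; later work applied supervised machine learning to detect botnets. The supervised approach, however, requires researchers with relevant background knowledge to observe botnets and propose good features in order to raise the detection rate.
In view of this, we propose a fully automatic feature extraction method: an autoencoder extracts the desired number of features from a large feature set, and the extracted features are then fed to a classifier for training. We achieve an accuracy of up to 99.6% across different datasets. This not only spares researchers the effort of finding features but also reduces the dimensionality of the original feature set. Moreover, training the autoencoder does not require labeled data, so unlabeled data can be used to train it more thoroughly. Finally, we exploit the autoencoder's ability to learn from unlabeled data to build a model from normal network flows only, and use it for anomaly detection.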
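The pipeline the abstract describes (standardize the flow features, compress them with an autoencoder trained without labels, then train a classifier on the compressed features) can be illustrated roughly as follows. This is a minimal sketch and not the author's code. The table of contents lists Keras, TensorFlow and a support vector machine; to stay self-contained, the sketch instead uses NumPy only, a single tanh hidden layer, and a nearest-centroid rule standing in for the SVM. The synthetic data, all function names, the layer size and the learning rate are assumptions made for illustration.

```python
import numpy as np


def standardize(X, mean=None, std=None):
    """Z-score each feature; reuse training statistics when they are given."""
    if mean is None:
        mean, std = X.mean(axis=0), X.std(axis=0) + 1e-8
    return (X - mean) / std, mean, std


def train_autoencoder(X, n_hidden, epochs=1000, lr=0.05, seed=0):
    """One-hidden-layer autoencoder (tanh encoder, linear decoder) fitted by
    full-batch gradient descent on mean squared reconstruction error.
    No labels are used at any point."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W1 = rng.normal(0.0, 0.1, (n_features, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_features))
    b2 = np.zeros(n_features)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                 # encoder output = extracted features
        err = H @ W2 + b2 - X                    # reconstruction minus input
        losses.append(float(np.mean(err ** 2)))
        g_out = 2.0 * err / err.size             # gradient of the loss w.r.t. the reconstruction
        g_hid = (g_out @ W2.T) * (1.0 - H ** 2)  # backpropagate through tanh
        W2 -= lr * (H.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (X.T @ g_hid)
        b1 -= lr * g_hid.sum(axis=0)
    return (W1, b1), losses


def encode(X, encoder):
    """Map standardized features to the low-dimensional code."""
    W1, b1 = encoder
    return np.tanh(X @ W1 + b1)


def fit_centroids(codes, labels):
    """Stand-in classifier (NOT the thesis's SVM): one centroid per class."""
    classes = np.unique(labels)
    return classes, np.stack([codes[labels == c].mean(axis=0) for c in classes])


def predict(codes, classes, centroids):
    """Assign each code to the class of its nearest centroid."""
    dists = np.linalg.norm(codes[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]


def make_flows(n_per_class, seed=0):
    """Hypothetical synthetic 'flow features': 8 correlated columns driven by
    2 latent factors, with the two classes shifted apart in latent space."""
    rng = np.random.default_rng(seed)
    mixing = rng.normal(size=(2, 8))
    latent = np.vstack([rng.normal(0.0, 0.5, (n_per_class, 2)),
                        rng.normal(3.0, 0.5, (n_per_class, 2))])
    X = latent @ mixing + 0.1 * rng.normal(size=(2 * n_per_class, 8))
    y = np.repeat([0, 1], n_per_class)
    order = rng.permutation(len(y))
    return X[order], y[order]


if __name__ == "__main__":
    X, y = make_flows(150)
    X_train, mean, std = standardize(X[:200])
    X_test, _, _ = standardize(X[200:], mean, std)
    encoder, losses = train_autoencoder(X_train, n_hidden=3)   # 8 features -> 3
    classes, centroids = fit_centroids(encode(X_train, encoder), y[:200])
    predicted = predict(encode(X_test, encoder), classes, centroids)
    print("reconstruction loss:", losses[0], "->", losses[-1])
    print("held-out accuracy:", float(np.mean(predicted == y[200:])))
```

Because `train_autoencoder` never sees labels, additional unlabeled flows could be appended to its input without changing the classifier step, which is the property the abstract highlights.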
Abstract (author's English version):
Botnets are one of the major threats on the Internet and are used for cybercrimes such as sending spam and launching DDoS attacks. In a well-known incident in 2016, an attacker used a botnet to control a large number of IoT devices and launched a DDoS attack that interrupted several well-known network services. Many researchers have worked on botnet detection. Early work focused on signature-based detection; more recently, researchers have used machine learning techniques such as supervised learning. To apply supervised learning, however, researchers must be familiar with botnets and must analyze botnet datasets in order to propose effective features.
We propose an automated botnet feature extraction method. It uses an autoencoder to extract features from a large feature set and then trains a classifier on them. With this method we achieve an accuracy of up to 99.6% across different datasets. The method not only spares researchers the effort of finding effective features but also reduces the dimensionality of the original feature set. In addition, the autoencoder's training data need not be labeled, so unlabeled data can be used to improve its training. Finally, we exploit this property and build a model from normal network flows only, in order to perform anomaly detection.
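The anomaly detection idea in the last sentence of the abstract can be sketched as follows: fit an autoencoder on normal flows only, then flag any flow whose reconstruction error is unusually large. This is only an illustration under stated assumptions, not the thesis's procedure. The abstract does not give the decision rule, so the threshold used here (the 99th percentile of training reconstruction errors) is an assumption, as are the NumPy-only autoencoder, the function names and the synthetic data.

```python
import numpy as np


def train_autoencoder(X, n_hidden, epochs=1000, lr=0.05, seed=0):
    """Tanh-encoder / linear-decoder autoencoder trained by full-batch
    gradient descent on mean squared reconstruction error."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W1 = rng.normal(0.0, 0.1, (n_features, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_features))
    b2 = np.zeros(n_features)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)
        err = H @ W2 + b2 - X
        g_out = 2.0 * err / err.size
        g_hid = (g_out @ W2.T) * (1.0 - H ** 2)
        W2 -= lr * (H.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (X.T @ g_hid)
        b1 -= lr * g_hid.sum(axis=0)
    return W1, b1, W2, b2


def reconstruction_error(X, params):
    """Per-flow mean squared error between the input and its reconstruction."""
    W1, b1, W2, b2 = params
    X_hat = np.tanh(X @ W1 + b1) @ W2 + b2
    return np.mean((X_hat - X) ** 2, axis=1)


def fit_detector(X_normal, n_hidden=3, percentile=99.0):
    """Learn what normal traffic looks like; no botnet samples are needed.
    The percentile threshold is an assumed rule, not taken from the thesis."""
    mean, std = X_normal.mean(axis=0), X_normal.std(axis=0) + 1e-8
    Z = (X_normal - mean) / std
    params = train_autoencoder(Z, n_hidden)
    threshold = np.percentile(reconstruction_error(Z, params), percentile)
    return {"mean": mean, "std": std, "params": params, "threshold": threshold}


def is_anomalous(X, detector):
    """True for flows that the normal-traffic model reconstructs poorly."""
    Z = (X - detector["mean"]) / detector["std"]
    return reconstruction_error(Z, detector["params"]) > detector["threshold"]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    mixing = rng.normal(size=(2, 8))
    # Hypothetical normal flows: 8 correlated features driven by 2 latent factors.
    normal = rng.normal(size=(400, 2)) @ mixing + 0.1 * rng.normal(size=(400, 8))
    # Hypothetical anomalous flows: features that ignore that correlation structure.
    unusual = rng.normal(0.0, 4.0, size=(100, 8))

    detector = fit_detector(normal[:300])
    print("held-out normal flows flagged:", float(is_anomalous(normal[300:], detector).mean()))
    print("unusual flows flagged:", float(is_anomalous(unusual, detector).mean()))
```

The percentile trades false alarms against missed detections: lowering it flags more traffic, including more normal flows.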
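Contents
List of figures
List of tables
Chapter 1 Introduction
1.1. Background
1.1.1. Centralized botnet architecture
1.1.2. Peer-to-peer botnet architecture
1.1.3. Domain Generation Algorithm (DGA) botnet
1.2. Research motivation
1.3. Contributions
1.4. Thesis organization
Chapter 2 Related work
2.1. Signature-based detection
2.2. Botnet flow feature research
2.3. Botnet detection with supervised learning
2.4. Autoencoder-related research
2.5. Feature extraction
Chapter 3 Algorithms
3.1. Neural networks
3.2. Autoencoders
3.3. Support vector machines
3.4. K-means algorithm
Chapter 4 System design
4.1. Overview
4.2. Data preprocessing stage
4.2. Standardization stage
4.3. Autoencoder training stage
4.4. Support vector machine training and testing stage
4.5. Entropy-based model selection
4.6. Integration into a graphical botnet detection application
4.7. Anomaly detection with the autoencoder
Chapter 5 System implementation
5.1. Tools
5.1.1. scikit-learn
5.1.2. tensorflow
5.1.3. keras
5.2. Program flow
5.2.1. Converting pcap files to network flows
5.2.2. Computing network flow features
5.2.3. Standardizing feature vectors
5.2.4. Autoencoder preparation
5.2.5. Autoencoder training
5.2.6. Classifier training and testing
5.2.7. Anomaly detection
5.2.8. Model selection using entropy
5.2.9. Graphical botnet detection software
Chapter 6 Experimental results and discussion
6.1. Datasets
6.2. Evaluation metrics
6.3. Experimental environment
6.4. Experiment 1: Comparison with other papers
6.5. Experiment 2: Parameter selection
6.5.1. Number of hidden layers and nodes per layer
6.5.2. Choice of activation function
6.5.3. Applying the autoencoder to a new dataset
6.6. Experiment 3: Comparison with principal component analysis
6.6. Experiment 4: Comparison with the feature set without the autoencoder
Chapter 7 Conclusion
References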