National Digital Library of Theses and Dissertations in Taiwan

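Graduate student: 田偉安
Graduate student (English): Wei-An Tien
Thesis title: 基於文本分析之惡意程式分類系統
Thesis title (English): Malware Classification via n-gram Analysis
Advisor: 游家牧
Advisor (English): Chia-Mu Yu
Oral defense committee: 張經略, 周耀新
Oral defense committee (English): Ching-Lueh Chang, Yao-Hsin Chou
Oral defense date: 2015-11-14
Degree: Master's
Institution: Yuan Ze University (元智大學)
Department: Department of Computer Science and Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2016
Academic year of graduation: 104 (ROC calendar)
Language: Chinese
Number of pages: 16
Keywords (Chinese): 惡意程式分析, 機器學習, 特徵擷取, 資訊獲利, 字元陣列
Keywords (English): malware analysis, feature selection, information gain, n-gram, machine learning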
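Because the number of malware samples is growing rapidly, both static and dynamic analysis methods have been developed to identify malware. Given the type of data available, this study adopts static analysis; the objects to be classified are disassembled assembly code and binary files.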
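We first inspect the data to extract features of interest, such as assembly instructions, library names, and jump addresses, and we extract n-gram features from the binary files. The collected features are then weighted with the information gain algorithm, and the most important ones are selected.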
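The feature selection step described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: it assumes byte-level n-grams with n = 2, treats each n-gram as a binary presence feature, and uses made-up toy byte strings and class labels. The names `byte_ngrams`, `information_gain`, and `select_features` are hypothetical.

```python
import math
from collections import Counter


def byte_ngrams(data: bytes, n: int = 2) -> set:
    """Return the set of distinct n-byte sequences that occur in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def entropy(labels) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(samples, labels, gram) -> float:
    """Entropy reduction obtained by splitting the samples on presence of gram."""
    present = [y for s, y in zip(samples, labels) if gram in s]
    absent = [y for s, y in zip(samples, labels) if gram not in s]
    total = len(labels)
    conditional = (len(present) / total) * entropy(present) \
        + (len(absent) / total) * entropy(absent)
    return entropy(labels) - conditional


def select_features(samples, labels, k):
    """Rank every observed n-gram by information gain and keep the top k."""
    vocabulary = set().union(*samples)
    ranked = sorted(vocabulary,
                    key=lambda g: (-information_gain(samples, labels, g), g))
    return ranked[:k]


# Toy data (assumption): four tiny "binaries" from two hypothetical classes.
raw = [b"\x90\x90\xcc", b"\x90\x90\xeb", b"\x55\x8b\xec", b"\x55\x8b\xff"]
labels = ["A", "A", "B", "B"]
samples = [byte_ngrams(r) for r in raw]
top = select_features(samples, labels, k=2)
print(top)
```

In this toy set, the two n-grams that each appear in every sample of exactly one class separate the classes perfectly, so they receive the highest gain. On real data the vocabulary is far larger, and scoring every n-gram this way would need a more efficient counting scheme.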
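Finally, machine learning is applied: the system learns the characteristics of each malware class from the training data, and semi-supervised learning is then used to generate additional training samples and thereby improve classification. The Xgboost machine learning tool [20] is used to speed up the analysis. Experimental results show that the system classifies malware with an accuracy of 85%.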
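One common way to obtain extra training samples through semi-supervised learning is self-training with pseudo-labels; the abstract does not say which scheme the thesis uses, so the sketch below is only an assumption. To stay self-contained it also swaps in a simple nearest-centroid classifier instead of Xgboost. The confidence measure, the 0.5 threshold, and all function names are hypothetical, and the code expects at least two classes.

```python
import math


def centroids(X, y):
    """Compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}


def predict_with_confidence(model, x):
    """Return (label, confidence), with confidence = 1 - d_best / d_second."""
    dists = sorted((math.dist(x, c), label) for label, c in model.items())
    (d1, best), (d2, _) = dists[0], dists[1]
    conf = 0.0 if d2 == 0 else 1.0 - d1 / d2
    return best, conf


def self_train(X_lab, y_lab, X_unlab, threshold=0.5, max_rounds=5):
    """Repeatedly add confidently predicted unlabeled samples as pseudo-labels."""
    X, y = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    for _ in range(max_rounds):
        model = centroids(X, y)
        remaining = []
        for x in pool:
            label, conf = predict_with_confidence(model, x)
            if conf >= threshold:
                X.append(x)
                y.append(label)
            else:
                remaining.append(x)
        if len(remaining) == len(pool):  # nothing new was labeled; stop early
            break
        pool = remaining
    return centroids(X, y), pool


# Toy data (assumption): two labeled points and three unlabeled ones.
model, unresolved = self_train(
    X_lab=[[0, 0], [10, 10]], y_lab=["A", "B"],
    X_unlab=[[1, 1], [9, 9], [5, 5]],
)
print(predict_with_confidence(model, [2, 2])[0], unresolved)
```

The point midway between the two classes never reaches the confidence threshold, so it stays unlabeled rather than polluting the training set. A threshold of this kind is the main knob that trades extra training data against label noise.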

This thesis develops an efficient automatic malware classification system based on static analysis. We collected 21,746 malware samples from the Microsoft Malware Classification Challenge (BIG 2015). Each sample contains binary code and assembly code. To build the classification system, we use n-grams to capture malware characteristics. More specifically, the malware features include dynamic-link library (DLL) names, function calls, assembly code, and binary code. Information gain is then used to select distinguishing features. The proposed system can automatically classify both known malware with similar features and unknown malware. Our experimental results show that the proposed system achieves 85% classification accuracy.
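To illustrate how n-grams over assembly code could become a fixed-length feature vector for a classifier, here is a small sketch. It assumes the opcode mnemonics have already been pulled out of the disassembly (that extraction step is not shown), counts opcode bigrams, and looks the counts up in a hand-picked vocabulary. The opcode sequence, the vocabulary, and the function names are illustrative assumptions, not taken from the thesis.

```python
from collections import Counter


def token_ngrams(tokens, n=2):
    """Return the list of consecutive n-token tuples in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def feature_vector(tokens, vocabulary, n=2):
    """Count each vocabulary n-gram in tokens, in vocabulary order."""
    counts = Counter(token_ngrams(tokens, n))
    return [counts[gram] for gram in vocabulary]


# Hypothetical opcode sequence from one disassembled sample.
opcodes = ["push", "mov", "sub", "mov", "call", "mov", "call"]

# Hypothetical vocabulary, e.g. the n-grams kept after feature selection.
vocabulary = [("push", "mov"), ("mov", "call"), ("call", "ret")]

vector = feature_vector(opcodes, vocabulary)
print(vector)
```

Because every sample is projected onto the same vocabulary, the resulting vectors have the same length and can be fed to any standard classifier; n-grams absent from a sample simply contribute a zero.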
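Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Chapter Overview
Chapter 2 Related Work
Chapter 3 System Architecture
Chapter 4 Feature Collection
4.1 Binary Feature Collection
4.2 Binary Feature Selection
4.3 Assembly Feature Collection
4.4 Dynamic-Link Library Feature Collection
4.5 Jump Address Feature Collection
4.6 Other Feature Collection
Chapter 5 Classification
5.1 Model Description
5.2 Feature Vector Computation
5.3 Classification Algorithm
5.4 Ensemble Learning
5.5 Semi-Supervised Learning
5.6 Xgboost
Chapter 6 Experiments and Results
6.1 Experimental Environment
6.2 Experimental Results
6.3 Experimental Comparison
Chapter 7 Conclusion and Future Work
References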

[1] Nick Hnatiw, Tom Robinson, Casey Sheehan, Nick Suan, "Pimp My PE: Parsing Malicious and Malformed Executables."
[2] Schultz, M., Eskin, E., Zadok, E., "MEF: Malicious email filter, a UNIX mail filter that detects malicious Windows executables," in Proc. of the USENIX Annual Technical Conference, FREENIX track, pp. 245–252.
[3] Schultz, M., Eskin, E., Zadok, E., and Stolfo, S., "Data mining methods for detection of new malicious executables," in Proc. of the IEEE Symposium on Security and Privacy, pp. 178–184.
[4] Qinghua Z., Douglas S., "MetaAware: Identifying metamorphic malware," in ACSAC, 2007, pp. 411–420.
[5] L. Nataraj, S. Karthikeyan, G. Jacob, and B. S. Manjunath, "Malware Images: Visualization and Automatic Classification," in Proceedings of the 8th International Symposium on Visualization for Cyber Security, ACM, 2011, p. 4.
[6] Tzu-Yen W., Chin-Hsiung W., Chu-Cheng H., "A Virus Prevention Model Based on Static Analysis and Data Mining Methods," in CIT Workshops, 2008.
[7] Mohammad M. Masud, Latifur Khan, and B. Thuraisingham, "A scalable multi-level feature extraction technique to detect malicious executables," Information Systems Frontiers, 10(1):33–45, 2008.
[8] Mohammad M. Masud, Latifur Khan, and B. Thuraisingham, "A hybrid model to detect malicious executables," in Proc. of the IEEE International Conference on Communication (ICC'07), pp. 1443–1448.
[9] Mohammad M. Masud, Latifur Khan, and B. Thuraisingham, "Feature based techniques for auto-detection of novel email worms," in Proc. of the Eleventh Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'07), pp. 205–216.
[10] Lakhotia, A., Kumar, E. U., and Venable, M., "A method for detecting obfuscated calls in malicious binaries," IEEE Transactions on Software Engineering, 31(11), 955–968.
[11] Christopher K., William R., Fredrik V., and Giovanni V., "Static Disassembly of Obfuscated Binaries," in Proceedings of USENIX Security (USENIX04).
[12] Xin H., Kang G., Sandeep B., and Kent G., "MutantX-S: Scalable Malware Clustering Based on Static Features," in USENIX Annual Technical Conference, pp. 187–198, USENIX Association, 2013.
[13] Xin H., Kang G., "DUET: integration of dynamic and static analyses for malware clustering with cluster ensembles," in ACSAC, pp. 79–88, ACM, 2013.
[14] "Microsoft Portable Executable and Common Object File Format Specification," http://www.microsoft.com/whdc/system/platform/firmware/PECOFF.mspx
[15] Ting W., Xin H., Shicong M., and Reiner S., "Reconciling malware labeling discrepancy via consensus learning," in ICDE Workshops, pp. 84–89, IEEE, 2014.
[16] Jeremy Z. Kolter and Marcus A., "Learning to Detect and Classify Malicious Executables in the Wild," Journal of Machine Learning Research, 2006.
[17] Konrad R., Philipp T., Carsten W., and Thorsten H., "Automatic analysis of malware behavior using machine learning," Journal of Computer Security, 19(4):639–668, 2011.
[18] Nello C., John S., "An Introduction to Support Vector Machines and Other Kernel-based Learning Methods," Cambridge University Press, 2000.
[19] T. G. Dietterich, "Ensemble learning," in The Handbook of Brain Theory and Neural Networks, 2002.
[20] Xgboost, https://github.com/dmlc/xgboost
[21] Information gain, https://www.wikiwand.com/en/Information_gain_in_decision_trees
[22] O. Chapelle, B. Schölkopf, A. Zien, "Semi-supervised learning," 2006.
[23] IDA Pro, https://www.hex-rays.com/products/ida/

Electronic full text: access is restricted to the campus systems and IP range of the graduate's university.