National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)


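Author: Wei-An Tien (田偉安)
Title: Malware Classification via n-gram Analysis
Title (Chinese): 基於文本分析之惡意程式分類系統 (A Malware Classification System Based on Text Analysis)
Advisor: Chia-Mu Yu (游家牧)
Oral defense committee: Ching-Lueh Chang (張經略), Yao-Hsin Chou (周耀新)
Oral defense date: 2015-11-14
Degree: Master's
Institution: Yuan Ze University (元智大學)
Department: Computer Science and Engineering (資訊工程學系)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2016
Academic year of graduation: 104 (ROC calendar)
Language: Chinese
Number of pages: not listed
Keywords (translated from Chinese): malware analysis, machine learning, feature extraction, information gain, n-gram
Keywords (English): malware analysis, feature selection, information gain, n-gram, machine learning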


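Because the number of malware samples is growing rapidly, both static and dynamic malware analysis methods have been developed to identify malware. Given the type of data available, this study adopts static analysis; the objects to be classified are disassembled assembly code and binary files.
We first inspect the data to extract features of interest, such as assembly instructions, library names, and jump addresses, and extract n-gram features from the binary files. The collected features are then weighted with the information gain algorithm, and the important ones are selected.
Finally, machine learning lets the system learn the characteristics of each malware class from the training data. Semi-supervised learning is then used to simulate additional training samples and thereby improve classification, and the Xgboost machine learning tool [20] is used to speed up the analysis. Experimental results show that the system classifies malware with 85% accuracy.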
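
The abstract does not spell out how the semi-supervised step generates extra training samples. One common reading is self-training (pseudo-labeling): train on the labeled data, label the unlabeled samples the model is confident about, and retrain on the enlarged set. The sketch below illustrates that idea only. It is not the thesis's implementation: it swaps Xgboost for a tiny nearest-centroid classifier to stay dependency-free, and the function names, the confidence measure, and the 0.8 threshold are all assumptions.

```python
import math


def centroids(X, y):
    """Compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_with_confidence(model, x):
    """Return (nearest class, confidence).

    Confidence is a softmax over negative Euclidean distances
    (an assumed, hypothetical choice for this sketch).
    """
    dists = {c: math.dist(x, mu) for c, mu in model.items()}
    weights = {c: math.exp(-d) for c, d in dists.items()}
    label = min(dists, key=dists.get)
    return label, weights[label] / sum(weights.values())


def self_train(X_lab, y_lab, X_unlab, threshold=0.8, max_rounds=5):
    """Grow the training set with confidently pseudo-labeled samples."""
    X, y = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    for _ in range(max_rounds):
        model = centroids(X, y)
        confident, remaining = [], []
        for x in pool:
            label, conf = predict_with_confidence(model, x)
            (confident if conf >= threshold else remaining).append((x, label))
        if not confident:
            break  # nothing new to add, so further rounds would not change the model
        for x, label in confident:
            X.append(x)
            y.append(label)
        pool = [x for x, _ in remaining]
    return centroids(X, y), X, y


# Toy data: two labeled points, three unlabeled ones (one is ambiguous).
X_lab, y_lab = [[0.0, 0.0], [10.0, 10.0]], ["A", "B"]
X_unlab = [[1.0, 0.0], [9.0, 10.0], [5.0, 5.0]]
model, X_aug, y_aug = self_train(X_lab, y_lab, X_unlab)
```

In this toy run the two unlabeled points near a centroid are absorbed into the training set, while the point equidistant from both classes stays unlabeled because its confidence never reaches the threshold.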

This thesis develops an efficient automatic malware classification system based on static analysis. We collected 21,746 malware samples from the Microsoft Malware Classification Challenge (BIG 2015). Each sample contains binary code and assembly code. To build the classification system, we use n-grams to capture malware characteristics. More specifically, the malware features include dynamic-link library (DLL) names, function calls, assembly code, and binary code. Information gain is then used to select the distinguishing features. The proposed system can automatically classify both known malware with similar features and unknown malware. Our experimental results show that the proposed system achieves 85% accuracy in malware classification.
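
As a rough illustration of the n-gram and information gain steps named in the abstract, the following sketch extracts byte n-grams from raw binaries and ranks them by the information gain of the binary feature "this n-gram occurs in the sample". The abstract does not give the thesis's n-gram length, feature encoding, or cut-off, so `n=2`, the presence/absence encoding, the helper names, and the toy byte strings are all assumptions.

```python
import math
from collections import Counter


def byte_ngrams(data: bytes, n: int = 2) -> set:
    """Return the set of distinct byte n-grams occurring in `data`."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def entropy(labels) -> float:
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(samples, labels, gram: bytes) -> float:
    """Information gain of the binary feature 'gram occurs in the sample'."""
    present = [y for x, y in zip(samples, labels) if gram in x]
    absent = [y for x, y in zip(samples, labels) if gram not in x]
    total = len(labels)
    conditional = (len(present) / total) * entropy(present) \
        + (len(absent) / total) * entropy(absent)
    return entropy(labels) - conditional


def select_top_ngrams(samples, labels, n: int = 2, k: int = 2):
    """Return the k n-grams with the highest information gain."""
    vocabulary = set().union(*(byte_ngrams(x, n) for x in samples))
    # Sort by descending gain; break ties by byte order for determinism.
    ranked = sorted(vocabulary,
                    key=lambda g: (-information_gain(samples, labels, g), g))
    return ranked[:k]


# Toy "binaries" (made-up bytes) from two hypothetical malware families.
samples = [b"\x90\x90\xeb\xfe", b"\x90\x90\xcc\xcc",
           b"\xff\x15\xcc\xcc", b"\xff\x15\xeb\xfe"]
labels = ["A", "A", "B", "B"]
top = select_top_ngrams(samples, labels, n=2, k=2)
```

Here the n-grams that occur in exactly one family score the maximum gain of 1 bit, while an n-gram spread evenly over both families scores 0 and is discarded. Recomputing the gain per n-gram as above is quadratic in practice; a real pipeline over thousands of samples would precompute per-sample n-gram sets or counts.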

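Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Chapter Overview
Chapter 2 Related Work
Chapter 3 System Architecture
Chapter 4 Feature Collection
4.1 Binary Feature Collection
4.2 Binary Feature Selection
4.3 Assembly Feature Collection
4.4 Dynamic-Link Library Feature Collection
4.5 Jump Address Feature Collection
4.6 Other Feature Collection
Chapter 5 Classification
5.1 Model Description
5.2 Feature Vector Computation
5.3 Classification Algorithm
5.4 Ensemble Learning
5.5 Semi-Supervised Learning
5.6 Xgboost
Chapter 6 Experiments and Results
6.1 Experimental Environment
6.2 Experimental Results
6.3 Experimental Comparison
Chapter 7 Conclusion and Future Work
References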

[1] Nick Hnatiw, Tom Robinson, Casey Sheehan, Nick Suan, "Pimp My PE: Parsing Malicious and Malformed Executables."
[2] Schultz, M., Eskin, E., Zadok, E., "MEF: Malicious email filter, a UNIX mail filter that detects malicious Windows executables," in Proc. of the USENIX Annual Technical Conference, FREENIX Track, pp. 245–252.
[3] Schultz, M., Eskin, E., Zadok, E., and Stolfo, S., "Data mining methods for detection of new malicious executables," in Proc. of the IEEE Symposium on Security and Privacy, pp. 178–184.
[4] Qinghua Z., Douglas S., "MetaAware: Identifying metamorphic malware," in ACSAC, 2007, pp. 411–420.
[5] L. Nataraj, S. Karthikeyan, G. Jacob, and B. S. Manjunath, "Malware Images: Visualization and Automatic Classification," in Proceedings of the 8th International Symposium on Visualization for Cyber Security, ACM, 2011, p. 4.
[6] Tzu-Yen W., Chin-Hsiung W., Chu-Cheng H., "A Virus Prevention Model Based on Static Analysis and Data Mining Methods," in CIT Workshops, 2008.
[7] Mohammad M. Masud, Latifur Khan, and B. Thuraisingham, "A scalable multi-level feature extraction technique to detect malicious executables," Information Systems Frontiers, 10(1):33–45, 2008.
[8] Mohammad M. Masud, Latifur Khan, and B. Thuraisingham, "A hybrid model to detect malicious executables," in Proc. of the IEEE International Conference on Communication (ICC'07), pp. 1443–1448.
[9] Mohammad M. Masud, Latifur Khan, and B. Thuraisingham, "Feature based techniques for auto-detection of novel email worms," in Proc. of the Eleventh Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'07), pp. 205–216.
[10] Lakhotia, A., Kumar, E. U., and Venable, M., "A method for detecting obfuscated calls in malicious binaries," IEEE Transactions on Software Engineering, 31(11), 955–968.
[11] Christopher K., William R., Fredrik V., and Giovanni V., "Static Disassembly of Obfuscated Binaries," in Proceedings of USENIX Security (USENIX04).
[12] Xin H., Kang G., Sandeep B., and Kent G., "MutantX-S: Scalable Malware Clustering Based on Static Features," in USENIX Annual Technical Conference, pp. 187–198, USENIX Association, 2013.
[13] Xin H., Kang G., "DUET: Integration of dynamic and static analyses for malware clustering with cluster ensembles," in ACSAC, pp. 79–88, ACM, 2013.
[14] "Microsoft Portable Executable and Common Object File Format Specification," http://www.microsoft.com/whdc/system/platform/firmware/PECOFF.mspx
[15] Ting W., Xin H., Shicong M., and Reiner S., "Reconciling malware labeling discrepancy via consensus learning," in ICDE Workshops, pp. 84–89, IEEE, 2014.
[16] Jeremy Z. Kolter and Marcus A., "Learning to Detect and Classify Malicious Executables in the Wild," Journal of Machine Learning Research, 2006.
[17] Konrad R., Philipp T., Carsten W., and Thorsten H., "Automatic analysis of malware behavior using machine learning," Journal of Computer Security, 19(4):639–668, 2011.
[18] Nello C., John S., "An Introduction to Support Vector Machines and Other Kernel-based Learning Methods," Cambridge University Press, 2000.
[19] T. G. Dietterich, "Ensemble learning," in The Handbook of Brain Theory and Neural Networks, 2002.
[20] Xgboost, https://github.com/dmlc/xgboost
[21] Information gain, https://www.wikiwand.com/en/Information_gain_in_decision_trees
[22] O. Chapelle, B. Schölkopf, A. Zien, "Semi-supervised learning," 2006.
[23] IDA Pro, https://www.hex-rays.com/products/ida/

Electronic full text: access is restricted to the on-campus systems and IP range of the author's university.
