National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

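Graduate student: 賴谷鑫
Graduate student (English): Gu-Hsin Lai
Thesis title (Chinese): 適應性之伺服器端垃圾郵件過濾系統之研究
Thesis title (English): An Adaptive Server-Side Anti-Spam System
Advisors: 陳嘉玫, 鄭炳強
Advisors (English): Chia-Mei Chen, Bing-Chiang Jeng
Degree: Doctoral
Institution: National Sun Yat-sen University (國立中山大學)
Department: Department of Information Management (資訊管理學系研究所)
Discipline: Computer Science
Field: General Computer Science
Thesis type: Academic thesis
Year of publication: 2009
Academic year of graduation: 97 (ROC calendar)
Language: English
Number of pages: 76
Keywords (Chinese): 資料探勘 (data mining), 垃圾郵件 (spam mail), 統計檢定 (statistical testing)
Keywords (English): Data mining, Statistical testing, Spam mail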
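The proliferation of spam has become a major threat to the Internet. Besides commercial mail, malicious messages such as phishing, online fraud, pornography, and malware are all distributed through spam. Spam has a significant impact on individuals, organizations, and society, so solving the spam problem is an urgent priority.
A practical server-side mail filter needs three capabilities: (1) accurately filtering a large volume of spam; (2) recognizing new types of spam; and (3) automatically managing the growing number of spam rules on the mail server. Most current spam research focuses on a single aspect, namely the construction of spam rules. In the real world, however, spam prevention involves more than applying data mining techniques to generate filtering rules; it must also address issues beyond rule generation.
This research proposes and integrates three sub-systems as an architecture for spam prevention: a spam rule generation sub-system, a spam rule sharing sub-system, and a spam rule management sub-system. A rule-based data mining algorithm generates spam rules that can be shared and managed; the latest spam information is shared among servers in an XML file format; and rule management relies on statistical testing to automatically enable effective rules and disable inaccurate ones. This research aims to design and integrate these three sub-systems to achieve the goal of spam prevention.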
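As an illustration of sharing rules among servers in XML, the following minimal Python sketch serializes rules and parses them back. The abstract does not describe the thesis's actual XML schema, so the element and attribute names (`spam_rules`, `rule`, `condition`, `feature`, `value`) and the rule fields are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET


def rules_to_xml(rules):
    """Serialize a list of rule dicts into an XML string.

    The schema used here is an assumption for illustration only;
    it is not the format defined in the thesis.
    """
    root = ET.Element("spam_rules")
    for rule in rules:
        node = ET.SubElement(root, "rule", id=rule["id"], label=rule["label"])
        for feature, value in rule["conditions"].items():
            ET.SubElement(node, "condition", feature=feature, value=str(value))
    return ET.tostring(root, encoding="unicode")


def rules_from_xml(text):
    """Parse the XML produced by rules_to_xml back into rule dicts."""
    root = ET.fromstring(text)
    rules = []
    for node in root.findall("rule"):
        conditions = {
            c.get("feature"): c.get("value") for c in node.findall("condition")
        }
        rules.append(
            {"id": node.get("id"), "label": node.get("label"), "conditions": conditions}
        )
    return rules


# Hypothetical rule: "if the mail is HTML and the sender domain is invalid, label it spam".
example_rules = [
    {
        "id": "r1",
        "label": "spam",
        "conditions": {"has_html": "yes", "sender_domain_valid": "no"},
    }
]
shared_document = rules_to_xml(example_rules)
received_rules = rules_from_xml(shared_document)
```

A receiving server would parse the shared document with `rules_from_xml` and hand the resulting rules to its own rule management step. All attribute values are kept as strings, so a round trip returns the same rule dicts.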
The spread of spam mail has become a serious threat on the Internet. In addition to commercial messages, malicious content such as phishing, pornographic messages, fraudulent messages, and malicious code is spread via spam.
A practical server-side anti-spam system should be able to (1) correctly filter out the growing volume of spam mail; (2) recognize new types of spam mail; and (3) manage the increasing number of spam rules automatically. Most prior work focuses on only a single aspect of spam prevention, typically spam rule generation. In the real world, however, spam prevention is not just a matter of applying a data mining algorithm to generate rules. To filter spam correctly and efficiently in practice, many issues beyond spam rule generation must be considered.
In this research, we propose and integrate three sub-systems to form a practical anti-spam system: a spam rule generation sub-system, a spam rule sharing sub-system, and a spam rule management sub-system. A rule-based data mining approach is used to generate manageable and shareable spam rules. The latest spam rules are shared in a machine-readable XML format. Spam rules stored on mail servers are managed using a statistical testing approach: the rule management sub-system automatically enables high-performance rules and disables out-of-date rules, reducing the miss rate and improving the efficiency of the spam filter. This research will develop and integrate the three sub-systems to achieve the goal of spam prevention.
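The idea of enabling and disabling rules by statistical testing can be sketched as follows. The abstract does not say which test the thesis uses, so this example assumes an exact one-sided binomial test on each rule's precision. The function names, the target precision of 0.9, and the significance level of 0.05 are all hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def decide(correct, fired, target_precision=0.9, alpha=0.05):
    """Return 'disable', 'enable', or 'keep' for one rule.

    correct: times the rule fired on mail that really was spam
    fired:   total times the rule fired
    The binomial test and both thresholds are assumptions for illustration,
    not the statistical model defined in the thesis.
    """
    if fired == 0:
        return "keep"  # no evidence either way
    # Lower tail: significantly fewer correct hits than the target implies.
    if binom_cdf(correct, fired, target_precision) < alpha:
        return "disable"
    # Upper tail: P(X >= correct), significantly more correct hits than the target.
    upper_tail = 1.0 - binom_cdf(correct - 1, fired, target_precision) if correct > 0 else 1.0
    if upper_tail < alpha:
        return "enable"
    return "keep"


stale_rule = decide(correct=70, fired=100)    # far below the 0.9 target
strong_rule = decide(correct=100, fired=100)  # consistently correct
unclear_rule = decide(correct=9, fired=10)    # too few observations to decide
```

Basing the decision on a test instead of a raw precision threshold means a rule is not switched on or off from a handful of observations: with only 10 firings, 9 correct hits is consistent with the target either way, so the rule's state is left unchanged.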
1. Introduction
2. Literature Review
2.1 Overview of Anti-Spam Solutions
2.2 Mail feature selection review
2.3 Mail filter review
3. The Proposed Approach
3.1 Rules Generation Sub-system
3.2 Spam rule sharing
3.3 Spam Rule Management
3.4 Statistical Model
3.5 Rule Conflict
4. System Demonstration
5. Performance Evaluation
5.1 Performance Metrics
5.2 Experiments environment
5.3 Evaluation of rule sharing
5.4 Evaluation of rule management
5.5 Evaluation of proposed approach
6. Conclusion
7. Reference