National Digital Library of Theses and Dissertations in Taiwan

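Graduate student: 洪偉達
Graduate student (English): Hong, Wei-Da
Thesis title (Chinese): 用於網路漏洞掃描器之選擇性進入點探查
Thesis title (English): Selective Entry Point Crawling for Web Vulnerability Scanner
Advisor: 謝續平
Advisor (English): Shieh, Shiuhpyng
Oral defense committee: 范俊逸, 曾文貴, 周國森
Oral defense committee (English): Fan, Chun-I; Tzeng, Wen-Guey; Chou, Kuo-Sen
Oral defense date: 2017-08-17
Degree: Master's
Institution: National Chiao Tung University
Department: Institute of Computer Science and Engineering
Discipline: Engineering
Subdiscipline: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2017
Graduation academic year: 106 (ROC calendar)
Language: English
Number of pages: 30
Keywords (Chinese): 漏洞偵測, 網路安全, 網路爬蟲
Keywords (English): Vulnerability Testing; Web Security; Web Crawler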
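During vulnerability scanning of a website, entry points are the input channels for many kinds of injection attacks, so finding them is the first and most important step. For this reason, a web vulnerability scanner usually includes a crawler component that extracts entry points from the target website.
As new web technologies are applied in website implementations, crawler research and implementations must keep improving to cope with them. Taking web vulnerability scanning as an example, because entry points have an equivalence property, a crawler that lacks a good selection policy for choosing the next page to analyze tends to spend a large share of its computing resources on similar pages. The crawler then collects only duplicate entry points, and the subsequent vulnerability scan cannot trigger more distinct code sections. Existing crawler research has proposed selection policies, but these mainly target website structure mapping or searching for specific content rather than the needs of web vulnerability scanning, so they do not help a crawler find the entry points best suited to vulnerability scanning.
This study proposes a new selection policy that improves entry point discovery for web vulnerability scanning. The general idea is to collect web pages that have not yet been analyzed, compute the structural differences among their contents, quantify those differences, and then select the page that most needs analysis at that moment according to the quantified result. To verify the policy's effect on entry point discovery and its execution efficiency, this thesis implements it in an existing crawler system and uses the eBay website, the BBC website, and the WIVET Benchmark as test data. The experiments show that the proposed selection policy does help a web crawler collect more distinct entry points for web vulnerability scanning.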
Entry point collection is the most important preparatory step for web vulnerability scanning, because serious web injection attacks such as SQL injection and XSS are carried out through these entry points. As a result, most web vulnerability scanners (WVS) are equipped with a crawler that collects entry points from the target website.
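As a rough illustration of what collecting entry points from a page can involve, the sketch below pulls parameterized links and form fields out of static HTML using only the Python standard library. The names `EntryPointParser` and `extract_entry_points`, the tuple layout, and the example URL are hypothetical and not taken from the thesis; a crawler for modern sites would also need to handle JavaScript-generated content, which this sketch ignores.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit, parse_qsl


class EntryPointParser(HTMLParser):
    """Collects (method, url, parameter names) tuples from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.entry_points = []
        self._form = None  # form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            parts = urlsplit(urljoin(self.base_url, attrs["href"]))
            params = tuple(n for n, _ in parse_qsl(parts.query, keep_blank_values=True))
            if params:  # a plain link without parameters accepts no input
                bare = parts._replace(query="", fragment="").geturl()
                self.entry_points.append(("GET", bare, params))
        elif tag == "form":
            self._form = {
                "method": (attrs.get("method") or "GET").upper(),
                "url": urljoin(self.base_url, attrs.get("action") or ""),
                "params": [],
            }
        elif tag in ("input", "textarea", "select") and self._form is not None:
            if attrs.get("name"):  # unnamed controls are not submitted
                self._form["params"].append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form" and self._form is not None:
            form = self._form
            self.entry_points.append((form["method"], form["url"], tuple(form["params"])))
            self._form = None


def extract_entry_points(html, base_url):
    """Hypothetical helper: returns the entry points found in one page."""
    parser = EntryPointParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.entry_points
```

Each returned tuple names one place where a scanner could inject input: the HTTP method, the target URL, and the parameter names it accepts.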
As new techniques are adopted in modern web development, web crawler methodology must improve to handle varied websites and scenarios. Web vulnerability testing, one of the most important applications of web crawlers, is a case in point. A crawler that lacks a selection policy for choosing the next page to analyze is prone to spending most of its computational resources on similar pages. Because entry points have an equivalence property, such a crawler may then find mostly duplicate entry points. Some research has addressed selection policies, but those policies focus on sketching web structure, searching for content of interest, or other user-oriented purposes. They are therefore not well suited to crawling entry points for web vulnerability testing.
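The abstract does not define the equivalence property, so the following sketch assumes one plausible reading: two requests are the same entry point when they share the HTTP method, host, path, and set of parameter names, regardless of parameter values. The functions `equivalence_key` and `count_distinct` are hypothetical names used only for illustration.

```python
from urllib.parse import urlsplit, parse_qsl


def equivalence_key(method, url):
    """Assumed equivalence: same method, host, path, and parameter-name set."""
    parts = urlsplit(url)
    names = frozenset(n for n, _ in parse_qsl(parts.query, keep_blank_values=True))
    return (method.upper(), parts.netloc.lower(), parts.path, names)


def count_distinct(requests):
    """Counts entry points after collapsing equivalent (method, url) pairs."""
    return len({equivalence_key(method, url) for method, url in requests})


# Three product pages differ only in the value of "id", so under the
# assumed equivalence they collapse into a single entry point.
crawled = [
    ("GET", "http://example.test/item?id=1"),
    ("GET", "http://example.test/item?id=2"),
    ("GET", "http://example.test/item?id=3"),
    ("GET", "http://example.test/search?q=shoes"),
]
distinct = count_distinct(crawled)
```

Under this assumption, a crawler that visits many pages generated from the same template gains few new entry points, which is the waste of resources the abstract describes.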
This thesis proposes a novel selection policy that adapts existing crawlers to the requirements of web vulnerability testing. The design has four steps: collect the web pages that have not yet been analyzed, calculate the structural diversity among these pages, quantize the diversity values, and select the next page to analyze according to the quantized result. To validate the effectiveness and efficiency of the proposed selection policy, we implement it and integrate it into an existing web crawler. Field tests are run against the eBay website, the BBC website, and the WIVET benchmark. The results show that the proposed selection policy does help web crawlers collect more distinct entry points suitable for web vulnerability scanners.
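The four steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's algorithm: the abstract does not say how structural diversity is measured or quantized, so the sketch assumes a page's structure is the set of its root-to-element tag paths, diversity is the Jaccard distance between two such sets, and quantization rounds into integer buckets. All function names and the `levels` parameter are hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagPathCollector(HTMLParser):
    """Records the set of root-to-element tag paths as a structural signature."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.paths = set()

    def handle_starttag(self, tag, attrs):
        self.paths.add("/".join(self.stack + [tag]))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def signature(html):
    collector = TagPathCollector()
    collector.feed(html)
    collector.close()
    return frozenset(collector.paths)


def diversity(sig_a, sig_b):
    """Jaccard distance between two signatures (0 = identical, 1 = disjoint)."""
    union = sig_a | sig_b
    if not union:
        return 0.0
    return 1.0 - len(sig_a & sig_b) / len(union)


def quantize(value, levels=10):
    """Maps a diversity in [0, 1] to an integer bucket in 0..levels."""
    return round(value * levels)


def select_next(pending, analyzed_sigs, levels=10):
    """Picks the pending URL whose page differs most from every analyzed page.

    pending: dict mapping URL -> HTML of pages not yet analyzed.
    analyzed_sigs: signatures of pages that were already analyzed.
    """
    def score(item):
        sig = signature(item[1])
        nearest = min((diversity(sig, seen) for seen in analyzed_sigs), default=1.0)
        return quantize(nearest, levels)

    return max(pending.items(), key=score)[0]
```

For example, if the crawler has already analyzed one product page, another page built from the same template scores 0, while a search page containing a form scores higher and is selected first. Scoring each pending page by its distance to the nearest analyzed page is one design choice among several; it favors pages that resemble nothing seen so far.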
Abstract (Chinese)
Abstract
Acknowledgements
Table of Contents
List of Figures
1. Introduction
2. Background
2.1. Design and Workflow of Web Crawlers
2.2. Evolution of Web Crawlers
2.3. Requirements of a Crawler for Web Vulnerability Scanners
3. Related Work
3.1. Page Analysis
3.2. Selection Policy
4. Proposed Scheme
4.1. Architecture
4.2. Page Analyzer
4.3. Scheduler
5. Evaluation
5.1. Functionality of Page Analyzer
5.2. Number of Entry Points
5.3. Overhead of Selection Policy
6. Conclusion
7. References