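Graduate student: 紀冬音
Graduate student (English name): Tung-Yin Chi
Thesis title: Hash-based OpenFlow Packet Classification on Heterogeneous System Architecture
Advisor: 張燕光
Advisor (English name): Yeim-Kuan Chang
Degree: Master's
Institution: National Cheng Kung University
Department: Department of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2016
Graduation academic year: 104
Language: English
Number of pages: 69
Keywords (Chinese): 封包分類, OpenFlow, APU, GPU, Hash Table, 異質性系統架構
Keywords (English): Packet Classification, OpenFlow, APU, GPU, Hash Table, HSA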
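Packet classification is an essential function in today's network architecture. It assists or enables functions and services such as packet forwarding, Quality of Service (QoS) management, firewalls, traffic control, and virtual private networks (VPNs). With the growth of networks and the emergence of software-defined networking (SDN), packet classification methods designed for traditional 5-dimensional rules are no longer adequate for today's rule sets with twelve or more dimensions. The main problem now is how to handle rules with more than twelve dimensions and rule sets with many more entries while still achieving high-speed packet classification. To achieve high-speed classification, some existing methods are implemented on GPUs: some use a single hash table for the search ([11]), others use a binary range tree ([6], [10]), and one combines a hash-based method with tuple space ([22]). Because every dimension that a 12-dimensional rule set adds beyond the traditional five is either a single value or a wildcard, the first two approaches cannot handle such rule sets efficiently. A further problem is that ordinary GPU computing requires input data and results to be transferred between GPU memory and main memory over the PCI-E bus, and the bus latency is a major bottleneck.
In this thesis, we propose a modified hash table to handle the dimensions that are either a single value or a wildcard, and we use compression to reduce memory usage. We also implement the method on an APU with Heterogeneous System Architecture, which bypasses the bus latency and removes the bottleneck. According to experimental results on an AMD A10-7850 APU, our method achieves a throughput of 1586 to 1983 MPPS (million packets per second) with a 12-dimensional rule set of 12K rules, nearly a tenfold improvement over the same method implemented on a conventional GPU. Compared with the FPGA-based method ([3]), throughput increases by about 600 MPPS. Compared with other GPU-based methods ([6], [20]), throughput improves by about 10 to 40 times.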
Packet classification is a key component of today’s network architecture. It supports or enables packet forwarding, Quality of Service (QoS), firewalls, traffic control, and virtual private networks (VPNs). With the development of the Internet and the emergence of software-defined networking (SDN), methods designed for traditional 5-dimensional rule sets cannot efficiently process current rule sets, whose rules have 12 or more dimensions. The main problem is how to process rule sets with 12 or more dimensions while still achieving high throughput. To achieve high throughput, several methods have been implemented on GPUs: some use a single hash table for the search ([11], [21]), others use a binary range tree ([6], [10], [20]), and [22] combines a hash-based method with tuple space. Because the 7 new fields in a 12-dimensional rule set are all either exact values or wildcards, neither a single hash table nor a binary range tree handles them efficiently. A second problem is that computing on a conventional GPU requires transferring input data and results over the PCI-E bus, and the bus latency is a major bottleneck.
In this thesis, we propose a modified hash table to process the fields that are exact values or wildcards, and we use a compression method to reduce memory consumption. In addition, we implement this method on an APU that uses Heterogeneous System Architecture, which avoids the bus delay between host and device. According to experimental results on an AMD A10-7850 APU, our method achieves a throughput of 1586 to 1983 MPPS (million packets per second) when the rule set contains 12K 12-dimensional rules, with a memory consumption of 38 MB. This throughput is 10 times that of the same method implemented on a legacy GPU. It also exceeds the 1250 MPPS achieved by the FPGA-based method ([3]), and it is 10 to 40 times that of other GPU-based methods ([6], [11]).
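To make the idea of hashing over fields that are either an exact value or a wildcard more concrete, the following is a minimal Python sketch. It is not the thesis's 3-layer hash tree and does not include its compression step; it simply groups rules by which fields are exact and keeps one hash table per group, in the spirit of a tuple-space lookup. The class name `ExactOrWildcardTable`, the `WILDCARD` marker, and the convention that a larger number means higher priority are all assumptions made for illustration.

```python
from typing import Dict, Optional, Sequence, Tuple

WILDCARD = None  # hypothetical marker for a "don't care" field


class ExactOrWildcardTable:
    """Illustrative matcher for rules whose fields are exact values or wildcards."""

    def __init__(self) -> None:
        # mask (True = field is exact) -> {exact field values -> (priority, rule_id)}
        self.tables: Dict[Tuple[bool, ...], Dict[Tuple[int, ...], Tuple[int, int]]] = {}

    def insert(self, rule_id: int, priority: int,
               fields: Sequence[Optional[int]]) -> None:
        # The mask records which fields take part in the hash key.
        mask = tuple(f is not WILDCARD for f in fields)
        key = tuple(f for f in fields if f is not WILDCARD)
        table = self.tables.setdefault(mask, {})
        current = table.get(key)
        # Assumption: a larger priority value wins when two rules share a key.
        if current is None or priority > current[0]:
            table[key] = (priority, rule_id)

    def lookup(self, packet: Sequence[int]) -> Optional[int]:
        best: Optional[Tuple[int, int]] = None
        # Probe one hash table per mask; wildcard fields are left out of the key.
        for mask, table in self.tables.items():
            key = tuple(v for v, exact in zip(packet, mask) if exact)
            hit = table.get(key)
            if hit is not None and (best is None or hit[0] > best[0]):
                best = hit
        return None if best is None else best[1]


# Usage with three fields for brevity (the thesis deals with 7 such fields).
table = ExactOrWildcardTable()
table.insert(rule_id=1, priority=10, fields=(80, WILDCARD, 6))
table.insert(rule_id=2, priority=5, fields=(WILDCARD, WILDCARD, 6))
table.insert(rule_id=3, priority=20, fields=(80, 7, 6))
matched = table.lookup((80, 7, 6))  # all three rules match; rule 3 has top priority
```

Each lookup costs one hash probe per distinct wildcard pattern present in the rule set, which is why a scheme like this only stays fast when the number of patterns is small; the thesis's layered structure and compression presumably address this cost in their own way.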
Abstract (in Chinese)
Abstract
Acknowledgements
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
Chapter 1 Introduction
1.1 Introduction
1.2 Organization of the Thesis
Chapter 2 Related Work
2.1 OpenFlow
2.1.1 Flow Table
2.1.2 OpenFlow Switch Components
2.1.3 Flexible Rule Generator (FRuG)
2.2 Cuckoo Hashing
2.3 Bloom Filter
2.4 Accelerated Processing Unit (APU)
2.4.1 Heterogeneous System Architecture (HSA)
2.4.2 Heterogeneous Uniform Memory Access (hUMA)
2.4.3 Heterogeneous Queuing (hQ)
2.4.4 C++ Accelerated Massive Parallelism (C++ AMP)
2.4.5 HCC Compiler
2.4.6 Low Level Virtual Machine (LLVM)
2.4.7 Clang Compiler
Chapter 3 Proposed Scheme
3.1 Overview
3.2 3-Layer Hash Tree
3.2.1 Bloom Filter and Possibility Bitmap
3.2.2 Layer 1
3.2.3 Layer 2
3.2.4 Layer 3
3.3 Compress L2 Hash Table
3.4 Cache and MicroFlow Table
3.5 Update Compressed Table
3.6 Optimize the Utilization of Stream Processors (SP Optimize)
3.7 Update Data Structure of Bloom Filter Phase
Chapter 4 Experimental Result
4.1 Experimental Environment
4.2 Experimental Result
Chapter 5 Conclusion
References
[1] W. Jiang and V. K. Prasanna, “Scalable Packet Classification on FPGA,” IEEE Trans. VLSI Syst., vol. 20, no. 9, pp. 1668–1680, 2012.
[2] V. Pus and J. Korenek, “Fast and Scalable Packet Classification using Perfect Hash Functions,” in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA), 2009, pp. 229–236.
[3] T. Ganegedara and V. K. Prasanna, “StrideBV: Single Chip 400G+ Packet Classification,” in 13th IEEE International Conference on High Performance Switching and Routing (HPSR), 2012, pp. 1–6.
[4] Y. Ma, S. Banerjee, S. Lu, and C. Estan, “Leveraging Parallelism for Multi-dimensional Packet Classification on Software Routers,” SIGMETRICS Perform. Eval. Rev., vol. 38, no. 1, 2010, pp. 227–238.
[5] Yun Qu, Shijie Zhou, Viktor K. Prasanna, “Scalable Many-Field Packet Classification on Multi-core Processors,” in 25th International Symposium on Computer Architecture and High Performance Computing, 2013, pp. 33–40.
[6] Shijie Zhou, Shreyas G. Singapura, Viktor K. Prasanna, “High-performance packet classification on GPU,” High Performance Extreme Computing Conference (HPEC), 2014, pp. 1–6.
[7] T. Ganegedara, W. Jiang, and V. Prasanna, “FRuG: A benchmark for packet forwarding in future networks,” in Proc. IPCCC '10, 2010.
[8] OpenFlow Foundation, “OpenFlow Switch Specification Version 1.0.0.” Available: http://www.openflowswitch.org/documents/openflow-spec-v1.0.0.pdf
[9] HSA Foundation, “DRAFT HSA Platform System Architecture Specification 1.1.” Available: http://www.hsafoundation.com/?ddownload=5114
[10] Yun R. Qu, Hao H. Zhang, Shijie Zhou, Viktor K. Prasanna, “Optimizing many-field packet classification on FPGA, multi-core general purpose processor, and GPU,” Architectures for Networking and Communications Systems (ANCS), 2015, pp. 87–98.
[11] Nobutaka Matsumoto, Michiaki Hayashi, “LightFlow: Speeding up GPU-based flow switching and facilitating maintenance of flow table,” IEEE 13th International Conference on High Performance Switching and Routing, 2012, pp. 76–81.
[12] Andrei Broder, Michael Mitzenmacher, “Network Applications of Bloom Filters: A Survey,” Internet Mathematics, 2004, pp. 485–509.
[13] Bin Fan, David G. Andersen, Michael Kaminsky, Michael D. Mitzenmacher, “Cuckoo Filter: Practically Better Than Bloom,” 10th ACM International on Conference on emerging Networking Experiments and Technologies, 2014, pp. 75–88.
[14] Wikipedia, “Heterogeneous System Architecture.” Available: https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture
[15] Weirong Jiang, Viktor K. Prasanna, Norio Yamagaki, “Decision Forest: A Scalable Architecture for Flexible Flow Matching on FPGA,” International Conference on Field Programmable Logic and Applications, 2010, pp. 394–399.
[16] Yanbiao Li, Dafang Zhang, Alex X. Liu, Jintao Zheng, “GAMT: a fast and scalable IP lookup engine for GPU-based software routers,” Proceedings of the ninth ACM/IEEE symposium on Architectures for networking and communications systems, 2013, pp. 1–12.
[17] Luke McHale, Jasson Casey, Paul V. Gratz, Alex Sprintson, “Stochastic Pre-Classification for SDN Data Plane Matching,” IEEE 22nd International Conference on Network Protocols, 2014, pp. 596–602.
[18] Nen-Fu Huang, Shi-Ming Zhao, Jen-Yi Pan, Chi-An Su, “A Fast IP Routing Lookup Scheme for Gigabit Switching Routers,” INFOCOM '99, Eighteenth Annual Joint Conference of the IEEE Computer and Communications Societies, Proceedings, vol. 3, 1999, pp. 1429–1436.
[19] Kate Gregory, Ade Miller, “C++ AMP: Accelerated Massive Parallelism with Microsoft Visual C++,” 2012.
[20] Cheng-Liang Hsieh, Ning Weng, “Many-Field Packet Classification for Software-Defined Networking Switches,” ANCS '16 Proceedings of the 2016 Symposium on Architectures for Networking and Communications Systems, 2016, pp. 13–24.
[21] Voravit Tanyingyong, Markus Hidell, Peter Sjödin, “Using hardware classification to improve PC-based OpenFlow switching,” IEEE 12th International Conference on High Performance Switching and Routing, 2011, pp. 215–221.
[22] Matteo Varvello, Rafael Laufer, Feixiong Zhang, T.V. Lakshman, “Multi-Layer Packet Classification with Graphics Processing Units,” Proceedings of the 10th ACM International on Conference on emerging Networking Experiments and Technologies, 2014, pp. 109–120.