National Digital Library of Theses and Dissertations in Taiwan

Author: Yi-Neng Lin (林義能)
Thesis title: Resource Allocation in Multithreaded Multiprocessor Network Processors for Computational Intensive and Memory Access Intensive Network Applications
Advisor: Ying-Dar Lin (林盈達)
Degree: Doctoral
Institution: National Chiao Tung University
Department: Institute of Computer Science and Engineering
Field: Engineering
Discipline: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2007
Graduation academic year: 95 (ROC calendar)
Language: English
Pages: 105
Keywords: network processor; resource allocation
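Abstract (translated from the Chinese version):
Processing today's network applications requires a powerful hardware platform that can cope with ever-growing computation and memory access loads. The platform must also adapt efficiently as protocols or product specifications change. The long-established general-purpose processor architecture is often held back by kernel-user space communication and context switch overhead, while the common ASIC approach falls short because of its long development time and the difficulty of modifying it.
This thesis examines (1) the feasibility of using the increasingly popular network processor architecture to accelerate network packet processing, where such a network processor contains multiple processors, each with multiple hardware threads, and offers rich hardware resources, low context switch overhead, and adaptability, and (2) how to allocate hardware resources when this platform handles network applications with different computation or memory access loads. We first survey a variety of network processors and divide them into two classes, coprocessors-centric and core-centric. In the former, the coprocessors handle the data plane, which accounts for most of the packet processing work; in the latter, the core processor handles all of the control plane and most of the data plane. We then implement memory-access-intensive and computation-intensive network applications on coprocessors-centric and core-centric network processors, respectively, and evaluate their performance. Finally, based on the implementation experience, we develop mathematical models and simulation environments, aiming to provide guidance for designing and using these two architectures.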
Abstract:
Networking applications today demand a hardware platform with stronger computational and memory access capabilities, as well as the ability to adapt efficiently to changes in protocols or product specifications. Neither of the conventional options measures up, however: a general-purpose processor architecture is usually slowed down by kernel-user space communication and context switches, while an ASIC lacks flexibility and requires a long development period.
In this thesis, we discuss (1) the feasibility of solving the problem with an emerging alternative, network processors, which feature a multithreaded multiprocessor architecture, rich resources, low context switch overhead, and flexibility, and (2) ways of exploiting those resources when dealing with applications that have different computational and memory access requirements. We start by surveying network processors and categorizing them into two types, coprocessors-centric and core-centric. In the former, the coprocessors take care of data plane processing, whose load is usually much heavier than that of the control plane, while in the latter the core processor handles most of the packet processing, including the control plane and the data plane. We then evaluate real implementations of memory-access-intensive and computation-intensive applications over the coprocessors-centric and core-centric platforms, respectively, aiming to reveal the bottlenecks of the implementations as well as suitable allocation measures. Finally, based on the evaluations, analytical models are formalized and simulation environments are built to derive possible design implications for these two types of network processors.
Table of contents:
1. Introduction
1.1 Challenges of Hardware Platforms for Modern Networking Applications
1.2 The Importance of Resource Allocation for Network Processors
1.3 Coprocessors-centric and Core-centric Network Processors
1.4 Related Works
1.4.1 Application Design and Implementation
1.4.2 Mathematical Modeling and Simulation
1.5 Thesis Objective and Dissertation Road Map
2. Research Methodologies
2.1 Application Design and Implementation
2.1.1 Software Architecture of IXP425
2.1.2 Software Architecture of IXP2400
2.1.3 Performance Benchmark
2.2 Mathematical Modeling and Simulation
3. Resource Allocation of the Coprocessors-centric Network Processors for Memory Access Intensive Applications
3.1 Introduction
3.2 Hardware Platform (IXP2400)
Detailed Packet Flow in IXP2400
3.3 Problem Statements
3.4 Design and Implementation
3.4.1 NIDS Briefing
3.4.2 Design Issues
3.4.3 Mapping Processing Stages to the Hardware Platform
3.4.4 Algorithms Adopted and Packet Inspection
3.5 System Benchmark and Bottleneck Analysis
3.5.1 Benchmark Setup
3.5.2 Effect of Improper ME/Thread Allocations
3.5.3 Estimating the Optimal (I,J) Pair
3.5.4 Effectiveness of Multiple Memory Banks
3.6 Summary
4. Coprocessors-centric Network Processors: Analysis, Simulation, and Design Implications
4.1 Introduction
4.2 Effect of Different Thread Allocation Schemes
4.3 Overview of the Analytical Model
4.4 Markov Chain Formalization
4.4.1 State Definition and State Space Determination
4.4.2 Determination of the Status Transition Diagram and State Transition Matrix
4.4.3 Determination of the State Transition Matrix
4.4.4 Performance Estimation for the Analytical Model
4.5 Simulation and Analytical Model Validation
4.5.1 Design of the Petri Net Based Simulation Environment
4.5.2 Model Validation By the Simulation
4.5.3 Simulation Setup
4.5.4 Effect of the RSS Memory Queuing Discipline
4.5.5 Unbalanced Load among Threads
4.5.6 Simulations with Three P-M Ratios
4.5.7 Solutions for the Memory Bottleneck
4.6 Summary
5. Resource Allocation of the Core-centric Network Processor for Computational Intensive Applications
5.1 Introduction
5.2 Hardware Platform (IXP425)
5.2.1 Hardware Architecture of IXP425
5.2.2 Detailed Packet Flow in IXP425
5.2.3 Software Architecture of IXP425
5.3 Processing Stages Analysis and Offloading Schemes Design
5.3.1 VPN Briefing
5.3.2 Identifying Offloading Candidates
5.3.3 Implementation
5.4 Benchmark and Bottleneck Observations
5.4.1 System Benchmark Setup
5.4.2 Scalability Test
5.4.3 Bottleneck Analysis
5.4.4 Turnaround Time Analysis of Functional Blocks
5.5 Summary
6. Core-centric Network Processors: Analysis, Simulation, and Design Implications
6.1 Introduction
6.2 Background
6.2.1 Performance Model Overview
6.2.2 Architectural Assumptions
6.3 Analytical Model
6.3.1 The Busy-waiting Model
6.3.2 The Interrupt-driven Model
6.4 Simulation Environment
6.5 Evaluation
6.5.1 Validation of the Analytical Model
6.5.2 Differentiated Run Lengths
6.5.3 Effect of the Context Switch Overhead
6.5.4 Benefit from Offloading
6.5.5 Effect of Limited Buffer Sizes
6.6 Summary
7. Conclusions
Bibliography