National Digital Library of Theses and Dissertations in Taiwan

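Author: 黃柏堯 (Po-Yao Huang)
Title: A Fast and Scalable Cluster Simulator for Network Performance Analysis of HPC Applications
Title (Chinese): 高速且具擴增性 HPC Application 的叢集網路效能模擬器
Advisor: 洪士灝 (Shih-Hao Hung)
Oral defense date: 2017-07-21
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2017
Graduation academic year: 105 (ROC calendar, 2016-2017)
Language: English
Pages: 47
Keywords: profiling tool; network simulator; accelerated simulator; high-performance computing; network topology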
Abstract:

A computer cluster is one way to build a high-performance computing (HPC) system: a set of loosely or tightly connected computers that work together and can be regarded as a single system. Clusters are mainly used to solve large-scale scientific problems and to process large volumes of data. In such large systems the interconnect plays an important role, because it affects the performance of HPC applications. Achieving the best performance requires tuning the network settings or topology, but changing either on a physical cluster is difficult. We therefore developed a network simulator designed for HPC applications that helps users analyze the relationship between the network and application performance.

In this study, we built a simulator based on ns-3 to simulate and analyze the network latency and overall performance of HPC applications, and we investigated acceleration techniques that shorten the time needed to simulate large systems. At a small cost in accuracy, these techniques greatly reduce the time needed to obtain simulation results while still producing useful analysis reports. The accelerated simulator achieves a speedup of up to 18.5 times. Compared with physical machines, the overall simulation error can be kept below 16.5%. Finally, we scaled the simulated cluster up to a rack-scale system of 64 nodes to show how the simulator can be used to analyze the impact of different network topologies on HPC application performance.
Table of contents:

Acknowledgements
Abstract (in Chinese)
Abstract
Chapter 1 Introduction
Chapter 2 Background and Related Work
2.1 High Performance Computing Applications
2.1.1 High Performance Computing
2.1.2 Message Passing Interface
2.2 High Performance Computing Networks
2.2.1 InfiniBand versus Ethernet
2.2.2 End-to-end delay
2.2.3 Fat Tree
2.2.4 3D Torus
2.3 Simulation of Networks
2.3.1 ns-3
2.4 Accelerated Simulation
2.5 Related Work
2.5.1 dist-gem5
2.5.2 CloudSim
2.5.3 NTU-DSI-DCN
2.5.4 DCNSim
2.5.5 XSim
Chapter 3 Methodology
3.1 Concurrency Preservation of Network Traces
3.1.1 Network Traces Collects from Virtual Cluster
3.1.2 Per-flow Timing Components
3.1.3 Concurrency Preservation
3.2 Trace-driven Network Simulation
3.2.1 Traffic Matrix Preprocessing
3.2.2 NS-3 Simulation Procedure
3.3 Accelerated Simulation
Chapter 4 Evaluation
4.1 Experimental Setup
4.2 Benchmark
4.3 Evaluation of Single-thread Multi-node Simulation
4.4 Evaluation of Multi-thread Multi-node Simulation
4.5 Accelerated Simulation
4.6 Case study: 64-Node Rack Scale Simulation
Chapter 5 Conclusion and Future Work
Bibliography
DOI: 10.6342/NTU201703980