National Digital Library of Theses and Dissertations in Taiwan

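Author: 歐韋伸
Author (English): Wei-Shen Ou
Thesis title (Chinese): 整合PCI穿透技術於GPU與InfiniBand虛擬化於雲端虛擬叢集建置
Thesis title (English): Construction of a Virtual Cluster by Integrating PCI Pass-Through for GPU and InfiniBand Virtualization in Cloud
Advisor: 楊朝棟
Advisor (English): Chao-Tung Yang
Oral defense committee: 李冠憬, 朱正忠, 薛念林, 劉榮春
Oral defense committee (English): Kuan-Ching Li, Cheng-Chung Chu, Nien-Lin Hsueh, Jung-Chun Liu
Oral defense date: 2013-07-02
Degree: Master's
Institution: Tunghai University (東海大學)
Department: Department of Computer Science (資訊工程學系)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Publication year: 2013
Graduation academic year: 101 (ROC calendar)
Language: English
Pages: 114
Keywords: Virtual Cluster, CUDA, InfiniBand, virtualization, MPI, KVM, XEN, VMware, HPL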
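NVIDIA's CUDA is a platform for writing highly parallel applications. It relies on a few parallel constructs: a hierarchy of thread blocks, shared memory, and barrier synchronization. Programs developed with CUDA can achieve substantial speedups. GPUs can therefore play an important role in cluster-based cloud computing, since they make it possible to build a more powerful high-performance computing environment.
Virtualization is a key part of cloud architecture. A virtual machine that uses an NVIDIA graphics card can run CUDA high-performance computations, so it has not only virtual CPUs but also a physical GPU to compute with, which can greatly improve its performance. InfiniBand is a high-bandwidth, low-latency network and is therefore widely used in high-performance computing. Xen, KVM, and VMware are familiar virtualization platforms with good performance and stability. On these platforms, I aim to build a high-performance cluster computing environment that draws on the computing power of multiple GPUs and uses InfiniBand, which is faster than Ethernet, as the interconnect. At the end of the experiments, I use HPL to demonstrate the capability of this cluster computing environment.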
At present, NVIDIA's CUDA lets programmers develop highly parallel applications. It is built on a small set of parallel constructs: hierarchical thread blocks, shared memory, and barrier synchronization. Programs written with CUDA can achieve large speedups. Graphics processors can therefore play an important role in cluster-based cloud computing, because they can be used to build a high-performance computing environment.
Virtualization is an essential part of cloud architecture. A virtual machine that is given access to an NVIDIA graphics card gains CUDA high-performance computing capability. The virtual machine then has not only virtual CPUs but also physical graphics processors for computation, which substantially improves its performance. InfiniBand, a high-bandwidth, low-latency interconnect, is widely used in high-performance computing. Xen, KVM, and VMware are well-known virtualization platforms with good performance and stability. In this work, a high-performance cluster computing environment is built on the Xen, KVM, and VMware virtualization platforms, using graphics processors for their computing power and InfiniBand, which is faster than Ethernet, as the transmission medium. Finally, the High Performance Linpack (HPL) benchmark is used to measure the computing capability of the cluster computing environment.
Abstract (Chinese)
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
1 Introduction
1.1 Motivation
1.2 Thesis Goal and Contributions
1.3 Thesis Organization
2 Background Review and Related Work
2.1 CUDA
2.2 MPI
2.3 InfiniBand
2.4 Virtualization
2.4.1 Full-Virtualization
2.4.2 Para-Virtualization
2.4.3 KVM
2.4.4 XEN
2.5 Virtualization on GPU
2.5.1 Hypervisor-based device emulation
2.5.2 User space device emulation
2.5.3 Passthrough within the hypervisor
2.6 Related Work
3 System Design and Implementation
3.1 System Architecture
3.2 Graphic Processing Unit
3.2.1 Tesla C2050/C2070/C2075 Computing Processor Board
3.2.2 Tesla K20 Computing Processor Board
3.3 System Implementation - Setting PCI-Passthrough
4 Experimental Results
4.1 Experimental Environment
4.1.1 Experimental Environment - InfiniBand Test Case
4.1.2 Experimental Environment - GPU Test Case
4.1.3 Experimental Environment - HPL
4.2 Results and Discussion
4.2.1 Results and Discussion - InfiniBand Test Case
4.2.2 Results and Discussion - GPU Test Case
4.2.3 Results and Discussion - HPL
5 Conclusions and Future Work
5.1 Concluding Remarks
5.2 Further Work
Bibliography
Appendix
A Set PCI-Passthrough on Xen
A.1 Setup Xen on CentOS
A.2 Set PCI passthrough
B Set PCI-Passthrough on KVM
B.1 Change Linux-Kernel on CentOS
B.2 Set PCI passthrough
C Set PCI-Passthrough on VMware vSphere 5.1
C.1 GPU Driver on VMware-vSphere-ESXi-5.1
C.2 Mellanox InfiniBand OFED Driver for VMware vSphere 5.1
C.3 InfiniBand SRP Target on CentOS 6
D CUDA Installation
E InfiniBand Installation
F Parallel Studio XE 2013 Installation
G HPL Installation