National Digital Library of Theses and Dissertations in Taiwan

Graduate student:

白托馬

Graduate student (English):

Thomas, Pak

Thesis title:

混熱擾流力與分子動能模擬之 GPU 加速

Thesis title (English):

GPU-Acceleration of the Hybrid Fluctuating Hydrodynamics and Molecular Dynamics Simulation

Advisors:

鍾崇斌、朱智瑋

Advisors (English):

Chung, Chung-Ping; Chu, Jhih-Wei

Oral defense committee:

鍾崇斌、朱智瑋、渡邊浩志

Oral defense committee (English):

Chung, Chung-Ping; Chu, Jhih-Wei; Hiroshi, Watanabe

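Defense date:

2016-07-29

Degree:

Master's

Institution:

National Chiao Tung University

Department:

EECS International Graduate Program

Discipline:

Engineering

Field:

Electrical and Information Engineering

Thesis type:

Academic thesis

Year of publication:

2016

Academic year of graduation:

104 (ROC calendar; 2015–2016)

Language:

English

Keywords (Chinese):

GPU、混合模型、分子動能模擬、混熱擾流力

Keywords (English):

GPU, hybrid model, molecular dynamics, fluctuating hydrodynamics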

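Abstract (translated from the Chinese):

Fluid properties at the molecular level are often studied with all-atom simulations, which describe the dynamics of every atom using classical mechanics and therefore provide the most detailed dynamical information about the system under study. At the macroscopic scale, by contrast, a fluid is treated as a continuous quantity, and hydrodynamic equations predict how it evolves over time. Studies at the nanoscale require an analysis that combines models from these two scales. The new coupled system assigns a corresponding grid to each particle. The mapping from particles to field variables, and conversely from field variables to particles, is computed by interpolating between the particle grids and the field grid. However, this coupling algorithm has not yet been built on a high-performance computing architecture.

In recent years, graphics processing units (GPUs) have become a competitive platform for scientific computing. GPUs were originally designed for computer graphics, but their architecture is now optimized for compute-intensive tasks and high-throughput data. Compared with traditional high-performance computing clusters, these features make GPUs more attractive and more computationally efficient. A GPU–CPU architecture was therefore chosen for high-performance computation of the hybrid model.

This master's thesis discusses the design and implementation of a GPU-accelerated simulation of the hybrid model that combines the two kinds of dynamics. The goal is to rebuild the original CPU algorithms as GPU algorithms that run with massive concurrency and achieve the highest possible speedup. The new GPU algorithm uses shared memory as a staging area to perform fast local interpolation for the coupled system. A two-stage thread mapping keeps the use of additional memory to a minimum in order to maximize computational throughput. By drastically increasing the computational efficiency of the simulation, the spatial and temporal scales that can be explored with the hybrid model are greatly extended.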

Abstract (English):

Fluid properties at the molecular scale are often investigated using all-atom simulations, which provide the highest level of detail attainable using classical mechanics. On the other hand, the behavior of fluids at the macroscopic scale is modeled by approximating the fluid as a continuous quantity and tracking its evolution with hydrodynamic equations. At the nanoscale, both of these modeling paradigms are necessary. A hybrid model implementing molecular dynamics and hydrodynamics has previously been designed for simulations of nanoscale fluids. It implements a novel coupling scheme that associates a collocating grid with each particle. The mapping of particle variables to field variables, and vice versa, is then achieved through interpolation between the particle and field grids. However, the coupling algorithm has not yet been adapted for high-performance computing (HPC).

In recent years, graphics processing units (GPUs) have emerged as a competitive platform for scientific computations. Originally designed for computer graphics, the GPU architecture is optimized for computationally intensive tasks and high data throughput. These features make GPUs attractive and cost-effective alternatives to traditional HPC clusters. Therefore, a GPU–CPU framework was chosen as the HPC platform for the hybrid model.

This thesis thus presents the design and implementation of a GPU-accelerated simulation of the hybrid model. The objective was to reformulate the original CPU algorithms to expose massive concurrency, implement them on the GPU and achieve the highest computational speedup possible. A novel GPU algorithm was designed for the coupling scheme that uses shared memory as a staging area to perform fast local interpolations. To maximize computational throughput, a two-stage thread mapping was employed with a minimal amount of additional memory overhead. By drastically increasing the computational efficiency of simulations, the spatial and temporal scales that can be explored using the hybrid model were greatly expanded.

Table of contents:

Chinese Abstract
English Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
List of Listings
I Introduction
II Mathematical Modeling of Complex Molecular Systems at the Nanoscale
2.1 Fluctuating Hydrodynamics
2.1.1 Governing Equations of Conservation Laws
2.1.2 Constitutive Equations
2.1.3 Conservation of Mass
2.1.4 Conservation of Momentum
2.1.5 The Stress Tensor
2.1.6 The Fluctuating Stress Tensor
2.1.7 Summary
2.2 Molecular Dynamics
2.2.1 Equations of Motion
2.2.2 Force Fields
2.2.3 Periodic Boundary Conditions
2.3 The Particle–Solvent Coupling Scheme
2.3.1 Free-Energy Densities
2.3.2 Friction Forces
2.4 Electrostatic Interactions
2.4.1 Charged Particles
2.4.2 The Polarization Density
2.4.3 Poisson’s Equation
2.5 Summary
III Numerical Solution of the Hybrid Model
3.1 Auxiliary Fluids
3.2 Staggered Grids
3.2.1 Definitions
3.2.2 Discretization over Staggered Grids
3.2.3 Calculation of Gradients and Divergences
3.2.4 Calculation of Laplacians
3.2.5 Multiplication and Division of Scalar and Vector Fields
3.3 Lagrangian Grids
3.3.1 Occupied Volumes
3.3.2 Particle–Solvent Forces
3.4 Numerical Solution of Poisson’s Equation
3.4.1 The Fourier Transform
3.4.2 The Discrete Fourier Transform
3.4.3 The Fast Fourier Transform Algorithm
3.5 Neighbor Lists
3.5.1 Force Truncation
3.5.2 Cell Lists
3.6 Stochastic Fluxes
3.7 Temporal Propagation
IV GPU Architecture and the CUDA Execution Model
4.1 Overview of the GPU Architecture
4.2 The CUDA Execution Model
4.2.1 CUDA Threads
4.2.2 Thread Blocks and Grids
4.2.3 Streaming Multiprocessors
4.2.4 Branch Divergence
4.3 The CUDA Memory Model
4.3.1 The Memory Hierarchy
4.3.2 Global Memory and Local Memory
4.3.3 Shared Memory
4.3.4 Constant Memory
4.3.5 Atomic Operations
4.3.6 Address Coalescing
V Design and Implementation of the GPU-Accelerated Simulation
5.1 Design Objectives and Principles
5.1.1 Performance
5.1.2 Memory Efficiency
5.1.3 Scalability
5.1.4 Modularity
5.2 The Grid-Stride Loop
5.3 Grid Tiling
5.3.1 Tiling Strategy
5.3.2 The Block-Stride Loop
5.3.3 The Tile-Stride Loop
5.3.4 Summary
5.4 Neighbor List Generation
5.4.1 Construction of Cell Lists
5.4.2 Construction of Neighbor Lists
5.4.3 List Overflow
5.5 The Interpolation Scheme from Lagrangian to Eulerian Grids
5.5.1 Original Implementation
5.5.2 GPU Implementation
5.5.3 Auxiliary Data for Interpolation
5.5.4 Summary
5.6 GPU-Accelerated Libraries
VI Results and Discussion
6.1 Experimental Setup
6.2 Fluctuating Hydrodynamics Performance
6.3 Neighbor List Performance
6.4 Occupied Volume Interpolation Performance
6.5 Simulation Performance
VII Conclusion
Bibliography