National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

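Graduate student: Chih-Wei Young (楊智偉)

Thesis title: Sparse Construction Functions for GPU Processing with CUDA (在GPU上使用CUDA處理稀疏建構函式)

Advisor: Rong-Guey Chang (張榮貴)

Oral defense committee: Rong-Guey Chang (張榮貴), Peng-Sheng Chen (陳鵬升), Yuan-Shin Hwang (黃元欣), Chih-Wen Hsueh (薛智文)

Oral defense date: 2013-07-08

Degree: Master's

Institution: National Chung Cheng University (國立中正大學)

Department: Graduate Institute of Computer Science and Information Engineering (資訊工程研究所)

Discipline: Engineering

Field: Electrical Engineering and Computer Science

Thesis type: Academic thesis

Year of publication: 2013

Graduation academic year: 101 (ROC calendar)

Language: Chinese

Number of pages:

Keywords (translated from Chinese): graphics processing unit, sparse construction functions, Compute Unified Device Architecture

Keywords (English): GPU, CUDA, Fortran, sparse matrix, construction function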


CUDA (Compute Unified Device Architecture) is a parallel computing platform and programming model created by NVIDIA. It uses the GPU for data-parallel computation, which can substantially improve computing performance, and it is accessible to software developers through extensions to industry-standard programming languages, including C. To realize that performance, developers must understand both the GPU architecture and parallel algorithm design.

Practical applications such as fluid dynamics, weather forecasting, seismic analysis, and genetic engineering rely on sparse matrix operations that demand very large amounts of computation. The many cores of a GPU are well suited to this kind of high-performance parallel computing. However, CUDA does not currently provide a sparse matrix library of this kind to help developers write parallel programs and shorten application development time. Developing a high-performance, user-friendly sparse matrix library on the CUDA architecture is therefore a practical technology.

1. Introduction

2. Related Work
2.1 Architecture of GPU
2.2 Sparse matrix
2.3 CUDA

3. Sparse Construction Functions
3.1 Pack
3.2 Unpack
3.3 Reshape
3.4 Spread
3.5 Merge
3.6 Application

4. Optimization
4.1 Substituting register for global memory
4.2 Storing data in Shared memory with block
4.3 Splitting the mask array

5. Experimental Result
5.1 Environment
5.2 Benchmark
5.3 Result
5.3.1 The result of pack
5.3.2 The result of unpack
5.3.3 The result of reshape
5.3.4 The result of spread
5.3.5 The result of merge

6. Conclusion

7. References
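
The five functions named in chapter 3 of the outline share their names with the Fortran array intrinsics PACK, UNPACK, RESHAPE, SPREAD, and MERGE, and Fortran is listed among the keywords. Assuming that correspondence holds, the following is a minimal sequential sketch of what each function computes on small dense arrays. It is not the thesis's CUDA implementation, which this record does not show; the function signatures and the list-based representation are illustrative assumptions.

```python
# Sequential reference semantics, modeled on the Fortran intrinsics of the
# same names. Assumption: the thesis's functions follow these semantics.

def pack(array, mask):
    """Gather the elements of array whose mask entry is true, in order."""
    return [a for a, m in zip(array, mask) if m]

def unpack(vector, mask, field):
    """Scatter vector into the true positions of mask; use field elsewhere."""
    it = iter(vector)
    return [next(it) if m else f for m, f in zip(mask, field)]

def reshape(source, rows, cols):
    """Arrange a 1-D source as rows x cols, filling column by column
    (Fortran element order)."""
    return [[source[i + j * rows] for j in range(cols)] for i in range(rows)]

def spread(source, ncopies):
    """Replicate a 1-D source ncopies times along a new first dimension."""
    return [list(source) for _ in range(ncopies)]

def merge(tsource, fsource, mask):
    """Pick tsource[i] where mask[i] is true, otherwise fsource[i]."""
    return [t if m else f for t, f, m in zip(tsource, fsource, mask)]

# A mostly-zero row and the mask marking its nonzero entries.
dense = [0, 5, 0, 0, 7, 0]
mask = [x != 0 for x in dense]

packed = pack(dense, mask)                        # [5, 7]
restored = unpack(packed, mask, [0] * len(dense)) # [0, 5, 0, 0, 7, 0]
grid = reshape([1, 2, 3, 4, 5, 6], 2, 3)          # [[1, 3, 5], [2, 4, 6]]
copies = spread([1, 2], 3)                        # [[1, 2], [1, 2], [1, 2]]
mixed = merge([1, 2, 3], [10, 20, 30], [True, False, True])  # [1, 20, 3]
```

Here pack and unpack act as inverses for a fixed mask, which is why the pair is a natural fit for moving between a dense array and a compact list of its nonzero values.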

