|
[1] NVIDIA GPU. (2012). WHAT IS GPU ACCELERATED COMPUTING?[Online]. Available: http://www.nvidia.com/object/what-is-gpu-computing.html [2] Guillaume Colin de Verdière, "Introduction to GPGPU, a hardware and software background" , Comptes Rendus Mécanique, Volume 339, Issues 2–3, February–March 2011, Pages 78–89 [3] NVIDIA CUDA. (2012). CUDA Parallel Computing Platform[Online]. Available: http://www.nvidia.com/object/cuda_home_new.html [4] T. R. Halfhill, "Parallel processing with CUDA-NVIDA’s highperformance computing platform uses massive multithreading, " Microprocessor Rep., pp. 1–8, Jan. 2008. [5] NVIDIA. (2009). NVIDIA Cuda2.0 Programming Guide[Online]. Available: http://developer.download.nvidia.com/compute/cuda/2_0/docs/NVIDIA_CUDA_Programming_Guide_2.0.pdf [6] NVIDIA CUDA. (2014). CUDA C Programming Guide[Online]. Available: http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html [7] Michał Czapiński, Stuart Barnes, “Tabu Search with two approaches to parallel flowshop evaluation on CUDA platform”, J. Parallel Distrib. Comput. Vol 71, pp.802-811, 2011. [8] S Park, SY Shin, KB Hwang, CFMDS: “CUDA-based Fast Multidimensional Scaling for Genome-scale Data”, BMC bioinformatics, 2012 [9] CUDA platform”, J. Parallel Distrib. Comput. Vol 71, pp.802-811, 2011. [10] Chaofeng Hou, Ji Xu, Peng Wang, Wenlai Huang Xiaowei Wang, "Efficient GPU-accelerated molecular dynamics simulation of solid covalent crystals", Computer Physics Communications Volume 184, Issue 5, May 2013, pp. 1364–1371. [11] Udagawa, T, Sekijima, M, "Energy consumption of GPU with molecular dynamics", Proceedings of the IASTED International Conference on Parallel and Distributed Computing and Systems, pp. 40-44. [12] Crespo, A.J.C , Dominguez, J.M, Valdez-Balderas, D, Rogers, B.D, Gomez-Gesteira, M, "Smoothed particle hydrodynamics on GPU computing", 2nd International Conference on Particle-Based Methods, PARTICLES 2011, pp. 922-929. [13] Rustico, E, Bilotta, G, Gallo, G, Hérault, A, Del Negro, C, "Smoothed particle hydrodynamics simulations on multi-GPU systems", Proceedings - 20th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, Article number6169576, pp. 384-391. [14] Zhiyi Yang, Yating Zhu, Yong Pu, “Parallel Image Processing Based on CUDA”, Computer Science and Software Engineering, vol.3, pp.198~201, 2008. [15] Cuthill, Elizabeth, and James McKee. "Reducing the bandwidth of sparse symmetric matrices." Proceedings of the 1969 24th national conference. ACM, 1969. [16] A. George. Computer Implementation of the Finite Element Method. PhD thesis, 1971. [17] F. Checconi, F. Petrini, J. Willcock, A. Lumsdaine, A. R. Choudhury, and Y. Sabharwal, “Breaking the speed and scalability barriers for graph exploration on distributed-memory machines,” in International Conference for High Performance Computing, Networking, Storage and Analysis (SC’12). IEEE, 2012, pp. 1–12. [18] A. Buluc¸ and K. Madduri, “Parallel breadth-first search on distributed memory systems,” in International Conference for High Performance Computing, Networking, Storage and Analysis (SC’11). ACM, 2011, pp. 65:1–65:12. [19] Xu, Shiming, Wei Xue, and Hai Xiang Lin. "Performance modeling and optimization of sparse matrix-vector multiplication on NVIDIA CUDA platform." The Journal of Supercomputing (2013): 1-12. [20] Bell, Nathan, and Michael Garland. "Implementing sparse matrix-vector multiplication on throughput-oriented processors." Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis. ACM, 2009. [21] Neelima, B., G. Ram Mohana Reddy, and Prakash S. Raghavendra. "Predicting an optimal sparse matrix format for SpMV computation on GPU." Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International. IEEE, 2014. [22] Monakov, Alexander, Anton Lokhmotov, and Arutyun Avetisyan. "Automatically Tuning Sparse Matrix-Vector Multiplication for GPU Architectures." HiPEAC 5952 (2010): 111-125. [23] Vazquez, Francisco, et al. "Improving the performance of the sparse matrix vector product with GPUs." Computer and Information Technology (CIT), 2010 IEEE 10th International Conference on. IEEE, 2010. [24] Karakasis, Vasileios, et al. "An extended compression format for the optimization of sparse matrix-vector multiplication." IEEE Transactions on Parallel and Distributed Systems 24.10 (2013): 1930-1940. [25] Matam, Kiran Kumar, and Kishore Kothapalli. "Accelerating sparse matrix vector multiplica-tion in iterative methods using GPU." Parallel Processing (ICPP), 2011 International Confer-ence on. IEEE, 2011. [26] Saad, Youcef. "SPARSKIT: A basic tool kit for sparse matrix computations." (1990).
|