|
[1] Khronos Working Group. (2014) The OpenCL Specification Version: 2.0.
[2] Khronos Working Group. (2014) The OpenCL C Specification Version: 2.0.
[3] Fabrice Bellard, "QEMU, a fast and portable dynamic translator," in USENIX Annual Technical Conference, 2005, pp. 41-46.
[4] HSA Foundation. (2015) HSA Platform System Architecture Specification 1.0.
[5] HSA Foundation. (2015) HSA Programmer Reference Manual Specification 1.0.
[6] HSA Foundation. (2015) HSA Runtime Specification 1.0.
[7] Nathan Binkert, Bradford Beckmann, Gabriel Black, Steven K. Reinhardt, Ali Saidi, Arkaprava Basu, Joel Hestness, Derek R. Hower, Tushar Krishna, Somayeh Sardashti, Rathijit Sen, Korey Sewell, Muhammad Shoaib, Nilay Vaish, Mark D. Hill, and David A. Wood, "The gem5 simulator," ACM SIGARCH Computer Architecture News, pp. 1-7, 2011.
[8] Jingwen Leng, Tayler Hetherington, Ahmed ElTantawy, Syed Gilani, Nam Sung Kim, Tor M. Aamodt, and Vijay Janapa Reddi, "GPUWattch: Enabling Energy Optimizations in GPGPUs," in International Symposium on Computer Architecture, 2013, pp. 487-498.
[9] Emmett Witchel and Mendel Rosenblum, "Embra: fast and flexible machine simulation," Proceedings of ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, pp. 68-79, 1996.
[10] Shih-Hao Hung, Tei-Wei Kuo, Chi-Sheng Shih, and Chia-Heng Tu, "System wide profiling and optimization with virtual machines," Asia and South Pacific Design Automation Conference, pp. 395-400, 2012.
[11] Robert E. Lantz, "Fast Functional Simulation with Parallel Embra," in Proceedings of the Workshop on Modeling, Benchmarking and Simulation (MoBS), 2008.
[12] Jiun-Hung Ding, Po-Chun Chang, Wei-Chung Hsu, and Yeh-Ching Chung, "PQEMU: A Parallel System Emulator Based on QEMU," in IEEE 17th International Conference on Parallel and Distributed Systems, 2011, pp. 276-283.
[13] Z. Wang et al., "COREMU: a scalable and portable parallel full-system emulator," in Principles and Practice of Parallel Programming, 2011, pp. 213-222.
[14] Ali Bakhoda, George L. Yuan, Wilson W. L. Fung, Henry Wong, and Tor M. Aamodt, "Analyzing CUDA workloads using a detailed GPU simulator," Performance Analysis of Systems and Software, pp. 163-174, 2009.
[15] Sylvain Collange, Marc Daumas, David Defour, and David Parello, "Barra: a parallel functional simulator for GPGPU," IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, pp. 351-360, 2010.
[16] Jason Power, Joel Hestness, Marc S. Orr, Mark D. Hill, and David A. Wood, "gem5-gpu: A Heterogeneous CPU-GPU Simulator," Computer Architecture Letters, p. 1, 2014.
[17] R. Ubal, J. Sahuquillo, S. Petit, and P. Lopez, "Multi2Sim: A Simulation Framework to Evaluate Multicore-Multithreaded Processors," in Computer Architecture and High Performance Computing, 2007, pp. 62-68.
[18] Gregory Frederick Diamos, Andrew Robert Kerr, Sudhakar Yalamanchili, and Nathan Clark, "Ocelot: a dynamic optimization framework for bulk-synchronous applications in heterogeneous systems," in Proceedings of International Conference on Parallel Architectures and Compilation Techniques, 2010.
[19] Jiun-Hung Ding, Bai-Cheng Jeng, Shih-Hao Hung, Wei-Chung Hsu, and Yeh-Ching Chung, "HSAemu - A Full System Emulator for HSA Platforms," Proceedings of ACM International Conference on Hardware/Software Codesign and System Synthesis (CODES + ISSS), p. Article 26, October 2014.
[20] Kuo-Min Lin and Yeh-Ching Chung, "A Compilation Framework for HSA," 2014.
[21] Zhou-Dong Guo and Yeh-Ching Chung, "HSA emulator design based on QEMU," 2013.
[22] Chung-Min Kao and Yeh-Ching Chung, "The LLVM based GPU Compiler in Heterogeneous System Architecture Emulator: HTranslator," 2013.
[23] Jui Hsiao and Yeh-Ching Chung, "An OpenCL 2.0 Compilation Framework for HSA," 2015.
[24] Wei-Chih Sun and Yeh-Ching Chung, "An OpenCL 2.0 Runtime based on HSA Runtime," 2015.
[25] Che-Yang Kuo and Yeh-Ching Chung, "Implementation of Image Feature Supports in HSAemu Framework," 2015.
[26] Bai-Cheng Jeng and Yeh-Ching Chung, "HSAemu Framework," 2014.
[27] Advanced Micro Devices. (2013) AMD OpenCL™ Accelerated Parallel Processing SDK. [Online]. http://developer.amd.com/tools-and-sdks/opencl-zone/amd-accelerated-parallel-processing-app-sdk/
[28] Robert Ioffe (Intel) and Adam Lake (Intel). (2015) The Generic Address Space in OpenCL™ 2.0. [Online]. https://software.intel.com/en-us/articles/the-generic-address-space-in-opencl-20
|