|
[1] Pluto: A polyhedral automatic parallelizer and locality optimizer for multicores. Available at http://pluto-compiler.sourceforge.net. [2] CLooG: The Chunky Loop Generator. http://www.cloog.org. [3] C′edric Bastoul. Code generation in the polyhedral model is easier than you think. In IEEE PACT, pages 7–16, September 2004. [4] PrimeTile: A Parametric Multi-Level Tiler for Imperfect Loop Nests. http://primetile.sourceforge.net. [5] S. Coleman and K. McKinley. Tile Size Selection Using Cache Organization and Data Layout. In PLDI’95, pages 279–290, 1995. [6] HiTLoG: Hierarchical Tiled Loop Generator. Available at http://www.cs.colostate.edu/MMAlpha/tiling/. [7] D. Kim and S. Rajopadhye. Parameterized tiling for imperfectly nested loops. Technical Report CS-09-101, Colorado State University, Department of Computer Science, February 2009. [8] TLoG: A Parametrized Tiled Loop Generator. Available at http://www.cs.colostate.edu/MMAlpha/tiling/. [9] U. Bondhugula, J. Ramanujam, and P. Sadayappan. Pluto: A practical and fully automatic polyhedral parallelizer and locality optimizer. Technical Report OSU-CISRC-10/07-TR70, The Ohio State University, Oct. 2007. [10] Jingling Xue. Loop tiling for parallelism. Kluwer Academic Publishers, Norwell, MA, USA, 2000. [11] J. M. Bull , M. E. Kambites, JOMP—an OpenMP-like interface for Java, Proceedings of the ACM 2000 conference on Java Grande, p.44-53, San Francisco, California, United States , June 03-04, 2000. [12] Michael Klemm, Matthias Bezold, Ronald Veldema, and Michael Philippsen. Jamp: An implementation of openmp for a java dsm. Concurrency and Computation: Practice and Experience, 18(19):2333{2352, 2007. [13] JavaCC : A Java Compiler Compiler. http://javacc.java.net/ [14] DaeGon Kim , Lakshminarayanan Renganarayanan , Dave Rostron , Sanjay Rajopadhye , Michelle Mills Strout, Multi-level tiling: M for the price of one, Proceedings of the 2007 ACM/IEEE conference on Supercomputing, Reno, Nevada , November 10-16, 2007. [15] M. Baskaran, A. Hartono, S. Tavarageri, T. Henretty, J. Ramanujam, and P. Sadayappan. Parameterized tiling revisited. In The International Symposium on Code Generation and Optimization (CGO), 2010. [16] M. E. Wolf and M. S. Lam. A data locality optimizing algorithm. Submitted for publication., 1990. [17] Monica D. Lam , Edward E. Rothberg , Michael E. Wolf, The cache performance and optimizations of blocked algorithms, Proceedings of the fourth international conference on Architectural support for programming languages and operating systems, p.63-74, , Santa Clara, California, United States, April 08-11, 1991. [18] Stephanie Coleman , Kathryn S. McKinley, Tile size selection using cache organization and data layout, Proceedings of the ACM SIGPLAN 1995. conference on Programming language design and implementation, p.279-290, La Jolla, California, United States June 18-21, 1995. [19] Gabriel Rivera , Chau-Wen Tseng, Locality optimizations for multi-level caches, Proceedings of the 1999 ACM/IEEE conference on Supercomputing (CDROM), p.2-es, Portland, Oregon, United States, November 14-19, 1999. [20] T. Yuki, L. Renganarayanan, Sanjay Rajopadhye, Charles Anderson, Alexandre E. Eichenberger, Kevin O'Brien, Automatic creation of tile size selection models, Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization , NY, USA, 2010. [21] OpenMP : http://openmp.org/wp/ [22] MPI : Message Passing Interface. http://www.mcs.anl.gov/research/projects/mpi/ [23] Pthread : http://computing.llnl.gov/tutorials/pthreads/ [24] Wei-I Lu, Improving Multi-core Cache Utilization with Data Blocking and Thread Grouping, June, 2010 [25] Wei-Nung Su, JMC : Java Language Compiler and Runtime Library for Multi-Core Platform Based-On OpenMP 3.0 Programming Model, June, 2010
|