|
[1] “Multi-Core Processing with AMD,” http://www.amd.com/us/products/technologies/multi-core-processing/Pages/multi-core-processing.aspx. [2] “The Cell project at IBM Research,” http://www.research.ibm.com/cell/cell_chip.html. [3] “Intel Pentium Processor Extreme Edition,” http://www.intel.com/products/processor/pentium4htxe/index.htm. [4] “Ambric’s Massively Parallel Processor Array technology,” http://www.ambric.com/. [5] S. Bell, B. Edwards, J. Amann, R. Conlin, K. Joyce, V. Leung, J. MacKay, M. Reif, Liewei Bao, et al., “TILE64 - Processor: A 64-Core SoC with Mesh Interconnect,” in Solid-State Circuits Conference, 2008. Digest of Technical Papers. IEEE International, pp. 88–598. [6] T. Austin, E. Larson, and D. Ernst, “SimpleScalar: an infrastructure for computer system modeling,” Computer 35, 59–67 (2002). [7] P. S. Magnusson, M. Christensson, J. Eskilson, D. Forsgren, G. Hallberg, J. Hogberg, F. Larsson, A. Moestedt, and B. Werner, “Simics: A full system simulation platform,” Computer 35, 50–58 (2002). [8] F. Bellard, “QEMU, a fast and portable dynamic translator,” in USENIX Annual Technical Conference, Berkeley, CA, USA (2005). [9] M. Monchiero, J. H. Ahn, A. Falcón, D. Ortega, and P. Faraboschi, “How to simulate 1000 cores,” ACM SIGARCH Computer Architecture News 37, 10 (2009). [10] J. E. Miller, H. Kasture, G. Kurian, C. Gruenwald, N. Beckmann, C. Celio, J. Eastep, and A. Agarwal, “Graphite: A distributed parallel simulator for multicores,” in 2010 IEEE 16th International Symposium on High Performance Computer Architecture, pp. 1–12. [11] A. Jaleel, R. S. Cohn, C.-K. Luk, and B. Jacob., “CMP$im: A Pin-based on-the-fly multi-core cache simulator,” presented at In Proceedings of the Fourth Annual Workshop on Modeling, Benchmarking and Simulation, 2008. [12] A. Srivastava and A. Eustace, “ATOM,” ACM SIGPLAN Notices 39, 528 (2004). [13] V. J. Reddi, A. Settle, D. A. Connors, and R. S. Cohn, “PIN: a binary instrumentation tool for computer architecture research and education,” in Proceedings of the 2004 workshop on Computer architecture education: held in conjunction with the 31st International Symposium on Computer Architecture, ACM, New York, NY, USA (2004). [14] S. S. Mukherjee, S. K. Reinhardt, B. Falsafi, M. Litzkow, M. D. Hill, D. A. Wood, S. Huss-Lederman, and J. R. Larus, “Wisconsin Wind Tunnel II: a fast, portable parallel architecturesimulator,” IEEE Concurrency 8, 12–20 (2000). [15] M.-H. Wu, C.-Y. Fu, P.-C. Wang, and R.-S. Tsay, “An effective synchronization approach for fast and accurate multi-core instruction-set simulation,” 2009, 197, ACM Press. [16] J. Chen, M. Annavaram, and M. Dubois, “SlackSim,” ACM SIGMETRICS Performance Evaluation Review 37, 77 (2009). [17] N. L. Binkert, R. G. Dreslinski, L. R. Hsu, K. T. Lim, A. G. Saidi, and S. K. Reinhardt, “The M5 Simulator: Modeling Networked Systems,” IEEE Micro 26, 52–60 (2006). [18] C. J. Hughes, V. S. Pai, P. Ranganathan, and S. V. Adve, “Rsim: simulating shared-memory multiprocessors with ILP processors,” Computer 35, 40–49 (2002). [19] M. Rosenblum, S. A. Herrod, E. Witchel, and A. Gupta, “Complete computer system simulation: the SimOS approach,” IEEE Parallel & Distributed Technology: Systems & Applications 3, 34–43 (1995). [20] M. M. K. Martin, D. J. Sorin, B. M. Beckmann, M. R. Marty, M. Xu, A. R. Alameldeen, K. E. Moore, M. D. Hill, and D. A. Wood, “Multifacet’s general execution-driven multiprocessor simulator (GEMS) toolset,” ACM SIGARCH Computer Architecture News 33, 92 (2005). [21] K. P. Lawton, “Bochs: A Portable PC Emulator for Unix/X,” Linux J. 1996. [22] Jiun-Hung Ding, Po-Chun Chang, Wei-Chung Hsu, and Yeh-Ching Chung, “A Parallel Dynamic Binary Translation Design for Multi-core System Emulator,” presented at Workshop on Compiler Techniques for High-Performance and Embedded Computing, 2 June 2011, Taichung. [23] J. Renau, B. Fraguela, J. Tuck, W. Liu, M. Prvulovic, L. Ceze, S. Sarangi, P. Sack, K. Strauss, and P. Montesinos., “SESC: SuperESCalar Simulator,” http://sourceforge.net/projects/sesc/. [24] G. Schirner, A. Gerstlauer, and R. Dömer, “Fast and accurate processor models for efficient MPSoC design,” ACM Transactions on Design Automation of Electronic Systems 15, 1–26 (2010). [25] J. Elder and M. Hill, “Dinero IV Trace-Driven Uniprocessor Cache Simulator” (2003). [26] R. Iyer, “On modeling and analyzing cache hierarchies using CASPER,” in 11th IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer Telecommunications Systems, 2003, pp. 182–187. [27] R. A. Uhlig and T. N. Mudge, “Trace-driven memory simulation: a survey,” ACM Computing Surveys 29, 128–170 (1997). [28] “AMD SimNowTM Simulator,” AMD Developer Central, http://developer.amd.com/tools/simnow/Pages/default.aspx. [29] C. J. Mauer, M. D. Hill, and D. A. Wood, “Full-system timing-first simulation,” 2002, 108, ACM Press. [30] K. R. Irvine, Assembly Language for x86 Processors, 6th ed., Prentice Hall (2010). [31] S. C. Woo, M. Ohara, E. Torrie, J. P. Singh, and A. Gupta, “The SPLASH-2 programs: characterization and methodological considerations,” in 22nd Annual International Symposium on Computer Architecture, 1995. Proceedings, pp. 24–36.
|