|
[1] L. Baraz et al., "IA-32 Execution Layer: A Two-Phase Dynamic Translator Designed to Support IA-32 Applications on Itanium-Based Systems," Proc. 36th Ann. IEEE/ACM Int’l Symp. Microarchitecture, IEEE CS Press, 2003, pp. 191-204.
[2] Alexander Klaiber, "The Technology Behind the Crusoe Processors," White Paper, http://www.transmeta.com/pdf/white_papers/paper_aklaiber_19jan00.pdf, Jan. 2000.
[3] James C. Dehnert, et al., "The Transmeta Code Morphing Software: Using Speculation, Recovery, and Adaptive Retranslation to Address Real-Life Challenges," Proceedings of the First Annual IEEE/ACM International Symposium on Code Generation and Optimization, March 2003, pp. 15-24.
[4] Kemal Ebcioglu, Erik R. Altman, Michael Gschwind, and Sumedh Sathaye, "Dynamic Binary Translation and Optimization," IEEE Trans. on Computers 50 (6), June 2001, pp. 529-548.
[5] Kemal Ebcioglu and Erik R. Altman, "DAISY: Dynamic Compilation for 100% Architectural Compatibility," Proc. of the 24th Annual Int’l Symp. on Computer Architecture, June 1997, pp. 26-37
[6] F. Bellard. "QEMU, a Fast and Portable Dynamic Translator," Proceedings of the USENIX Annual Technical Conference, FREENIX Track, pages 41–46, 2005.
[7] VMware, http://www.vmware.com/.
[8] E. R. Altman, D. Kaeli, and Y. Sheffer, "Welcome to the Opportunities of Binary Translation," IEEE Computer, March 2000, pp. 40-45.
[9] Java Virtual Machine, http://java.sun.com/.
[10] Khronos Open GL, http://www.khronos.org/opengl/.
[11] Microsoft DirectX 3D, http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=2858.
[12] Khronos OpenGL ES, http://www.khronos.org/opengles/.
[13] Windows Advanced Rasterization Platform (WARP), http://msdn.microsoft.com/en-us/library/gg615082.aspx.
[14] J. Nickolls, I. Buck, K. Skadron, and M. Garland, "Scalable Parallel Programming with CUDA," ACM Queue, vol. 6, no. 2, Mar./Apr. 2008, pp. 40-53.
[15] E. Lindholm, J. Nickolls, S. Oberman, J. Montrym, "NVIDIA Tesla: A Unified Graphics and Computing Architecture," IEEE Micro, vol. 28, no. 2, 2008, pp. 39-55.
[16] Jae-Sung Yoon, Chang-Hyo Yu, Donghyun Kim, Lee-Sup Kim, "A Dual-Shader 3-D Graphics Processor With Fast 4-D Vector Inner Product Units and Power-Aware Texture Cache," IEEE Trans. on VLSI Volume: 19, Issue:4, April 2011, pp. 525-537
[17] Yovits, Marshall C, “Advances in computers,” Academic Press. pp. 105–107.
[18] Liptak, Béla G. (2006), "Instrument Engineers' Handbook: Process control and optimization," 2, CRC Press, pp. 11–12.
[19] R. Bhargava et al., "Evaluating MMX Technology Using DSP and Multimedia Applications," In MICRO-31, Dec 1998.
[20] Texas Instruments (TI) DSP information, http://focus.ti.com/dsp/docs/dsphome.tsp?sectionId=46&DCMP=TIHeaderTracking&HQS=Other+OT+hdr_p_dsp.
[21] Chang, C.-W., Lin, T.-J., Wu, C.-J., Lee, J.-K., Chu, Y.-H., & Wu, A.-Y., "Parallel Architecture Core (PAC)—the first Multicore Application processor SoC in Taiwan part I: Hardware architecture & software development tools," Journal of Signal Processing Systems, Springer Science+Business Media, LLC, Mar. 2010.
[22] Yung-Chia Lin, Chung-Lin Tang, Chung-Ju Wu, Jenq-Kuen Lee, "Compiler Supports and Optimizations for PAC VLIW DSP Processors," Languages and Compilers for Parallel Computing, 2005.
[23] Wu, C. J., Chen, S. Y., & Lee, J. K., "Copy propagation optimizations for VLIW DSP processors with distributed register files," Languages and Compilers for Parallel Computing (LNCS 4382), pp. 251–266, Jun. 2007.
[24] Yung-Chia Lin, Yi-Ping You, Jenq-Kuen Lee, "Register Allocation for VLIW DSP Processors with Irregular Register Files," Proceedings of the 12th Workshop on Compilers for Parallel Computers (CPC 2006), Jan 9–11 2006.
[25] Lu, C. H., Lin, Y. J., You, Y. P., & Lee, J. K., "LC-GRFA: global register file assignment with local consciousness for VLIW DSP processors with non-uniform register files," Concurrency and Computation: Practice and Experience, 21, 2009, pp. 101–114.
|