|
|