[1] A. Shafiee, A. Nag, N. Muralimanohar, R. Balasubramonian, J. P. Strachan, M. Hu, R. S. Williams, and V. Srikumar, "Isaac: A convolutional neural network accelerator with in-situ analog arithmetic in crossbars," ACM SIGARCH Computer Architecture News, vol. 44, no. 3, pp. 14–26, 2016.
[2] B. Li, Y. Wang, and Y. Chen, "Hitm: High-throughput reram-based pim for multi-modal neural networks," in Proceedings of the 39th International Conference on Computer-Aided Design, pp. 1–7, 2020.
[3] W.-T. Lin, H.-Y. Cheng, C.-L. Yang, M.-Y. Lin, K. Lien, H.-W. Hu, H.-S. Chang, H.-P. Li, M.-F. Chang, Y.-T. Tsou, et al., "Dl-rsim: A reliability and deployment strategy simulation framework for reram-based cnn accelerators," ACM Transactions on Embedded Computing Systems (TECS), vol. 21, no. 3, pp. 1–29, 2022.
[4] Y.-T. Tsou, K.-H. Chen, C.-L. Yang, H.-Y. Cheng, J.-J. Chen, and D.-Y. Tsai, "This is spatem! a spatial-temporal optimization framework for efficient inference on reram-based cnn accelerator," in 2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 702–707, IEEE, 2022.
[5] Y.-L. Zheng, W.-Y. Yang, Y.-S. Chen, and D.-H. Han, "An energy-efficient inference engine for a configurable reram-based neural network accelerator," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2022.
[6] T.-H. Yang, H.-Y. Cheng, C.-L. Yang, I.-C. Tseng, H.-W. Hu, H.-S. Chang, and H.-P. Li, "Sparse reram engine: Joint exploration of activation and weight sparsity in compressed neural networks," in Proceedings of the 46th International Symposium on Computer Architecture, pp. 236–249, 2019.
[7] M. Saberi, R. Lotfi, K. Mafinezhad, and W. A. Serdijn, "Analysis of power consumption and linearity in capacitive digital-to-analog converters used in successive approximation adcs," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 58, no. 8, pp. 1736–1748, 2011.
[8] A. Azamat, F. Asim, and J. Lee, "Quarry: Quantization-based adc reduction for reram-based deep neural network accelerators," in 2021 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pp. 1–7, IEEE, 2021.
[9] W. Fan, Y. Li, L. Du, L. Li, and Y. Du, "A 3-8bit reconfigurable hybrid adc architecture with successive-approximation and single-slope stages for computing in memory," in 2022 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 3393–3397, IEEE, 2022.
[10] Y. Hu, L. Hu, B. Tang, B. Li, Z. Wu, and X. Liu, "A 100 ks/s 8–10-bit resolution-reconfigurable sar adc for biosensor applications," Micromachines, vol. 13, no. 11, p. 1909, 2022.
[11] N. Binkert, B. Beckmann, G. Black, S. K. Reinhardt, A. Saidi, A. Basu, J. Hestness, D. R. Hower, T. Krishna, S. Sardashti, et al., "The gem5 simulator," ACM SIGARCH Computer Architecture News, vol. 39, no. 2, pp. 1–7, 2011.
[12] S. Xu, X. Chen, Y. Wang, Y. Han, X. Qian, and X. Li, "Pimsim: A flexible and detailed processing-in-memory simulator," IEEE Computer Architecture Letters, vol. 18, no. 1, pp. 6–9, 2018.
[13] M. Poremba and Y. Xie, "Nvmain: An architectural-level main memory simulator for emerging non-volatile memories," in IEEE Computer Society Annual Symposium on VLSI, pp. 392–397, IEEE, 2012.
[14] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "Imagenet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems, vol. 25, 2012.
[15] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[16] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9, 2015.
[17] P. Chi, S. Li, C. Xu, T. Zhang, J. Zhao, Y. Liu, Y. Wang, and Y. Xie, "Prime: A novel processing-in-memory architecture for neural network computation in reram-based main memory," ACM SIGARCH Computer Architecture News, vol. 44, no. 3, pp. 27–39, 2016.
[18] L. Song, X. Qian, H. Li, and Y. Chen, "Pipelayer: A pipelined reram-based accelerator for deep learning," in 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA), pp. 541–552, IEEE, 2017.
[19] T. Chou, W. Tang, J. Botimer, and Z. Zhang, "Cascade: Connecting rrams to extend analog dataflow in an end-to-end in-memory processing paradigm," in Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, pp. 114–125, 2019.
[20] Q. Zheng, Z. Wang, Z. Feng, B. Yan, Y. Cai, R. Huang, Y. Chen, C.-L. Yang, and H. H. Li, "Lattice: An adc/dac-less reram-based processing-in-memory architecture for accelerating deep convolution neural networks," in 2020 57th ACM/IEEE Design Automation Conference (DAC), pp. 1–6, IEEE, 2020.
[21] W. Li, P. Xu, Y. Zhao, H. Li, Y. Xie, and Y. Lin, "Timely: Pushing data movements and interfaces in pim accelerators towards local and in time domain," in 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), pp. 832–845, IEEE, 2020.
[22] Y. Zhao, Z. He, N. Jing, X. Liang, and L. Jiang, "Re2pim: A reconfigurable reram-based pim design for variable-sized vector-matrix multiplication," in Proceedings of the 2021 on Great Lakes Symposium on VLSI, pp. 15–20, 2021.
[23] W.-H. Chen, K.-X. Li, W.-Y. Lin, K.-H. Hsu, P.-Y. Li, C.-H. Yang, C.-X. Xue, E.-Y. Yang, Y.-K. Chen, Y.-S. Chang, et al., "A 65nm 1mb nonvolatile computing-in-memory reram macro with sub-16ns multiply-and-accumulate for binary dnn ai edge processors," in 2018 IEEE International Solid-State Circuits Conference (ISSCC), pp. 494–496, IEEE, 2018.
[24] C.-Y. Tsai, C.-F. Nien, T.-C. Yu, H.-Y. Yeh, and H.-Y. Cheng, "Repim: Joint exploitation of activation and weight repetitions for in-reram dnn acceleration," in 2021 58th ACM/IEEE Design Automation Conference (DAC), pp. 589–594, IEEE, 2021.
[25] Y. Zhang, Z. Jia, H. Du, R. Xue, Z. Shen, and Z. Shao, "A practical highly paralleled reram-based dnn accelerator by reusing weight pattern repetitions," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 41, no. 4, pp. 922–935, 2021.
[26] Z. Xu, D. Yang, C. Yin, J. Tang, Y. Wang, and G. Xue, "A co-scheduling framework for dnn models on mobile and edge devices with heterogeneous hardware," IEEE Transactions on Mobile Computing, 2021.
[27] D. Xu, M. Xu, Q. Wang, S. Wang, Y. Ma, K. Huang, G. Huang, X. Jin, and X. Liu, "Mandheling: Mixed-precision on-device dnn training with dsp offloading," in Proceedings of the 28th Annual International Conference on Mobile Computing and Networking, pp. 214–227, 2022.
[28] W.-Y. Yang, Y.-S. Chen, and J.-W. Xiao, "A lazy engine for high-utilization and energy-efficient reram-based neural network accelerator," in 2022 IEEE 20th International Conference on Industrial Informatics (INDIN), pp. 140–145, IEEE, 2022.
[29] E. Baek, D. Kwon, and J. Kim, "A multi-neural network acceleration architecture," in 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), pp. 940–953, IEEE, 2020.
[30] C. Li, X. Fan, X. Wu, Z. Yang, M. Wang, M. Zhang, and S. Zhang, "Memory-computing decoupling: A dnn multitasking accelerator with adaptive data arrangement," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 41, no. 11, pp. 4112–4123, 2022.