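Author: 黃健宸
Author (English): HUANG, CHIEN-CHEN
Thesis title (Chinese): 權重共享深度神經網路加速器結合記憶體內運算之感測機制與感測放大器設計
Thesis title (English): Design of Sensing Scheme and Sense Amplifier for CIM-Based Weight-Sharing DNN Accelerator
Advisor: 王進賢
Advisor (English): WANG, JINN-SHYAN
Oral defense committee: 王進賢; 林泰吉; 葉經緯; 黃俊銘
Oral defense committee (English): WANG, JINN-SHYAN; LIN, TAY-JYI; YEH, CHING-WEI; HUANG, CHUN-MING
Oral defense date: 2020-07-28
Degree: Master's
Institution: National Chung Cheng University (國立中正大學)
Department: Graduate Institute of Electrical Engineering (電機工程研究所)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2020
Graduation academic year: 108 (ROC calendar)
Language: Chinese
Number of pages: 60
Keywords (Chinese): 記憶體內運算; 深度神經網路; 權重共享; 感測機制; 感測放大器
Keywords (English): Computing-in-memory; Deep neural network; Weight-sharing; Sensing Scheme; Sense Amplifier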
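Abstract (translated from Chinese):
Computing-in-memory (CIM) is commonly used in deep neural network (DNN) accelerators to increase the energy efficiency of multiply-and-accumulate (MAC) operations. A DNN accelerator architecture developed with the CIM-based Weight-Sharing technique can greatly reduce the cycle count of the overall DNN MAC operations, by nearly 1/16. This research mainly designs the sensing scheme and sense amplifier in the CIM so that reading a result also performs the summation.
In the design of the sensing scheme and sense amplifier: (1) an 8T bitcell is used to avoid write disturb and the nonlinearity of RBL discharge; (2) the sensing scheme is fully digital, which, unlike conventional CIM, reduces the extra variation introduced by the data input circuit and analog circuits; (3) the reference voltage generator produces the reference voltage by replicating the 2T read port of the 8T bitcell and is laid out next to the read port, lowering local mismatch between transistors and reducing variation; (4) a Differential Current-Latch Sense Amplifier, which offers low power, high sensing speed, and no effect on the RBL voltage, is chosen as the sense amplifier for the sensing scheme; (5) based on the discharge characteristics of the CIM, the sensing dead zone of the SA is resolved and the worst-case delay of a single sensing operation is within 0.925 ns, allowing the CIM-based Weight-Sharing DNN to operate at 100 MHz.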

Abstract (English):
Computing-in-memory (CIM) is used in deep neural network (DNN) accelerators to improve the energy efficiency of multiply-and-accumulate (MAC) operations.
In a DNN accelerator architecture built on CIM-based Weight-Sharing technology, the overall DNN MAC operations, which take up much of the operation time, can be greatly reduced, by nearly 1/16. This research mainly designs the sensing scheme and sense amplifier in the CIM so that the summation is produced by the read operation itself.
The design of the sensing scheme and sense amplifier has five features: (1) an 8T bitcell is used to avoid write disturb and the non-linearity of RBL discharge; (2) the fully digital sensing scheme differs from prior CIM designs and reduces the additional variation caused by the digital-to-analog converter and analog circuits; (3) in the reference voltage generator, the reference voltage is generated by a replica of the 2T read port of the 8T bitcell, placed next to the read port to reduce variation; (4) the Differential Current-Latch Sense Amplifier, which has low power consumption and high sensing speed and does not affect the input RBL signal, is selected as the sense amplifier; (5) based on the discharge characteristics of the CIM, the sensing dead zone of the SA is resolved, and the worst-case delay of a single sensing operation is within 0.925 ns, so that the overall circuit can operate at 100 MHz.
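As a rough illustration of item (5) and of the successive-approximation step listed in the thesis outline (Section 4.3.2), the Python sketch below checks how many 0.925 ns sensing operations fit in one 100 MHz clock period, and models a comparison-only search for the number of discharging cells. Only the two timing figures come from the abstract. The function names, the 16-cell limit, and the idealised monotonic comparison are assumptions made for illustration; the record does not describe the actual circuit behaviour at this level of detail.

```python
import math
from typing import Tuple

CLOCK_HZ = 100e6        # operating frequency quoted in the abstract
SENSE_DELAY_NS = 0.925  # worst-case single sensing delay quoted in the abstract


def sensings_per_cycle(clock_hz: float = CLOCK_HZ,
                       sense_delay_ns: float = SENSE_DELAY_NS) -> int:
    """Upper bound on back-to-back sensing operations within one clock period."""
    period_ns = 1e9 / clock_hz
    return math.floor(period_ns / sense_delay_ns)


def sar_count(active_cells: int, max_cells: int = 16) -> Tuple[int, int]:
    """Find the number of discharging cells using only yes/no comparisons.

    Returns (estimated count, number of comparisons used).
    max_cells = 16 is a hypothetical value, not taken from the record.
    """
    def compare(ref: int) -> bool:
        # Stand-in for the sense amplifier: has the RBL discharged at least
        # as far as a replica path with `ref` cells would? Idealised model.
        return active_cells >= ref

    low, high, steps = 0, max_cells, 0
    while low < high:
        mid = (low + high + 1) // 2  # upper midpoint so the range always shrinks
        steps += 1
        if compare(mid):
            low = mid
        else:
            high = mid - 1
    return low, steps


if __name__ == "__main__":
    print("sensing operations per cycle:", sensings_per_cycle())
    print("count and comparisons for 11 active cells:", sar_count(11))
```

Under these assumptions a 17-level search needs at most five comparisons, which at 0.925 ns each stays well inside the 10 ns clock period.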

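Abstract (Chinese)
ABSTRACT
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Contributions
1.3 Thesis Organization
Chapter 2 Deep Neural Networks
2.1 Conventional Fully Connected DNN Accelerator
2.2 Weight-Sharing DNN Accelerator
2.2.1 Weight Vector Quantization
2.2.2 Hardware Architecture
Chapter 3 Computing-in-Memory
3.1 Research Background
3.2 Circuit Design
3.2.1 Digital-to-Analog Converter (DAC)
3.2.2 Pulse-Controlled Word Line (WL)
3.2.3 Reference Voltage Generator for the SAR ADC
Chapter 4 Sensing Scheme for the CIM-based WSDNN
4.1 Design Considerations
4.2 Multiple WL Access
4.2.1 Write Disturb [1]
4.2.2 RBL Discharge Characteristics [2]
4.3 Sensing Scheme Design
4.3.1 Replica Circuit of the Discharge Path
4.3.2 Successive Approximation
4.4 Circuit Implementation
4.4.1 Reference Code Controller
4.4.2 Always-on sensing scheme
4.4.3 Dynamic sensing scheme
4.4.4 Version Comparison
Chapter 5 Sense Amplifier for the CIM-based WSDNN
5.1 Design Considerations
5.2 Conventional Sense Amplifier Designs
5.2.1 Conventional Current Mode Sense Amplifier [14]
5.2.2 Clamp Bit-Line Sense Amplifier [15]
5.2.3 Simple Four Transistor Sense Amplifier [16]
5.2.4 PMOS Bias Type Sense Amplifier [17]
5.2.5 Differential Latch Type Sense Amplifier [18]
5.2.6 Comparison and Analysis
5.3 Latch-Type Sense Amplifier Applied to Computing-in-Memory
5.3.1 Sensing Dead Zone
5.3.2 Improving the SA Operating Voltage Range
5.3.3 Sensing Delay at the Worst Corner
5.3.4 Improving Sensing Delay with a Two-stage Sense Amplifier
5.4 Circuit Layout and Monte Carlo Analysis
5.4.1 Circuit Layout
5.4.2 Monte Carlo Analysis
Chapter 6 Circuit Simulation Results
Chapter 7 Conclusions and Future Work
7.1 Conclusions
7.2 Future Work
References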


[1]X. Si et al., "A Twin-8T SRAM Computation-in-Memory Unit-Macro for Multibit CNN-Based AI Edge Processors," in IEEE Journal of Solid-State Circuits, vol. 55, no. 1, pp. 189-202, Jan. 2020
[2]Q. Dong et al., "15.3 A 351TOPS/W and 372.4GOPS Compute-in-Memory SRAM Macro in 7nm FinFET CMOS for Machine-Learning Applications," IEEE International Solid-State Circuits Conference - (ISSCC), pp. 242-244, Feb. 2020
[3]S. K. Gonugondla, M. Kang and N. R. Shanbhag, "A Variation-Tolerant In-Memory Machine Learning Classifier via On-Chip Training," in IEEE Journal of Solid-State Circuits, vol. 53, no. 11, pp. 3163-3173, Nov. 2018
[4]W. Khwa et al., "A 65nm 4Kb algorithm-dependent computing-in-memory SRAM unit-macro with 2.3ns and 55.8TOPS/W fully parallel product-sum operation for binary DNN edge processors," 2018 IEEE International Solid - State Circuits Conference - (ISSCC), pp. 496-498, 2018
[5]P. N. Whatmough, S. K. Lee, H. Lee, S. Rama, D. Brooks and G. Wei, "14.3 A 28nm SoC with a 1.2GHz 568nJ/prediction sparse deep-neural-network engine with >0.1 timing error rate tolerance for IoT applications," IEEE International Solid-State Circuits Conference (ISSCC), pp. 242-243, 2017
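[6]柯得生 (2019). Quality-adjustable weight-sharing method for deep neural networks (in Chinese). National Chung Cheng University, Chiayi County.
[7]S. Han, H. Mao, and W. J. Dally, "Deep compression: Compressing deep neural network with pruning, trained quantization and huffman coding," in Proc. CoRR, abs/1510.00149, 2, 2015.
[8]莊俊輝 (2020). Design and implementation of a vector-quantization deep neural network accelerator (in Chinese). National Chung Cheng University, Chiayi County.
[9]徐紹軒 (2020). Variation-aware index memory design for weight-sharing deep neural network accelerators (in Chinese). National Chung Cheng University, Chiayi County.
[10]洪士芳 (2020). Computing-in-memory-based variation-aware data memory for weight-sharing deep neural network accelerators (in Chinese). National Chung Cheng University, Chiayi County.
[11]曾士庭 (2020). Integration and design considerations of a low-power weight-sharing deep neural network accelerator (in Chinese). National Chung Cheng University, Chiayi County.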
[12]X. Si et al., "Circuit Design Challenges in Computing-in-Memory for AI Edge Devices," IEEE 13th International Conference on ASIC (ASICON), pp. 1-4, 2019.
[13]A. Chrisanthopoulos et al., “Comparative study of different current mode sense amplifiers in submicron CMOS technology,” IEE Proceedings-Circuits Devices and Systems, vol. 149, no. 3, June, 2002
[14]T. Uetake, Y. Maki, T. Nakadai, K. Yoshida, M. Susuki and R. Nanjo, "A 1.0 ns access 770 MHz 36 Kb SRAM macro," 1999 Symposium on VLSI Circuits. Digest of Papers (IEEE Cat. No.99CH36326), Kyoto, Japan, 1999, pp. 109-110
[15]T. N. Blalock and R. C. Jaeger, "A high-speed clamped bit-line current-mode sense amplifier," in IEEE Journal of Solid-State Circuits, vol. 26, no. 4, pp. 542-548, April 1991
[16]E. Seevinck, P. J. van Beers and H. Ontrop, "Current-mode techniques for high-speed VLSI circuits with application to current sense amplifier for CMOS SRAM's," in IEEE Journal of Solid-State Circuits, vol. 26, no. 4, pp. 525-536, April 1991
[17]K. Sasaki et al., "A 7-ns 140-mW 1-Mb CMOS SRAM with current sense amplifier," in IEEE Journal of Solid-State Circuits, vol. 27, no. 11, pp. 1511-1518, Nov. 1992
[18]T. Seki et al., "A 6-ns 1-Mb CMOS SRAM with latched sense amplifier," in IEEE Journal of Solid-State Circuits, vol. 28, no. 4, pp. 478-483, April 1993
[19]Taehui Na, Seung-Han Woo, Jisu Kim, Hanwool Jeong, Seong-Ook Jung, "Comparative Study of Various Latch-Type Sense Amplifiers," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 22, no. 2, pp. 425-429, Feb. 2013.
[20]B. Wicht, T. Nirschl and D. Schmitt-Landsiedel, "Yield and speed optimization of a latch-type voltage sense amplifier," in IEEE Journal of Solid-State Circuits, vol. 39, no. 7, pp. 1148-1158, July 2004

Electronic full text (public Internet release date: 2025-08-27)