National Digital Library of Theses and Dissertations in Taiwan

Author: Yi-Kai Chen (陳奕愷)
Title: Architecture Design of Energy-Efficient Reconfigurable Deep Convolutional Neural Network Accelerator (節能與可重組化深度卷積神經網路架構設計)
Advisor: Liang-Gee Chen (陳良基)
Committee members: 蔡宗漢, 劉宗德, 楊佳玲
Oral defense date: 2017-11-22
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Electronics Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Document type: Academic thesis
Publication year: 2018
Graduation academic year: 106 (2017-2018)
Language: English
Pages: 57
Keywords: deep learning; deep convolutional neural network; data reuse design; high-frame-rate architecture design; chip design
Cited by: 0
Views: 230
Downloads: 0
Bookmarked: 0
Research on deep convolutional neural networks has been under way for many years and has achieved striking results in computer vision, and the resulting applications have made our lives more convenient and intuitive. However, state-of-the-art deep convolutional neural networks typically contain millions of parameters and require billions of arithmetic operations, which prevents them from running efficiently on mobile devices and embedded systems.
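
The scale claimed above can be made concrete with a quick count. The sketch below uses assumed, VGG-16-style dimensions for one mid-network layer (56×56 feature map, 256 input and 256 output channels); these are illustrative figures, not numbers from the thesis:

```python
# Weight and MAC counts for a single 3x3 convolutional layer.
# Dimensions are an illustrative assumption (a mid-network
# VGG-16-style layer), not figures taken from the thesis.
def conv_layer_cost(h, w, cin, cout, k):
    params = cout * (cin * k * k + 1)   # k*k weights per input channel, plus a bias
    macs = h * w * cout * cin * k * k   # one MAC per weight per output position
    return params, macs

params, macs = conv_layer_cost(56, 56, 256, 256, 3)
# params == 590_080 for this one layer; a full network has many such layers
# macs == 1_849_688_064, i.e. about 1.85 billion MACs per input image
```

Summing such counts over every layer is what puts full networks at millions of parameters and billions of operations per inference.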

In this thesis, we present an application-specific integrated circuit (ASIC) architecture for deep convolutional neural networks, targeting applications that demand real-time computation and low power consumption. Such an architecture faces two main challenges. First, the computation requires a large number of memory accesses. Second, variations in data distribution and precision cause unnecessary power consumption during computation. Reducing memory accesses and performing arithmetic operations efficiently are therefore the focal points of the architecture design.
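
To see why memory access is the first-order concern, a simple access-count model helps. The sketch below is a rough illustration under assumed layer dimensions and an assumed on-chip buffering scheme; it is not the dataflow model actually used in the thesis:

```python
# Rough DRAM-access model for one conv layer under two schemes.
# All sizes and buffering assumptions are illustrative only.
def accesses_no_reuse(h, w, cin, cout, k):
    # Worst case: every MAC fetches its input and weight from DRAM and
    # reads/writes its partial sum there too (4 accesses per MAC).
    return 4 * h * w * cout * cin * k * k

def accesses_weight_stationary(h, w, cin, cout, k):
    # Weights loaded once and pinned on-chip; each input pixel is fetched
    # once per output channel (k*k window reuse happens on-chip); partial
    # sums accumulate on-chip and each output is written out once.
    weights = cout * cin * k * k
    inputs = h * w * cin * cout
    outputs = h * w * cout
    return weights + inputs + outputs

worst = accesses_no_reuse(56, 56, 256, 256, 3)            # 7_398_752_256
reused = accesses_weight_stationary(56, 56, 256, 256, 3)  # 206_913_536
# Keeping weights and partial sums on-chip cuts DRAM traffic by over 35x here.
```

Because a DRAM access costs orders of magnitude more energy than an on-chip arithmetic operation, gaps of this size in access counts translate directly into power.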

We use a systematic method to analyze which data reuse pattern best suits each layer of a deep convolutional neural network, and apply the optimal pattern per layer to reduce the number of memory accesses. We also propose a multiply-accumulate (MAC) implementation based on sign-and-magnitude arithmetic, and verify that this design achieves lower power consumption than a conventional two's-complement design.
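
The sign-and-magnitude idea can be sketched functionally. The model below assumes 8-bit operands and uses set-bit counts as a crude proxy for switching activity; both are assumptions for illustration, not the thesis's actual datapath:

```python
# Functional sketch of a sign-and-magnitude MAC, assuming 8-bit operands.
# The thesis's claim concerns switching power; this model only checks
# functional equivalence and uses set-bit counts as a crude toggle proxy.
def to_sign_mag(x, bits=8):
    sign = 1 if x < 0 else 0
    mag = abs(x) & ((1 << (bits - 1)) - 1)   # 7-bit magnitude
    return sign, mag

def sm_mac(acc, a, b):
    sa, ma = to_sign_mag(a)
    sb, mb = to_sign_mag(b)
    prod = ma * mb                                 # unsigned multiply on magnitudes
    return acc - prod if sa ^ sb else acc + prod   # sign handled separately

acc = 0
for a, b in [(3, -2), (-5, -4), (7, 1)]:
    acc = sm_mac(acc, a, b)        # accumulates 3*-2 + -5*-4 + 7*1

# Why this can save power: small negative values keep most bits at 0 in
# sign-magnitude, while two's complement sign-extends them to mostly 1s.
twos_bits = bin(-1 & 0xFF).count("1")   # -1 is 0b11111111: 8 set bits
sm_bits = 1 + bin(1).count("1")         # sign bit + magnitude 1: 2 set bits
```

Since CNN activations and weights cluster near zero in practice, a representation whose bit pattern stays sparse for small values tends to toggle fewer datapath wires per operation.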
Abstract ix
1 Introduction 1
1.1 The Applications of Deep Convolutional Neural Networks . . 1
1.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Thesis Organization . . . . . . . . . . . . . . . . . . . . . . . 4
2 Background 5
2.1 Machine Learning Overview . . . . . . . . . . . . . . . . . . 5
2.2 Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2.1 Modeling a Neuron . . . . . . . . . . . . . . . . . . . 7
2.2.2 Backpropagation Algorithm . . . . . . . . . . . . . . 9
2.2.3 Deep Learning . . . . . . . . . . . . . . . . . . . . . . 9
3 Convolutional Neural Network 13
3.1 Overview and Core Concepts . . . . . . . . . . . . . . . . . . 13
3.2 Important Features . . . . . . . . . . . . . . . . . . . . . . . 13
3.3 Building Blocks . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.4 Popular Convolutional Neural Network Model . . . . . . . . 19
4 Architecture Design and Implementation of CNN Accelerator 23
4.1 Design Challenges and Considerations . . . . . . . . . . . . . 23
4.2 Proposed Hardware Architecture . . . . . . . . . . . . . . . 24
4.2.1 Reduce DRAM Access . . . . . . . . . . . . . . . . . 27
4.2.2 Filter Adapted Dataflow . . . . . . . . . . . . . . . . 35
4.2.3 Energy Efficient Multiplier-Accumulator . . . . . . . 42
4.2.4 Architecture and Design Features . . . . . . . . . . . 43
4.2.5 Synthesis Results and Comparison . . . . . . . . . . . 46
5 Conclusion 49
Bibliography 50