National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

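Author: 胡雨霖
Author (English): Yu-Lin Hu
Thesis title: 對於捲積神經網路通用型加速器研究與設計
Thesis title (English): General Accelerator Study and Design for Convolutional Neural Network
Advisor: 周哲民
Advisor (English): Jer-Min Jou
Degree: Master's
Institution: National Cheng Kung University
Department: Department of Electrical Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2019
Academic year of graduation: 107
Language: Chinese
Number of pages: 44
Keywords (Chinese): 捲積神經網路; 系統設計; FPGA; ASIC
Keywords (English): Convolutional Neural Networks (CNN); System Design; FPGA; ASIC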
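How to make effective use of the big data brought about by technological development has become one of the important questions in current research. Convolutional neural networks, thanks to their architectural scalability, structural flexibility, and very low error rates, have become one of today's research hot spots. However, the total computation for a single forward pass of the latest convolutional neural networks often exceeds several billion operations, so even new high-speed general-purpose processors cannot avoid very long computation latency. Although GPU-accelerated solutions have appeared on the desktop, the development of embedded mobile devices makes an accelerator that can process convolutional neural networks quickly on wearable devices increasingly necessary.
We therefore analyze current convolutional neural network architectures and their accelerator implementations on embedded systems, and use the parameter reusability and loop interchangeability of convolutional neural networks as the basis for the parallel hardware design. We ultimately propose a general-purpose accelerator that can assist in processing a variety of convolutional neural network architectures, and we verify it experimentally as a system on chip. Compared with other embedded designs, our design has higher hardware utilization and a smaller area.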
The hardware design of convolutional neural networks (CNNs) faces the following problems: high computational complexity, a large amount of data movement, and structural divergence among different neural networks. Previous work has dealt well with the first two problems but has not given the third broad consideration. After analyzing state-of-the-art CNN accelerators and the design space they exploit, we try to develop a format that can describe the full design space. Based on our design space exploration and hardware evaluation, we propose a novel general CNN hardware accelerator, which contains hierarchical memory storage, a two-dimensional set of hardware processing units with variable length and width, and an elastic data distributor. In FPGA results, our work shows higher multiplier utilization than previous FPGA designs. In ASIC synthesis estimates, our work is as efficient as two other recent works.
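The loop interchangeability that the abstract names as a basis for hardware parallelism can be illustrated with a minimal sketch. This is not the thesis's design: the function name `conv_layer`, the loop labels, and the stride-1, no-padding assumption are all hypothetical choices made for illustration. The sketch shows that the six loops of a direct convolution can be executed in any order with the same result and the same number of multiply-accumulate (MAC) operations.

```python
from itertools import product


def conv_layer(inp, weights, order=("m", "y", "x", "c", "i", "j")):
    """Direct convolution (stride 1, no padding) with a configurable loop order.

    inp:     C x H x W nested lists (input feature maps)
    weights: M x C x K x K nested lists (M filters)
    order:   a permutation of the six loop labels (hypothetical names):
             m = output channel, c = input channel,
             y, x = output row/column, i, j = kernel row/column
    Returns (output feature maps, number of MAC operations).
    """
    C, H, W = len(inp), len(inp[0]), len(inp[0][0])
    M, K = len(weights), len(weights[0][0])
    out_h, out_w = H - K + 1, W - K + 1
    bounds = {"m": M, "c": C, "y": out_h, "x": out_w, "i": K, "j": K}

    out = [[[0] * out_w for _ in range(out_h)] for _ in range(M)]
    macs = 0
    # Iterate the six loops in the requested nesting order.
    for idx in product(*(range(bounds[name]) for name in order)):
        v = dict(zip(order, idx))
        m, c, y, x, i, j = (v[n] for n in "mcyxij")
        # Each weight is reused at every (y, x) output position.
        out[m][y][x] += weights[m][c][i][j] * inp[c][y + i][x + j]
        macs += 1
    return out, macs


if __name__ == "__main__":
    inp = [[[c + 4 * y + x for x in range(4)] for y in range(4)] for c in range(2)]
    w = [[[[m + c + i - j for j in range(3)] for i in range(3)]
          for c in range(2)] for m in range(3)]
    a, n_a = conv_layer(inp, w)
    b, n_b = conv_layer(inp, w, order=("i", "j", "c", "x", "y", "m"))
    print(a == b, n_a, n_b)
```

Because every iteration is an independent accumulation, the MAC count is always M × C × out_h × out_w × K × K regardless of loop order. The order only changes which operands stay fixed in the inner loops, and that is what a hardware design can exploit when choosing which loops to unroll in parallel.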
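Abstract (Chinese)
General Accelerator Study and Design for Convolutional Neural Network
SUMMARY
INTRODUCTION
GENERAL CNN ACCELERATOR OVERVIEW
CONCLUSION
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Thesis Organization
Chapter 2 Background and Related Work
2.1 History of Convolutional Neural Networks
2.2 Architecture and Characteristics of Convolutional Neural Networks
2.3 Related Work
Chapter 3 Design Space Exploration for a General Convolutional Accelerator
3.1 Parallelism Analysis of the Convolutional Layer
3.2 Hardware Design Goals
3.3 Basic Computing Architecture and Datapath of the General Convolutional Accelerator
Chapter 4 Architecture of the General Convolutional Accelerator
4.1 Overview of the General Convolutional Accelerator
4.2 Multiplier Unit Set
4.3 Adder Unit Set
4.4 Control Unit Architecture
4.5 Memory Unit Architecture
Chapter 5 Experimental Environment and Data Analysis
5.1 Development Platform
5.2 Experimental Method and Input/Output Configuration
5.3 Hardware Synthesis Results
5.4 Experimental Data and Analysis of Results
Chapter 6 Conclusion and Future Work
References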
[1] A. Krizhevsky, I. Sutskever, G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” the 25th International Conference on Neural Information Processing Systems, Volume 1, pp. 1097-1105, Dec. 2012.
[2] R. M. French, “The Turing Test: The First Fifty Years,” Trends in Cognitive Sciences, 4(3), pp. 115-121, 2000.
[3] D. H. Hubel, T. N. Wiesel, “Receptive Fields and Functional Architecture of Monkey Striate Cortex,” J. Physiol., 195, pp. 215-243, 1968.
[4] K. Fukushima, “Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position,” Biol. Cybernetics 36, pp. 193-202, 1980.
[5] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, “Gradient-Based Learning Applied to Document Recognition,” Proc. of the IEEE, vol. 86, iss. 11, Nov. 1998.
[6] G. E. Hinton, R. R. Salakhutdinov, “Reducing the Dimensionality of Data with Neural Networks,” Science, vol. 313, iss. 5786, pp. 504-507, Jul. 2006.
[7] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich, “Going deeper with convolutions,” 2015 IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), June 2015.
[8] R. K. Srivastava, K. Greff, J. Schmidhuber, “Training Very Deep Networks,” Dec. 2015.
[9] K. He, X. Zhang, S. Ren, J. Sun, “Deep Residual Learning for Image Recognition,” Microsoft Research, Dec. 2015.
[10] Y. Ma, Y. Cao, S. Vrudhula, J. S. Seo, “Optimizing Loop Operation and Dataflow in FPGA Acceleration of Deep Convolutional Neural Networks.”
[11] Y. H. Chen, T. Krishna, J. S. Emer, V. Sze, “Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks,” IEEE Jour. of Solid-State Circuits, vol. 52, no. 1, pp. 127-138, Jan. 2017.
[12] X. Wei, C. H. Yu, P. Zhang, Y. Chen, Y. Wang, H. Hu, Y. Liang, J. Cong, “Automated Systolic Array Architecture Synthesis for High Throughput CNN Inference on FPGAs,” Falcon Comp. Solutions Inc., Jun. 2017.
[13] W. Lu, G. Yan, J. Li, S. Gong, Y. Han, X. Li, “FlexFlow: A Flexible Dataflow Accelerator Architecture for Convolutional Neural Networks,” IEEE Intern. Sympos. on High Perfor. Compu. Architecture, pp. 553-564, Feb. 2017.
[14] Y. J. Lin, T. S. Chang, “Data and Hardware Efficient Design for Convolutional Neural Network,” IEEE Transactions on Circuits and Systems–I: Regular Papers, vol. 65, no. 5, May 2018.
[15] A. Azizimazreah, L. Chen, “Flexible On-chip Memory Architecture for DCNN Accelerators,” AIM, USA, Sep. 2017.
[16] X. Yang, M. Gao, J. Pu, A. Nayak, Q. Liu, S. E. Bell, J. O. Setter, K. Cao, H. Ha, C. Kozyrakis, M. Horowitz, “DNN Dataflow Choice Is Overrated,” Sep. 2018.
[17] Terasic, “DE2i-150 Development Kit FPGA System User Manual,” 2013.
[18] Altera, “Cyclone IV Device Handbook, Volume 1,” Altera Corporation, March 2016.