National Digital Library of Theses and Dissertations in Taiwan

Researcher: 王聖融
Researcher (English): WANG, SHENG-RONG
Thesis Title: 移植ONNC在Cortex-M處理器上及其優化
Thesis Title (English): Porting ONNC and Optimization for Cortex-M Processors
Advisor: 張榮貴
Advisor (English): CHANG, RONG-GUEY
Committee Members: 陳敬、張榮貴、陳鵬升、簡廷軒
Committee Members (English): CHEN, JING; CHANG, RONG-GUEY; CHEN, PENG-SHENG; JIAN, TING-XUAN
Oral Defense Date: 2020-07-14
Degree: Master's
Institution: 國立中正大學 (National Chung Cheng University)
Department: 資訊工程研究所 (Computer Science and Information Engineering)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis Type: Academic thesis
Publication Year: 2020
Graduation Academic Year: 108
Language: Chinese
Pages: 42
Keywords (Chinese): 深度學習、神經網路編譯器(ONNC)、ONNX、Cortex-M、CMSIS
Keywords (English): deep learning, ONNC, ONNX, Cortex-M, CMSIS
Statistics:
  • Cited by: 1
  • Views: 348
  • Rating:
  • Downloads: 9
  • Bookmarks: 0
In recent years, as deep learning applications have multiplied, many vendors have been building application chips and deep learning accelerators on which deep learning models can be deployed. Because there are many deep learning training frameworks, and each stores models in its own way and format, porting models trained with different frameworks onto the same hardware requires additional, format-specific modifications, which is laborious. Moreover, running deep learning algorithms within the limited resources of an embedded system is also very difficult: real-time inference and the device's own memory constraints are key issues that must be addressed when deploying deep learning on embedded systems.

Using handwritten digit recognition as an example, this thesis stores the model architecture in the Open Neural Network Exchange (ONNX) format and uses the Open Neural Network Compiler (ONNC) to translate the relevant information in the model into parameters of functions in the CMSIS-NN library; the network described in ONNX is then realized with the functions the library provides, so that the deep learning model can be deployed on Cortex-M. On top of the original ONNC, the supported operators are extended and tested, and new passes are added to remind users of points that require attention. In addition, to meet the real-time requirements of embedded systems and reduce the memory footprint of deep learning on them, the model uses depthwise separable convolution to reduce the amount of computation during inference and max pooling for downsampling. Finally, memory usage and accuracy are compared with the model architecture proposed in MobileNet, achieving a balance between memory usage and accuracy while reducing inference time.

Many deep learning applications have been developed in recent years, and many vendors are building chips and deep learning accelerators on which deep learning models can be deployed. Because there are many deep learning training frameworks, each with its own way of storing models and its own model format, porting models trained with different frameworks onto the same hardware requires additional format-specific modifications, which is a laborious task. In addition, it is very hard to run deep learning algorithms within the limited resources of embedded systems: real-time inference and the memory limitations of the embedded device are important issues when deploying deep learning on embedded systems.

Using handwriting recognition as an example, this thesis stores the neural network structure in ONNX and uses ONNC to convert the relevant information in the model into parameters of the functions in the CMSIS-NN library; the provided functions are then used to realize the neural network architecture described in ONNX, so that the deep learning model can be deployed on Cortex-M. In addition, we extend the operators supported by the original ONNC, create a test flow for them, and add a pass that reminds users of points that require attention.
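To make the ONNX-to-CMSIS-NN mapping described above concrete, the sketch below hand-writes the kind of Cortex-M inference code such a mapping could produce for a small quantized (q7) MNIST-style network. The CMSIS-NN calls are the library's legacy q7 API from CMSIS 5; the layer shapes, quantization shifts, and buffer sizes are illustrative assumptions, not values taken from the thesis.

```c
/*
 * Minimal hand-written sketch of Cortex-M inference code for a small
 * quantized MNIST-style network: Conv -> ReLU -> MaxPool -> FC -> Softmax.
 * Layer shapes, quantization shifts, and buffer sizes are illustrative only.
 */
#include "arm_nnfunctions.h"   /* CMSIS-NN legacy q7 API (CMSIS 5) */

#define IM_DIM       28        /* MNIST input: 28x28x1 */
#define CONV_OUT_CH   8
#define CONV_KER      5
#define CONV_PAD      2
#define CONV_STRIDE   1
#define POOL_KER      2
#define POOL_OUT     14        /* 28 / 2 */
#define FC_IN        (POOL_OUT * POOL_OUT * CONV_OUT_CH)
#define FC_OUT       10

/* Quantized weights and biases would be filled in by the compiler. */
static const q7_t conv_wt[CONV_OUT_CH * 1 * CONV_KER * CONV_KER] = { 0 };
static const q7_t conv_bias[CONV_OUT_CH] = { 0 };
static const q7_t fc_wt[FC_OUT * FC_IN] = { 0 };
static const q7_t fc_bias[FC_OUT] = { 0 };

static q7_t  buf0[IM_DIM * IM_DIM * CONV_OUT_CH];   /* conv output */
static q7_t  buf1[FC_IN];                           /* pool output */
static q15_t scratch[FC_IN];                        /* im2col / FC scratch */

void mnist_inference(const q7_t *img, q7_t out[FC_OUT])
{
    arm_convolve_HWC_q7_basic(img, IM_DIM, 1, conv_wt, CONV_OUT_CH,
                              CONV_KER, CONV_PAD, CONV_STRIDE,
                              conv_bias, 0 /* bias_shift */, 7 /* out_shift */,
                              buf0, IM_DIM, scratch, NULL);
    arm_relu_q7(buf0, IM_DIM * IM_DIM * CONV_OUT_CH);
    arm_maxpool_q7_HWC(buf0, IM_DIM, CONV_OUT_CH, POOL_KER, 0, POOL_KER,
                       POOL_OUT, NULL, buf1);
    arm_fully_connected_q7(buf1, fc_wt, FC_IN, FC_OUT,
                           0 /* bias_shift */, 7 /* out_shift */,
                           fc_bias, out, scratch);
    arm_softmax_q7(out, FC_OUT, out);
}
```

In code of this shape, each ONNX node becomes one library call, and the node's attributes (kernel size, stride, padding) together with the chosen quantization shifts become the call's parameters.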

To meet the real-time requirements of embedded systems and reduce the memory usage of deep learning on them, this work uses depthwise separable convolution to reduce computation and max pooling for downsampling. We then compare memory usage and accuracy with MobileNet, trading off memory against accuracy while reducing inference time.
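The saving from depthwise separable convolution can be quantified with the cost model from the MobileNet paper: a standard convolution costs D_K*D_K*M*N*D_F*D_F multiply-accumulates (kernel D_K, M input channels, N output channels, D_F*D_F output positions), while the depthwise step plus the 1x1 pointwise step costs D_K*D_K*M*D_F*D_F + M*N*D_F*D_F. The short program below evaluates both for one illustrative layer shape; the numbers are an example, not the layer sizes of the thesis's MNIST model.

```c
/*
 * Multiply-accumulate (MAC) counts: standard convolution versus
 * depthwise separable convolution (MobileNet cost model).
 * The layer shape below is illustrative only.
 */
#include <stdio.h>

int main(void)
{
    const long dk = 3;   /* kernel size D_K        */
    const long df = 14;  /* output feature map D_F */
    const long m  = 8;   /* input channels M       */
    const long n  = 16;  /* output channels N      */

    long standard  = dk * dk * m * n * df * df;  /* D_K^2 * M * N * D_F^2 */
    long depthwise = dk * dk * m * df * df;      /* per-channel 3x3 step  */
    long pointwise = m * n * df * df;            /* 1x1 combination step  */

    printf("standard convolution: %ld MACs\n", standard);
    printf("depthwise separable:  %ld MACs\n", depthwise + pointwise);
    printf("reduction factor:     %.2fx\n",
           (double)standard / (double)(depthwise + pointwise));
    return 0;
}
```

For this shape the separable form needs about 5.8x fewer multiply-accumulates (the general reduction factor is 1/N + 1/D_K^2), which is the source of the reduced inference time that the thesis weighs against memory usage and accuracy in the comparison with MobileNet.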

1. Introduction
2. Background
2.1 Convolutional Neural Network
2.2 Open Neural Network Exchange
2.3 CMSIS-NN
2.4 Open Neural Network Compiler
2.5 Mbed
2.6 Deep Learning Accelerators
3. Related Work
3.1 Enabling Deep Learning at the IoT Edge
3.2 TVM
3.3 ONNC
3.4 Efficient Convolutional Neural Networks
4. Implementation and Improvement
4.1 Development and Experimental Environment
4.2 Verification Flow for Neural Network Operators
4.3 Depthwise Separable Convolution and Its Weight Ordering
4.4 Calibration Computation Flow
4.5 Warning Feature
4.6 Optimization of the Deep Learning Model (MNIST)
5. Experimental Results
6. Conclusion
7. References

