臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)

Detailed Record

Researcher: Chia-Yi Chen (陳家儀)
Thesis Title: CORE-MAP: Feature Map Distributed Processing for Accelerating CNN Inference on Multi-Core Edge Devices
Advisor: Ya-Shu Chen (陳雅淑)
Oral Defense Committee: Jen-Wei Hsieh, Chin-Hsien Wu, Hsueh-Wen Tseng
Oral Defense Date: 2023-07-26
Degree: Master's
University: National Taiwan University of Science and Technology
Department: Department of Electrical Engineering
Discipline: Engineering
Academic Field: Electrical and Computer Engineering
Document Type: Academic thesis
Year of Publication: 2023
Graduation Academic Year: 111 (ROC calendar)
Language: English
Number of Pages: 40
Keywords: Edge computing, Distributed inference, Neural networks
Statistics:
  • Cited by: 0
  • Views: 92
  • Downloads: 0
  • Bookmarked: 0
Abstract:
Distributed edge computing is becoming popular because it offers lower transmission overhead and better privacy than cloud computing. However, each edge device's limited computing power, combined with the high computational demands of neural networks, makes distributed edge inference challenging. In this study, we explore layer partitioning, feature map partitioning, and computing resource partitioning in distributed edge inference. We then propose CORE-MAP, which distributes a given neural network across a set of edge devices while accounting for data dependencies and resource utilization. Experimental results show that CORE-MAP achieves a performance improvement of up to 283% over the non-distributed approach.
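The thesis itself is not reproduced on this record page, but the feature-map partitioning idea the abstract describes can be sketched in a few lines: the output rows of a convolutional layer are split among devices, and each device receives the overlapping input strip (a halo of kernel height minus one rows) it needs to compute its share independently. This is a minimal illustrative sketch, not the thesis's implementation; the function names (`conv2d_valid`, `split_rows`, `distributed_conv2d`) and the row-wise splitting strategy are assumptions for illustration.

```python
import numpy as np

def conv2d_valid(x, k):
    """Plain 'valid'-mode 2D convolution (cross-correlation), stride 1."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def split_rows(H, kh, n_parts):
    """Assign output rows to parts. Part computing output rows [lo, hi)
    needs the overlapping input rows [lo, hi + kh - 1) -- the halo."""
    out_rows = H - kh + 1
    bounds = np.linspace(0, out_rows, n_parts + 1, dtype=int)
    return [(lo, hi) for lo, hi in zip(bounds[:-1], bounds[1:]) if hi > lo]

def distributed_conv2d(x, k, n_parts):
    """Convolve each overlapping strip separately (one strip per device),
    then stack the partial outputs back into the full feature map."""
    kh = k.shape[0]
    pieces = []
    for lo, hi in split_rows(x.shape[0], kh, n_parts):
        strip = x[lo:hi + kh - 1, :]           # input strip incl. halo
        pieces.append(conv2d_valid(strip, k))  # would run on one device
    return np.vstack(pieces)

# The distributed result matches the single-device convolution exactly.
x = np.random.rand(32, 32)
k = np.random.rand(3, 3)
assert np.allclose(distributed_conv2d(x, k, 4), conv2d_valid(x, k))
```

Because only the halo rows are duplicated across devices, the per-device work shrinks roughly linearly with the number of partitions, which is the basic lever behind the speedups the abstract reports.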
Table of Contents
1 Introduction
2 System Model
3 Related Work
4 Approach
4.1 Response Time Analysis
4.2 Search Device Partition
4.2.1 Determine the number of distributed layers
4.2.2 Split the feature map across devices
4.3 Search Core Partition
5 Performance Evaluation
5.1 Experimental Environment
5.2 Experimental Results
6 Conclusion
References
Electronic Full Text (publicly available online from 2028-08-30)