
National Digital Library of Theses and Dissertations in Taiwan


Detailed Record

Author: 邱之宇 (Jr-Yu Chiou)
Title: TinyissimoYOLOv5-P4-DA: A Depth Pruning, Auxiliary Network, and Quantization-Based Object Detection Model
Advisor: 陳慶瀚 (Ching-Han Chen)
Degree: Master's
Institution: National Central University
Department: International Master's Program in Artificial Intelligence
Discipline: Computer Science
Field: Software Development
Thesis Type: Academic thesis
Year of Publication: 2024
Graduation Academic Year: 112 (2023/24)
Language: English
Number of Pages: 84
Keywords: Depth Pruning, Auxiliary Network, Quantization, Object Detection, TinyML, TFLITE
Usage statistics:
  • Times cited: 0
  • Views: 19
  • Downloads: 0
  • Bookmarked: 0
Abstract:
Object detection is widely applied in computer vision, yet its high computational demands typically rely on powerful hardware, which poses a significant challenge for resource-constrained microcontrollers. Building on YOLOv5 and TinyissimoYOLO, this research proposes an improved TYv5-P4 model and, through depth pruning, an auxiliary network, and quantization, reduces its size to 334 KiB; the resulting model is named TYv5-P4-DA. Even with low-resolution inputs, the model maintains relatively high accuracy.
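As a concrete illustration of the quantization step (see Section 3.5, "Quantization by Converting to TF Lite," in the table of contents), the following is a minimal sketch of post-training full-integer quantization with the TensorFlow Lite converter. The SavedModel path, the 96x96 input resolution, and the random calibration loop are hypothetical placeholders, not the thesis's actual pipeline.

    # Minimal sketch: post-training full-integer (int8) quantization with
    # the TensorFlow Lite converter. "tyv5_p4_saved_model" and the random
    # calibration images are hypothetical placeholders.
    import numpy as np
    import tensorflow as tf

    def representative_dataset():
        # A small set of sample inputs drives int8 range calibration;
        # in practice these would be real training images.
        for _ in range(100):
            yield [np.random.rand(1, 96, 96, 3).astype(np.float32)]

    converter = tf.lite.TFLiteConverter.from_saved_model("tyv5_p4_saved_model")
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    # Force int8-only kernels so the model can run on integer-only MCUs.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8

    with open("tyv5_p4_da_int8.tflite", "wb") as f:
        f.write(converter.convert())

Full-integer conversion of this kind stores weights and activations as int8, which is what brings a floating-point model down to the few-hundred-KiB range reported above.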
The innovation of TYv5-P4-DA lies in its backbone, which retains only three C3 stages and uses only the P4 feature map as output. This design not only improves accuracy at lower input resolutions but also effectively reduces model size. Furthermore, unlike TinyissimoYOLO, whose mAP declines as the input size increases, TYv5-P4-DA's mAP improves with larger inputs. The model is trained on high-resolution images and performs inference on low-resolution images, which significantly improves object detection accuracy.
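To make the pruned topology concrete, here is a schematic PyTorch sketch of a YOLOv5-style backbone truncated to three C3 stages with a single P4 (stride-16) detection output, as described above. The channel widths, the simplified C3 module, and the assumed 3-anchor, 20-class head are illustrative assumptions, not the thesis's exact configuration.

    # Schematic sketch (not the thesis's exact layers): a YOLOv5-style
    # backbone cut down to three C3 stages with a single P4 output head.
    import torch
    import torch.nn as nn

    class CBS(nn.Module):
        """Conv + BatchNorm + SiLU, the basic YOLOv5 block (Section 2.4.1)."""
        def __init__(self, c_in, c_out, k=3, s=1):
            super().__init__()
            self.conv = nn.Conv2d(c_in, c_out, k, s, k // 2, bias=False)
            self.bn = nn.BatchNorm2d(c_out)
            self.act = nn.SiLU()

        def forward(self, x):
            return self.act(self.bn(self.conv(x)))

    class C3(nn.Module):
        """Simplified CSP bottleneck: split, process one branch, re-merge."""
        def __init__(self, c, n=1):
            super().__init__()
            self.cv1 = CBS(c, c // 2, k=1)
            self.cv2 = CBS(c, c // 2, k=1)
            self.m = nn.Sequential(*[CBS(c // 2, c // 2) for _ in range(n)])
            self.cv3 = CBS(c, c, k=1)

        def forward(self, x):
            return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1))

    class TYv5P4Sketch(nn.Module):
        """Depth-pruned backbone: three C3 stages, one P4 (stride-16) head."""
        def __init__(self, num_outputs=3 * (5 + 20)):  # assumed: 3 anchors x (box + obj + 20 classes)
            super().__init__()
            self.stem = CBS(3, 16, s=2)                              # /2
            self.stage1 = nn.Sequential(CBS(16, 32, s=2), C3(32))    # /4
            self.stage2 = nn.Sequential(CBS(32, 64, s=2), C3(64))    # /8
            self.stage3 = nn.Sequential(CBS(64, 128, s=2), C3(128))  # /16 -> P4
            self.head = nn.Conv2d(128, num_outputs, kernel_size=1)

        def forward(self, x):
            return self.head(self.stage3(self.stage2(self.stage1(self.stem(x)))))

    model = TYv5P4Sketch()
    print(model(torch.zeros(1, 3, 96, 96)).shape)  # torch.Size([1, 75, 6, 6])

With a 96x96 input, the single P4 head yields a 6x6 prediction grid. Cutting the backbone at P4 removes the deeper stages and the multi-scale neck that a full YOLOv5 carries, which is where most of the size reduction in such a design comes from.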
This result opens new possibilities for low-power, low-cost TinyML applications and has broad practical value. Future work will focus on further optimizing the model, improving accuracy and inference speed to meet the demands of more real-world application scenarios.
Abstract (Chinese) i
Abstract ii
Acknowledgments iii
Table of Contents iv
List of Figures vi
List of Tables viii
Chapter 1 Introduction 1
1.1 Research Background 1
1.2 Thesis Structure 3
Chapter 2 Related Work 4
2.1 TinyML 4
2.1.1 TensorFlow Lite (TFLITE) 5
2.1.2 Quantization 6
2.2 TinyissimoYOLO 9
2.2.1 Benchmark 9
2.2.2 Advantages and Disadvantages of TinyissimoYOLO 9
2.3 Depth Pruning with Auxiliary Networks 11
2.3.1 Unstructured Pruning and Structured Pruning 12
2.3.2 Depth Pruning with Auxiliary Networks 13
2.4 YOLOv5 14
2.4.1 Convolution Module - ConvBNSiLU - CBS 19
2.4.2 CSPNet 19
2.4.3 SPPF (Spatial Pyramid Pooling Fast) 20
2.4.4 Mosaic Augmentation 22
2.4.5 CIoU Loss 22
2.4.6 Focal Loss 28
2.4.7 NMS 30
2.4.8 mAP 32
Chapter 3 TinyissimoYOLOv5-P4-DA 39
3.1 TinyissimoYOLOv5-P4-DA (TYv5-P4-DA) 39
3.2 Pretrain TYv5-P4 41
3.3 Transfer Learning 50
3.4 Depth Pruning and Auxiliary Network Learning 51
3.5 Quantization by Converting to TF Lite 55
Chapter 4 Experiments 56
4.1 Experimental Environment 56
4.2 Benchmark 56
4.3 Depth Pruning and Auxiliary Networks 57
4.4 Training with large images, inferring with small images 59
4.5 Creating pretrained weights using COCO 60
4.6 Comparison of inference results with different image sizes 64
Chapter 5 Conclusion 66
Chapter 6 References 67
Electronic full text: publicly available online from 2029-07-22.