

Title (English): To Assist Real Time Detection For Clothing Style and Accessories Based on Deep Learning YOLOv3
Advisor (English): WANG, CHEN-SHU
Keywords (English): Instant detection, Deep learning, YOLOv3, Clothing identification
This study uses the YOLOv3 deep learning object detection model to build a real-time clothing and accessories detection system. First, K-means clustering is combined with the CNN to determine suitable anchor box sizes: (81x42), (84x78), (94x120), (117x156), (133x177), (144x190), (160x216), (231x270), and (263x274). Next, the parameters of the first six layers of the pre-trained model are frozen, the learning rate is set to 0.001, and 22,000 training samples are used; these experimental settings identify the trade-off best suited to the model in this study. In the final model, with IOU (Intersection over Union) and NMS (Non-Maximum Suppression) thresholds both set to 0.5 and the confidence threshold set to 0.8, the system reaches 78.8% mAP at a detection speed of 125 frames per second, enabling real-time detection of the corresponding key frames.
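The anchor-box selection step described above can be sketched with a small K-means routine that clusters ground-truth box widths and heights under a 1 − IoU distance, as is common for YOLO-family models. This is a minimal illustration under that assumption, not the thesis's actual code; the box data in the usage example is synthetic.

```python
import numpy as np

def iou_wh(boxes, clusters):
    """IoU between (w, h) pairs, assuming boxes and clusters share a common center."""
    w = np.minimum(boxes[:, None, 0], clusters[None, :, 0])
    h = np.minimum(boxes[:, None, 1], clusters[None, :, 1])
    inter = w * h
    union = (boxes[:, 0] * boxes[:, 1])[:, None] \
        + (clusters[:, 0] * clusters[:, 1])[None, :] - inter
    return inter / union

def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    """Cluster ground-truth box sizes with 1 - IoU as the distance, YOLO-style."""
    rng = np.random.default_rng(seed)
    clusters = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        # Assign each box to the centroid it overlaps most (highest IoU).
        assign = np.argmax(iou_wh(boxes, clusters), axis=1)
        new = np.array([boxes[assign == j].mean(axis=0) if np.any(assign == j)
                        else clusters[j] for j in range(k)])
        if np.allclose(new, clusters):  # converged
            break
        clusters = new
    # Return the k anchors sorted by area, smallest first.
    return clusters[np.argsort(clusters[:, 0] * clusters[:, 1])]
```

Run over the study's 22,000 labeled boxes, a routine of this kind would yield nine anchors analogous to the (81x42) through (263x274) set reported above.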

In criminal investigations, searches for missing persons, and similar cases, the first step is often to review surveillance video footage; such real-time image data is undeniably an important basis for solving cases. Processing this information still relies heavily on people watching surveillance footage and searching for a specific target (such as a missing or dangerous person) on screen. Staying focused on the footage takes considerable time, and the fatigue caused by uninterrupted observation can cause key frames to be missed. As recognition technology for object detection has matured in recent years, the speed and accuracy of object recognition have greatly improved, and with monitoring equipment now ubiquitous, recognition technology has gradually developed toward real-time detection applications. Among these, clothing recognition systems have a wide range of uses. However, the existing literature does not address locating people in video footage by the specific clothing and accessories they wear. This study therefore applies real-time detection of clothing and accessory types to find key frames in video (for example, a missing person or a suspect under investigation), shortening search time and reducing omissions caused by human eye fatigue.
This research uses the YOLOv3 deep learning object detection model to build a real-time clothing and accessories detection system. First, K-means clustering is combined with the CNN to find suitable anchor box sizes: (81x42), (84x78), (94x120), (117x156), (133x177), (144x190), (160x216), (231x270), and (263x274). After freezing the parameters of the first six layers of the pre-trained model, setting the learning rate to 0.001, and using 22,000 training samples, among other experimental settings, the trade-off best suited to this study's model is found. In the final model, with IOU (Intersection over Union) and NMS (Non-Maximum Suppression) thresholds both set to 0.5 and confidence set to 0.8, the system reaches 78.8% mAP at a detection speed of 125 frames per second, achieving real-time detection of the corresponding key frames.
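The post-processing settings quoted above (confidence 0.8, IOU and NMS thresholds 0.5) correspond to the standard greedy non-maximum suppression procedure. The following is an illustrative sketch of that generic procedure, not the thesis's implementation; the thresholds default to the values reported in the abstract.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    areas_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + areas_b - inter)

def nms(boxes, scores, conf_thresh=0.8, iou_thresh=0.5):
    """Drop low-confidence boxes, then greedily suppress overlapping detections."""
    keep_conf = scores >= conf_thresh
    boxes, scores = boxes[keep_conf], scores[keep_conf]
    order = np.argsort(scores)[::-1]  # highest confidence first
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        # Suppress remaining boxes that overlap box i above the IoU threshold.
        order = order[1:][iou(boxes[i], boxes[order[1:]]) < iou_thresh]
    return boxes[keep], scores[keep]
```

Raising `conf_thresh` trades recall for precision, which is the kind of trade-off the experiments above tune.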
In the era of the Internet of Things, the number of surveillance cameras will inevitably grow, and making effective use of video footage can change patterns of daily life. For example, combined with an access control system, mask detection can determine whether a person is wearing a mask. Private monitoring equipment can also use this system to improve environmental safety by detecting unusual clothing or dangerous objects and issuing warning notices. In addition, a person's occupation or role can be inferred from the type and style of clothing worn, and products matching that occupation's or role's preferences can then be recommended based on the analysis results.


Abstract i
Acknowledgements v
Table of Contents vi
List of Tables viii
List of Figures ix
Chapter 1 Introduction 1
1.1 Research Background and Motivation 1
1.2 Research Objectives 6
1.3 Research Framework 8
1.4 Research Scope and Limitations 10
Chapter 2 Literature Review 11
2.1 Deep Learning 11
2.2 Convolutional Neural Networks 15
2.3 Related Work on Object Detection 20
2.4 Clothing Recognition 27
Chapter 3 Research Design and Methods 28
3.1 System Workflow 28
3.2 Data Preparation 29
3.3 Building and Training the Detection Model 30
3.4 Object Detection Algorithm 33
3.5 System Architecture 35
Chapter 4 Experimental Design and Analysis 36
4.1 Experimental Environment 36
4.2 Data Description and Processing 37
4.3 Experimental Design and Hyperparameter Tuning 37
4.3.1 Anchor Box Settings 39
4.3.2 Adjusting the Number of Frozen Pre-trained Layers 41
4.3.3 Adjusting Training Data Volume 43
4.3.4 Learning Rate Settings 45
4.3.5 Scene Differences in Video Detection 46
4.4 Analysis of Experimental Results 47
Chapter 5 Conclusion 54
5.1 Conclusion 54
5.2 Future Work 55
References 57



Electronic full text (publicly available online from 2025/06/29)