
National Digital Library of Theses and Dissertations in Taiwan


Detailed Record

Graduate Student: 廖婉晴
Graduate Student (English): LIAO, WAN-CHING
Thesis Title: FCED-CLIP演算法對人類草圖生成之研究
Thesis Title (English): A Research of FCED-CLIP Algorithm on Human Sketch Generation
Advisor: 劉遠楨
Advisor (English): Liu, Yuan-Chen
Committee Members: 廖文宏、陳國棟、劉遠楨
Committee Members (English): Liao, Wen-Hung; Chen, Gwo-Dong; Liu, Yuan-Chen
Defense Date: 2024-06-12
Degree: Master's
Institution: National Taipei University of Education
Department: Master's Program, Department of Computer Science
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis Type: Academic thesis
Publication Year: 2024
Graduation Academic Year: 112
Language: Chinese
Number of Pages: 45
Keywords (Chinese): 卷積神經網路、資料增強、草圖生成、CLIP
Keywords (English): Convolutional Neural Networks, Data Augmentation, Sketch Generation, CLIP
Usage statistics:
  • Cited: 0
  • Views: 21
  • Downloads: 5
  • Bookmarked: 0
With the widespread adoption of internet technology and mobile devices, images have become an important medium for conveying information. Traditional CBIR, which retrieves images by keyword search, no longer satisfies modern demands for speed and convenience, and so SBIR, which uses sketches as the query, has emerged. However, the shortage of sketch training data has slowed the progress of SBIR research.
To address this problem, this study proposes the FCED-CLIP model, which applies data augmentation techniques to generate diverse, high-quality sketches. The FCED-CLIP model achieves Top-1 and Top-5 scores of 66.16% and 81.94%, respectively, demonstrating that the sketches it generates can effectively expand SBIR sketch datasets and thereby accelerate SBIR research.
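For readers unfamiliar with the reported metrics: Top-1 and Top-5 accuracy measure how often a classifier's single highest-scoring prediction, or one of its five highest-scoring predictions, matches the true category of a generated sketch. The following is a minimal, hypothetical sketch of that computation in NumPy; it is not taken from the thesis, and the function name, array shapes, and toy data are assumptions for illustration only.

# A minimal sketch (not from the thesis) of Top-k accuracy, assuming classifier
# logits for each generated sketch and integer ground-truth class labels.
import numpy as np

def top_k_accuracy(logits: np.ndarray, labels: np.ndarray, k: int) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    # Indices of the k largest logits per sample (unordered within the top k).
    top_k = np.argpartition(logits, -k, axis=1)[:, -k:]
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())

# Toy usage with hypothetical data: 4 sketches, 10 candidate classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10))
labels = np.array([3, 1, 7, 0])
print("Top-1:", top_k_accuracy(logits, labels, k=1))
print("Top-5:", top_k_accuracy(logits, labels, k=5))

This kind of evaluation is consistent with the thesis's table of contents, which lists a ResNet classifier and classification evaluation metrics (Sections 4.2 and 4.3).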

1. Introduction
1.1 Research Background and Motivation
1.2 Prior Work on Sketch Generation
1.3 Research Objectives
2. Literature Review
2.1 Convolutional Neural Networks (CNN)
2.2 Deconvolution
2.3 Autoencoders
2.4 AlexNet
2.5 VGGNet
2.6 ResNet
2.7 CLIP
2.8 AdaIN
2.9 FCED Architecture
2.9.1 Skip Connections
2.9.2 Conditional Input
2.9.3 Perceptual Loss
3. Research Method
3.1 Research Architecture
3.1.1 FCED
3.1.2 CLIP Loss
3.2 Dataset
3.3 Data Augmentation
3.4 Training Procedure
3.5 Pseudocode
3.6 Inference Procedure
4. Experimental Results
4.1 Experimental Procedure
4.2 ResNet Classifier
4.3 Classification Evaluation Metrics
4.4 Experimental Environment
4.5 Analysis of Experimental Results
4.6 Subjective Evaluation
5. Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
References

