臺灣博碩士論文加值系統

Detailed Record

Author: 吳法川 (WU, FA-CHUAN)
Title (Chinese): 增強對於人工智慧生成圖像的偵測,透過色彩空間分析
Title (English): Color Space Analysis For Detecting Artificial Intelligence-Generated Images
Advisor: 阮文齡 (NGUYEN, VAN-LINH)
Committee: 郭建志 (KUO, JIAN-JHIH)、高宏宇 (KAO, HUNG-YU)、阮文齡 (NGUYEN, VAN-LINH)
Defense date: 2024-06-21
Degree: Master's
Institution: 國立中正大學 (National Chung Cheng University)
Department: 資訊工程研究所 (Institute of Computer Science and Information Engineering)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2024
Academic year of graduation: 112
Language: English
Pages: 48
Keywords (Chinese): 深度偽造、擴散模型、卷積神經網絡
Keywords (English): Deepfake; Diffusion model; CNN
Usage statistics:
  • Cited by: 0
  • Views: 18
  • Downloads: 3
  • Bookmarked: 0
With advances in technology, image-generation models have become so capable that the human eye can hardly distinguish generated images from real ones, and generated images are even used for criminal activity, so reliably determining whether an image is real has become an important problem for today's society. Common generative models range from autoencoders (AEs) and generative adversarial networks (GANs) to the latest diffusion models. Supervised detectors can now distinguish the kinds of generated images they were trained on quite well, but their ability to recognize images produced by other model architectures still needs improvement; since real-world applications offer no way to tell which model produced a given image, the generalizability of the detector is crucial.
In this work, we use models based on the YCrCb and HSV color spaces to raise confidence when classifying real images. We also test whether preprocessing that was previously effective against GAN-generated images still works for diffusion models, and explore additional filters and a dedicated threshold-adjustment method to improve the detector's generalization. We use the GenImage dataset, which contains images generated by multiple diffusion models. Our results show a 10% improvement on unseen subsets, and the method is portable to other base models.
In the era of ubiquitous Artificial Intelligence (AI), many applications are expected to be equipped with powerful AI capabilities. AI models used to generate images have become increasingly powerful, making it difficult for the human eye to distinguish generated images from real ones, and generated images are even being used for criminal activities. Moreover, much of the news and content on the Internet may itself be generated by AI, making fake news and disinformation difficult to manage. Effectively identifying whether an image is real has therefore become imperative for maintaining authenticity and trustworthiness on online platforms. Common generative models include autoencoders (AEs), generative adversarial networks (GANs), and the latest diffusion models (DMs). While supervised detectors can effectively distinguish the generated images they were trained on, new generative AI models keep improving the quality of artificial images, and since it is impossible in real-world applications to determine which model generated a given image, the generality of the detector is crucial. To address this problem, we utilize models based on the YCrCb and HSV color spaces to enhance confidence in classifying real images. Additionally, we test whether preprocessing methods that have been effective against GAN models remain effective for diffusion models, and we explore various filters and special threshold-adjustment methods to improve the generalization capability of the AI-generated image detector. Our data come from the GenImage dataset, which contains images generated by multiple diffusion models. Our results demonstrate a 10% improvement on unseen subsets and the portability of our method to other base models.
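To make the color-space idea concrete, the sketch below shows per-pixel RGB-to-YCrCb and RGB-to-HSV conversions (standard BT.601 and HSV formulas, which most image libraries use) plus a hypothetical late-fusion step that averages two detectors' "fake" scores against an adjustable threshold. This is an illustrative assumption of how such a pipeline might be wired, not the thesis's actual implementation; the function names and the averaging fusion rule are the author's of this sketch, not the dissertation's.

```python
def rgb_to_ycrcb(r, g, b):
    """Convert an 8-bit RGB pixel to YCrCb using the BT.601 coefficients
    (the same convention OpenCV and Pillow use, with a 128 chroma offset)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = (r - y) * 0.713 + 128
    cb = (b - y) * 0.564 + 128
    return y, cr, cb

def rgb_to_hsv(r, g, b):
    """Convert an 8-bit RGB pixel to HSV: h in [0, 360), s and v in [0, 1]."""
    r, g, b = r / 255.0, g / 255.0, b / 255.0
    mx, mn = max(r, g, b), min(r, g, b)
    d = mx - mn
    if d == 0:
        h = 0.0
    elif mx == r:
        h = (60 * ((g - b) / d)) % 360
    elif mx == g:
        h = 60 * ((b - r) / d) + 120
    else:
        h = 60 * ((r - g) / d) + 240
    s = 0.0 if mx == 0 else d / mx
    return h, s, mx

def fuse(score_ycrcb, score_hsv, threshold=0.5):
    """Hypothetical late fusion: average the two color-space detectors'
    'fake' scores and flag the image when the mean crosses the threshold.
    Moving the threshold trades false positives against false negatives."""
    return (score_ycrcb + score_hsv) / 2 >= threshold

# Pure red maps to hue 0 with full saturation and value.
print(rgb_to_hsv(255, 0, 0))
# A black pixel sits at the chroma midpoint: (0.0, 128.0, 128.0).
print(rgb_to_ycrcb(0, 0, 0))
# With an averaged score of 0.55 and threshold 0.6, the image passes as real.
print(fuse(0.8, 0.3, threshold=0.6))
```

Raising the threshold, as the abstract's threshold-adjustment phase suggests, biases the fused detector toward labeling borderline images as real, which is one way to boost confidence in real-image classification on unseen generators.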
中文摘要 . . . . . . . . . . . . . . . . . . . . . . i
Abstract . . . . . . . . . . . . . . . . . . . . . . ii
Contents . . . . . . . . . . . . . . . . . . . . . . iv
List of Tables . . . . . . . . . . . . . . . . . . . vi
List of Figures . . . . . . . . . . . . . . . . . . viii
1 Introduction . . . . . . . . . . . . . . . . . . . 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . 1
1.2 Background . . . . . . . . . . . . . . . . . . . 3
1.3 Related Work . . . . . . . . . . . . . . . . . . 5
1.4 Our Research Position and Contributions . . . . . 7
1.5 Organization . . . . . . . . . . . . . . . . . . 8
2 Dataset Details . . . . . . . . . . . . . . . . . . 9
3 Proposed Method . . . . . . . . . . . . . . . . . . 11
3.1 Problem Statement . . . . . . . . . . . . . . . . 12
3.2 Problem Formulation . . . . . . . . . . . . . . . 12
3.3 Combine Two Color Space Phase . . . . . . . . . . 13
3.4 Effective Filter Phase . . . . . . . . . . . . . 16
3.5 Effective Adjusting Threshold Phase . . . . . . . 17
4 Experiment & Evaluation . . . . . . . . . . . . . . 20
4.1 Evaluation Metrics . . . . . . . . . . . . . . . 20
4.2 The Generalizability of Diffusion Models . . . . 22
4.3 Comparing ResNet50 and CvT . . . . . . . . . . . 27
4.4 Evaluating Color Space . . . . . . . . . . . . . 28
4.5 Evaluating Filters . . . . . . . . . . . . . . . 29
4.6 Evaluating Threshold . . . . . . . . . . . . . . 36
4.7 Instance Demonstration . . . . . . . . . . . . . 40
5 Conclusions . . . . . . . . . . . . . . . . . . . . 43
References . . . . . . . . . . . . . . . . . . . . . 45