Author: 陳彥名
Author (English): CHEN, YAN-MING
Title (Chinese): 流域應用於隱空間探索與控制完成混合風格生成
Title (English): Domain Flow for Mixture Style Generation on Latent Space Exploration and Control
Advisor: 江振國
Advisor (English): Chiang, Chen-Kuo
Committee Members: 江振國、朱威達、黃敬群、胡敏君
Committee Members (English): Chiang, Chen-Kuo; Chu, Wei-Ta; Huang, Ching-Chun; Hu, Min-Chun
Oral Defense Date: 2020-07-29
Degree: Master's
Institution: 國立中正大學 (National Chung Cheng University)
Department: 資訊工程研究所 (Graduate Institute of Computer Science and Information Engineering)
Discipline: Engineering
Field: Electrical and Computer Engineering
Document Type: Academic thesis
Publication Year: 2020
Graduation Academic Year: 108 (2019-2020)
Language: English
Pages: 41
Keywords (Chinese): 生成對抗網路、圖片表示方法分解、隱空間探索
Keywords (English): Generative Adversarial Networks; Image Representation Disentanglement; Latent Space Exploration
Abstract

The task of image-to-image translation has achieved significant progress. Existing methods can generate multi-domain images with a single generative network, or apply many-to-many learning to produce diverse images. However, it remains difficult to create a brand-new image when no target dataset is available as a reference. In this work, we propose a method that explores the latent space from a source domain to a target domain by domain flowing, and uses the intermediate latent features to generate mixture-domain images. The main advantage of our model is that it can continuously produce intermediate images from the style features of two images drawn from different domains. We introduce our framework and the learning procedure for image representation disentangling and for domain flowing in latent space exploration. The framework lets users effectively control the style translation of the output through example images.
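To make the idea concrete, the sketch below illustrates, under assumptions of our own rather than details taken from the thesis, how a disentangled content/style representation and a walk through the style latent space could yield intermediate mixed-style images. The module names (ContentEncoder, StyleEncoder, Decoder) and the plain linear interpolation of style codes are illustrative placeholders, not the thesis architecture.

```python
# Minimal, illustrative sketch (not the thesis implementation):
# two images from different domains are disentangled into content and
# style codes; the style codes are interpolated to synthesize intermediate,
# mixed-style images. All modules below are hypothetical placeholders.
import torch
import torch.nn as nn

class ContentEncoder(nn.Module):
    """Hypothetical encoder for a domain-invariant content feature map."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3, 64, 7, padding=3), nn.ReLU())
    def forward(self, x):
        return self.net(x)

class StyleEncoder(nn.Module):
    """Hypothetical encoder for a compact domain-specific style code."""
    def __init__(self, style_dim=8):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3, 64, 7, padding=3), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                 nn.Linear(64, style_dim))
    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    """Hypothetical decoder that recombines content and style into an image."""
    def __init__(self, style_dim=8):
        super().__init__()
        self.fc = nn.Linear(style_dim, 64)
        self.out = nn.Conv2d(64, 3, 7, padding=3)
    def forward(self, content, style):
        # Inject the style code as a per-channel bias (a crude stand-in
        # for AdaIN-style feature modulation).
        bias = self.fc(style).unsqueeze(-1).unsqueeze(-1)
        return torch.tanh(self.out(content + bias))

E_c, E_s, G = ContentEncoder(), StyleEncoder(), Decoder()
x_src = torch.randn(1, 3, 128, 128)   # source-domain image (random stand-in)
x_tgt = torch.randn(1, 3, 128, 128)   # target-domain image (random stand-in)

content = E_c(x_src)                        # content from the source image
s_src, s_tgt = E_s(x_src), E_s(x_tgt)       # style codes of the two domains

# "Domain flow": walk the style latent space from source to target and
# decode an intermediate mixed-style image at each step.
for alpha in torch.linspace(0.0, 1.0, steps=5):
    s_mix = (1 - alpha) * s_src + alpha * s_tgt
    x_mix = G(content, s_mix)
    print(round(alpha.item(), 2), tuple(x_mix.shape))
```

In the thesis itself the intermediate outputs are additionally constrained by a flowing adversarial loss and latent reconstruction losses (Sections 3.3.3 and 3.3.4 below); the plain linear blend above merely stands in for that learned domain-flow mechanism.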
Contents

1 Introduction
2 Related Work
  2.1 Generative Adversarial Networks
  2.2 Image-to-Image Translation
  2.3 Disentangled Representations
  2.4 Latent Space in Generative Models
3 Method
  3.1 Overview
  3.2 Image Representation Disentangling
    3.2.1 Architecture - Disentangling
    3.2.2 Domain Adversarial Loss
    3.2.3 Image Reconstruction Loss
    3.2.4 Latent Reconstruction Loss
    3.2.5 Total Loss
  3.3 Domain Flowing for Latent Space Exploration
    3.3.1 Problem Formulation
    3.3.2 Architecture - Domain Flowing
    3.3.3 Flowing Adversarial Loss
    3.3.4 Latent Reconstruction Loss
    3.3.5 Total Loss
4 Implementation
5 Experiments
  5.1 Datasets
  5.2 Evaluation Metrics
    5.2.1 Fréchet Inception Distance
    5.2.2 LPIPS Distance
  5.3 Results
    5.3.1 Quantitative Evaluation
    5.3.2 Qualitative Evaluation
6 Conclusion
