National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

Author: 許書宇
Author (English): Shu-Yu Hsu
Title: 半監督對抗式生成網絡實現多場域影像轉譯
Title (English): SemiStarGAN: Semi-Supervised Generative Adversarial Networks for Multi-Domain Image-to-Image Translation
Advisor: 許永真
Oral Committee: 李宏毅, 張智星, 廖弘源, 徐宏民
Defense Date: 2018-06-06
Degree: Master
Institution: National Taiwan University
Department: Graduate Institute of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Document Type: Academic thesis
Year of Publication: 2018
Academic Year of Graduation: 106
Language: English
Pages: 52
Keywords: generative adversarial networks, semi-supervised learning, multi-domain image-to-image translation
Statistics:
  • Cited: 0
  • Views: 224
  • Rating:
  • Downloads: 0
  • Bookmarked: 0
Multi-domain image-to-image translation is the task of translating an image from one domain into multiple other domains. In recent years, many image translation studies have used generative adversarial networks (GANs) to learn the relations between domains from domain-labeled data and build complex generative models. However, the performance of such algorithms relies on large amounts of labeled data, so building such a model is costly and time-consuming.

To reduce this cost, this thesis proposes SemiStarGAN, which combines two semi-supervised learning techniques, self-ensembling and pseudo labeling, and introduces a new parameter-sharing scheme named the Y model, which partially shares parameters between the discriminator and the auxiliary classifier to improve the classifier's generalization and stability.
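
The pseudo-labeling half of that combination can be sketched roughly as follows; this is a minimal illustration, not the thesis's implementation, and the function name and threshold value are assumptions:

```python
def pseudo_labels(probabilities, threshold=0.95):
    """Assign a hard pseudo label to each unlabeled sample whose top
    class probability clears `threshold`; low-confidence samples are skipped."""
    labels = {}
    for idx, probs in enumerate(probabilities):
        top = max(range(len(probs)), key=probs.__getitem__)
        if probs[top] >= threshold:
            labels[idx] = top
    return labels

# Classifier outputs for three unlabeled images over three domains.
preds = [
    [0.97, 0.02, 0.01],  # confident -> pseudo-labeled as domain 0
    [0.40, 0.35, 0.25],  # uncertain -> left unlabeled
    [0.01, 0.03, 0.96],  # confident -> pseudo-labeled as domain 2
]
print(pseudo_labels(preds))  # {0: 0, 2: 2}
```

Only the confidently classified unlabeled images receive labels and join the supervised training signal; the rest are ignored for that epoch.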

This thesis designs facial-attribute translation experiments to compare the generation quality of StarGAN and SemiStarGAN under different amounts of labeled data. The results confirm that the proposed method needs considerably less labeled data to match StarGAN's translation quality.
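
The Y model's partial parameter sharing can be sketched structurally as below; this is a toy illustration with stand-in arithmetic in place of convolutional layers, and all names are hypothetical rather than taken from the thesis code:

```python
def trunk(x):
    """Stand-in for the shared lower layers whose parameters both
    branches reuse (a fixed doubling here instead of convolutions)."""
    return [v * 2 for v in x]

def discriminator_head(feat):
    """Branch producing a single real/fake score from the shared features."""
    return sum(feat)

def classifier_head(feat):
    """Branch producing per-domain scores from the same shared features."""
    return [v + 1 for v in feat]

def y_model(x):
    feat = trunk(x)  # computed once, consumed by both heads
    return discriminator_head(feat), classifier_head(feat)

print(y_model([1, 2, 3]))  # (12, [3, 5, 7])
```

The point of the Y shape is that gradients from both the adversarial task and the domain-classification task update the shared trunk, which is what the abstract credits for the auxiliary classifier's improved generalization and stability.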
Recent studies have shown significant advances in multi-domain image-to-image translation, and generative adversarial networks (GANs) are widely used to address this problem. However, existing methods require a large number of domain-labeled images to train an effective image generator, and collecting that much labeled data takes time and effort in real-world problems. In this thesis, we propose SemiStarGAN, a semi-supervised GAN to tackle this issue. The proposed method exploits unlabeled images by incorporating a novel discriminator/classifier network architecture, the Y model, together with two existing semi-supervised learning techniques: pseudo labeling and self-ensembling. Experimental results on the CelebA dataset using domains of facial attributes show that the proposed method achieves performance comparable to state-of-the-art methods using considerably fewer labeled training images.
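
The self-ensembling technique mentioned above can be sketched as a temporal-ensembling-style consistency term: per-sample predictions are averaged over epochs with an exponential moving average, and the current prediction is pulled toward that running target. This is a minimal sketch under assumed names; the actual loss weighting in the thesis may differ:

```python
def ema_update(ensemble, current, alpha=0.6):
    """Exponential moving average of per-sample predictions across epochs,
    as in temporal ensembling; `alpha` weights the accumulated history."""
    return [alpha * e + (1 - alpha) * c for e, c in zip(ensemble, current)]

def consistency_loss(pred, target):
    """Mean squared error pulling current predictions toward EMA targets."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

targets = [0.0, 0.0, 0.0]         # running ensemble, initially zero
epoch_pred = [1.0, 0.5, 0.0]      # this epoch's predictions on unlabeled data
targets = ema_update(targets, epoch_pred, alpha=0.5)
print(targets)  # [0.5, 0.25, 0.0]
print(consistency_loss(epoch_pred, targets))
```

Because the consistency target needs no ground-truth label, this term lets unlabeled images contribute to training the auxiliary classifier.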
Acknowledgements iii
Chinese Abstract v
Abstract vii
1 Introduction 1
1.1 Background and Motivation 1
1.2 Research Objective 2
1.3 Thesis Organization 3
2 Literature Review 5
2.1 Generative Adversarial Network 5
2.2 Image-to-Image Translation 6
2.2.1 GAN Based Paired Image-to-Image Translation 7
2.2.2 GAN Based Unpaired Image-to-Image Translation 7
2.3 Semi-Supervised Learning 8
2.3.1 Designing Distinctive Network Architectures 9
2.3.2 Regularization and Data Augmentation Based Approaches 9
2.3.3 Semi-Supervised and Generative Adversarial Network 10
3 Semi-Supervised Multi-Domain Image-to-Image Translation 13
3.1 Problem Definition 13
3.2 Symbol Table 14
3.3 Proposed Method 16
3.3.1 GAN Objective 16
3.3.2 Domain Classification Loss and Self-Ensembling 17
3.3.3 Cycle Consistency and Pseudo Cycle Consistency Loss 18
3.3.4 Y Model: Splitting Classifier and Discriminator 19
3.3.5 Full Objective 21
3.3.6 Network Architecture and Implementation 22
4 Experiments 25
4.1 Experimental Setup 25
4.1.1 Data Sets 25
4.1.2 Evaluation Metrics 27
4.2 Training Detail 28
4.3 Experimental Results 29
4.3.1 Experiment on Three Domains of Hair Colors 29
4.3.2 Experiment on 12 Domains of Hair Colors, Age, and Gender 35
4.4 The Effectiveness of Network Architecture 42
4.4.1 The Effectiveness of the Y Model 42
4.4.2 The Architecture of the Discriminator 43
5 Conclusion 47
5.1 Summary and Contribution 47
5.2 Restrictions 48
5.3 Future Studies 48
Bibliography 49