National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

Author: 許書宇
Author (English): Shu-Yu Hsu
Title: 半監督對抗式生成網絡實現多場域影像轉譯
Title (English): SemiStarGAN: Semi-Supervised Generative Adversarial Networks for Multi-Domain Image-to-Image Translation
Advisor: 許永真
Oral Committee: 李宏毅, 張智星, 廖弘源, 徐宏民
Defense Date: 2018-06-06
Degree: Master
Institution: National Taiwan University
Department: Graduate Institute of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Document Type: Academic thesis
Year of Publication: 2018
Academic Year of Graduation: 106
Language: English
Pages: 52
Keywords: generative adversarial networks, semi-supervised learning, multi-domain image-to-image translation
Statistics:
  • Cited: 0
  • Views: 224
  • Rating:
  • Downloads: 0
  • Bookmarked: 0
Multi-domain image-to-image translation is the task of translating an image from one domain into multiple other domains. In recent years, many image translation studies have used generative adversarial networks (GANs) to learn the relations between domains from domain-labeled data and build complex generative models. However, the performance of such algorithms relies on large amounts of labeled data, so building such a model is costly and time-consuming.

To reduce this cost, this thesis proposes SemiStarGAN, which combines two semi-supervised learning techniques, self-ensembling and pseudo labeling, and introduces a new parameter-sharing scheme named the Y model, which partially shares parameters between the discriminator and the auxiliary classifier to improve the classifier's generalization and stability.
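
The pseudo-labeling half of that combination can be sketched roughly as follows; this is a minimal illustration, not the thesis's implementation, and the function name and threshold value are assumptions:

```python
def pseudo_labels(probabilities, threshold=0.95):
    """Assign a hard pseudo label to each unlabeled sample whose top
    class probability clears `threshold`; low-confidence samples are skipped."""
    labels = {}
    for idx, probs in enumerate(probabilities):
        top = max(range(len(probs)), key=probs.__getitem__)
        if probs[top] >= threshold:
            labels[idx] = top
    return labels

# Classifier outputs for three unlabeled images over three domains.
preds = [
    [0.97, 0.02, 0.01],  # confident -> pseudo-labeled as domain 0
    [0.40, 0.35, 0.25],  # uncertain -> left unlabeled
    [0.01, 0.03, 0.96],  # confident -> pseudo-labeled as domain 2
]
print(pseudo_labels(preds))  # {0: 0, 2: 2}
```

Only the confidently classified unlabeled images receive labels and join the supervised training signal; the rest are ignored for that epoch.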

This thesis designs facial-attribute translation experiments to compare the generation quality of StarGAN and SemiStarGAN under different amounts of labeled data. The results confirm that the proposed method needs considerably less labeled data to match StarGAN's translation quality.
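
The Y model's partial parameter sharing can be sketched structurally as below; this is a toy illustration with stand-in arithmetic in place of convolutional layers, and all names are hypothetical rather than taken from the thesis code:

```python
def trunk(x):
    """Stand-in for the shared lower layers whose parameters both
    branches reuse (a fixed doubling here instead of convolutions)."""
    return [v * 2 for v in x]

def discriminator_head(feat):
    """Branch producing a single real/fake score from the shared features."""
    return sum(feat)

def classifier_head(feat):
    """Branch producing per-domain scores from the same shared features."""
    return [v + 1 for v in feat]

def y_model(x):
    feat = trunk(x)  # computed once, consumed by both heads
    return discriminator_head(feat), classifier_head(feat)

print(y_model([1, 2, 3]))  # (12, [3, 5, 7])
```

The point of the Y shape is that gradients from both the adversarial task and the domain-classification task update the shared trunk, which is what the abstract credits for the auxiliary classifier's improved generalization and stability.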
Recent studies have shown significant advances in multi-domain image-to-image translation, and generative adversarial networks (GANs) are widely used to address this problem. However, existing methods require a large number of domain-labeled images to train an effective image generator, and collecting that much labeled data takes time and effort in real-world problems. In this thesis, we propose SemiStarGAN, a semi-supervised GAN to tackle this issue. The proposed method exploits unlabeled images by incorporating a novel discriminator/classifier network architecture, the Y model, together with two existing semi-supervised learning techniques: pseudo labeling and self-ensembling. Experimental results on the CelebA dataset using domains of facial attributes show that the proposed method achieves performance comparable to state-of-the-art methods using considerably fewer labeled training images.
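
The self-ensembling technique mentioned above can be sketched as a temporal-ensembling-style consistency term: per-sample predictions are averaged over epochs with an exponential moving average, and the current prediction is pulled toward that running target. This is a minimal sketch under assumed names; the actual loss weighting in the thesis may differ:

```python
def ema_update(ensemble, current, alpha=0.6):
    """Exponential moving average of per-sample predictions across epochs,
    as in temporal ensembling; `alpha` weights the accumulated history."""
    return [alpha * e + (1 - alpha) * c for e, c in zip(ensemble, current)]

def consistency_loss(pred, target):
    """Mean squared error pulling current predictions toward EMA targets."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

targets = [0.0, 0.0, 0.0]         # running ensemble, initially zero
epoch_pred = [1.0, 0.5, 0.0]      # this epoch's predictions on unlabeled data
targets = ema_update(targets, epoch_pred, alpha=0.5)
print(targets)  # [0.5, 0.25, 0.0]
print(consistency_loss(epoch_pred, targets))
```

Because the consistency target needs no ground-truth label, this term lets unlabeled images contribute to training the auxiliary classifier.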
Acknowledgements iii
Chinese Abstract v
Abstract vii
1 Introduction 1
1.1 Background and Motivation 1
1.2 Research Objective 2
1.3 Thesis Organization 3
2 Literature Review 5
2.1 Generative Adversarial Network 5
2.2 Image-to-Image Translation 6
2.2.1 GAN Based Paired Image-to-Image Translation 7
2.2.2 GAN Based Unpaired Image-to-Image Translation 7
2.3 Semi-Supervised Learning 8
2.3.1 Designing Distinctive Network Architectures 9
2.3.2 Regularization and Data Augmentation Based Approaches 9
2.3.3 Semi-Supervised and Generative Adversarial Network 10
3 Semi-Supervised Multi-Domain Image-to-Image Translation 13
3.1 Problem Definition 13
3.2 Symbol Table 14
3.3 Proposed Method 16
3.3.1 GAN Objective 16
3.3.2 Domain Classification Loss and Self-Ensembling 17
3.3.3 Cycle Consistency and Pseudo Cycle Consistency Loss 18
3.3.4 Y Model: Splitting Classifier and Discriminator 19
3.3.5 Full Objective 21
3.3.6 Network Architecture and Implementation 22
4 Experiments 25
4.1 Experimental Setup 25
4.1.1 Data Sets 25
4.1.2 Evaluation Metrics 27
4.2 Training Detail 28
4.3 Experimental Results 29
4.3.1 Experiment on Three Domains of Hair Colors 29
4.3.2 Experiment on 12 Domains of Hair Colors, Age, and Gender 35
4.4 The Effectiveness of Network Architecture 42
4.4.1 The Effectiveness of the Y Model 42
4.4.2 The Architecture of the Discriminator 43
5 Conclusion 47
5.1 Summary and Contribution 47
5.2 Restrictions 48
5.3 Future Studies 48
Bibliography 49