Researcher: Jonathan Hans Soeseno
Thesis Title: Controllable and Identity-Aware Facial Attribute Transformation
Advisor: Kai-Lung Hua (花凱龍)
Committee Members: Conrado D. Ruiz, Jr., Kuo-Liang Chung (鍾國亮), Yu-Chi Lai (賴祐吉), Jing-Ming Guo (郭景明)
Oral Defense Date: 2018-12-18
Degree: Master's
University: National Taiwan University of Science and Technology (國立臺灣科技大學)
Department: Computer Science and Information Engineering
Discipline: Engineering
Academic Field: Electrical Engineering and Computer Science
Thesis Type: Academic thesis
Publication Year: 2018
Graduation Academic Year: 107
Language: English
Pages: 57
Keywords: Image-to-Image Translation, Deep Learning, Generative Adversarial Network, Identity-Aware, Controllable Transformation
Usage statistics:
  • Cited: 0
  • Views: 76
  • Downloads: 0
  • Bookmarked: 0
Modifying facial attributes without a paired dataset proves to be a challenging task. Previous approaches either require supervision from a ground-truth transformed image or require training a separate model for mapping every pair of attributes. These limit the scalability of the models to accommodate a larger set of attributes, since the number of models that must be trained grows exponentially. Another major drawback of previous approaches is unintentionally changing the identity of the person as they transform the facial attributes. We propose a method that allows for controllable and identity-aware transformations across multiple facial attributes using only a single model. Our approach is to train a generative adversarial network (GAN) with a multi-task conditional discriminator that recognizes the identity of the face, distinguishes real images from fake, and identifies the facial attributes present in an image. This guides the generator into producing an output that is realistic while preserving the person's identity and facial attributes. Through this framework, our model also learns meaningful image representations in a lower-dimensional latent space and semantically associates separate parts of the encoded vector with both the person's identity and facial attributes. This opens up the possibility of generating new faces and other dataset augmentation processes.
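The multi-task conditional discriminator described in the abstract produces three outputs from one shared network: a real/fake score, identity predictions, and per-attribute predictions. The forward pass of such a three-headed design can be sketched as follows; this is a minimal illustration in numpy, not the thesis's implementation, and all layer sizes, the single dense trunk, and the head dimensions (`n_ids`, `n_attrs`) are illustrative assumptions (the actual model would use convolutional layers on images).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: input feature length, shared feature width,
# number of identities, number of facial attributes.
in_dim, feat_dim, n_ids, n_attrs = 128, 64, 10, 5

# Shared trunk weights plus one weight matrix per task head.
W_trunk = rng.standard_normal((in_dim, feat_dim)) * 0.1
W_real  = rng.standard_normal((feat_dim, 1)) * 0.1        # real-vs-fake head
W_id    = rng.standard_normal((feat_dim, n_ids)) * 0.1    # identity head
W_attr  = rng.standard_normal((feat_dim, n_attrs)) * 0.1  # attribute head

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def discriminator(x):
    # Shared representation feeds all three tasks, so gradients from
    # identity and attribute losses also shape the real/fake features.
    h = np.maximum(x @ W_trunk, 0.0)        # ReLU trunk
    real_score = sigmoid(h @ W_real)        # probability the input is real
    id_logits  = h @ W_id                   # logits over known identities
    attr_prob  = sigmoid(h @ W_attr)        # independent per-attribute probs
    return real_score, id_logits, attr_prob

x = rng.standard_normal((2, in_dim))        # a batch of 2 encoded inputs
r, i, a = discriminator(x)
print(r.shape, i.shape, a.shape)            # (2, 1) (2, 10) (2, 5)
```

The key design point is the shared trunk: because one set of features must serve all three heads, the generator is pushed to produce outputs that are simultaneously realistic, identity-consistent, and attribute-correct, rather than satisfying only the adversarial loss.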
Recommendation Letter . . . . . . . . . . . . . . . . . . . . . . . . i
Approval Letter . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . iv
Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
List of Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2 Related Works . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.0.1 Problem Formulation . . . . . . . . . . . . . . . . 11
3.0.2 Network Architecture . . . . . . . . . . . . . . . . 12
3.0.3 Multi-task Discriminator . . . . . . . . . . . . . . 16
3.0.4 Generator . . . . . . . . . . . . . . . . . . . . . . 18
4 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . 23
4.0.1 Dataset . . . . . . . . . . . . . . . . . . . . . . . 23
4.0.2 Implementation Details . . . . . . . . . . . . . . . 24
4.0.3 Ablation Studies . . . . . . . . . . . . . . . . . . 25
4.0.4 Exploring the encoded space . . . . . . . . . . . . 28
4.0.5 Comparison to previous work . . . . . . . . . . . 32
5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42