Author: 柯維然
Author (English): Ko, Wei-Jan
Thesis Title: 學習式先驗正規化自編碼器
Thesis Title (English): Learnable Prior Regularized Autoencoder
Advisor: 孫春在
Advisor (English): Sun, Chuen-Tsai
Oral Examination Committee: 孫春在、彭文孝、胡毓志
Oral Examination Committee (English): Sun, Chuen-Tsai; Peng, Wen-Hsiao; Hu, Yuh-Jyh
Oral Defense Date: 2018-07-13
Degree: Master's
Institution: 國立交通大學 (National Chiao Tung University)
Department: 資訊學院資訊學程 (Computer Science Program, College of Computer Science)
Discipline: Computer Science
Subfield: General Computer Science
Thesis Type: Academic thesis
Year of Publication: 2018
Graduation Academic Year: 106 (ROC calendar)
Language: Chinese
Number of Pages: 45
Keywords (Chinese): 深度學習、電腦視覺、深度生成模型
Keywords (English): deep learning, computer vision, generative adversarial networks
Abstract (translated from Chinese): Deep latent generative models usually require choosing a simple, tractable probability distribution as the prior over the latent variables, but recent studies have found that the choice of prior can affect the capability of the generative model. This thesis proposes a method for learning the latent prior distribution from data, so that no particular prior has to be specified by hand, and applies it to the Prior Regularized Autoencoder. A code generator network is introduced to learn the prior distribution, allowing it to better capture the characteristics of the data. Finally, a training framework is proposed that jointly trains the generative model and the prior distribution. Experiments show that the proposed method effectively improves the quality of generated samples and also performs well on representation learning and text-to-image translation tasks.
Abstract (English): Most deep latent factor models choose simple priors for simplicity, for tractability, or for lack of knowing what prior to use. Recent studies show that the choice of the prior may have a profound effect on the expressiveness of the model, especially when its generative network has limited capacity. In this paper, we propose to learn a proper prior from data for the Prior Regularized Autoencoder. We introduce the notion of code generators to transform manually selected simple priors into ones that can better characterize the data distribution. Experimental results show that the proposed model can generate images of better quality and learn better disentangled representations than AAEs in both supervised and unsupervised settings. Lastly, we present its ability to do cross-domain translation in a text-to-image synthesis task.
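The record contains no implementation details beyond the abstract, but the idea it describes (an adversarially regularized autoencoder whose prior is produced by a learnable code generator, with the generative model and the prior trained jointly in alternating stages) can be sketched in code. The PyTorch snippet below is a minimal illustration under assumed architectures, losses, and hyperparameters; every module, dimension, and weight is hypothetical, and the two-stage alternation is only one plausible reading of the procedure named in the table of contents, not the thesis implementation.

```python
# Minimal sketch (assumptions, not the thesis code): a prior-regularized
# autoencoder whose prior over latent codes comes from a learnable code
# generator, trained in two alternating stages.
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT_DIM, NOISE_DIM = 8, 8  # illustrative sizes

def mlp(sizes, final=None):
    """Small helper: a plain fully connected network."""
    layers = []
    for i in range(len(sizes) - 1):
        layers.append(nn.Linear(sizes[i], sizes[i + 1]))
        if i < len(sizes) - 2:
            layers.append(nn.ReLU())
    if final is not None:
        layers.append(final)
    return nn.Sequential(*layers)

encoder = mlp([784, 256, LATENT_DIM])                # x -> latent code
decoder = mlp([LATENT_DIM, 256, 784], nn.Sigmoid())  # latent code -> reconstruction
code_gen = mlp([NOISE_DIM, 64, LATENT_DIM])          # simple noise -> learned prior sample
code_disc = mlp([LATENT_DIM, 64, 1])                 # learned prior codes vs. encoder codes

opt_ae = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
opt_gen = torch.optim.Adam(code_gen.parameters(), lr=1e-3)
opt_disc = torch.optim.Adam(code_disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def train_step(x, reg_weight=0.1):
    batch = x.size(0)
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # Stage 1: prior-regularized autoencoder update.
    z_enc = encoder(x)
    recon_loss = F.mse_loss(decoder(z_enc), x)

    with torch.no_grad():  # samples from the learned prior
        z_prior = code_gen(torch.randn(batch, NOISE_DIM))
    # Critic separates learned-prior codes (label 1) from encoder codes (label 0).
    d_loss = bce(code_disc(z_prior), ones) + bce(code_disc(z_enc.detach()), zeros)
    opt_disc.zero_grad(); d_loss.backward(); opt_disc.step()

    # Encoder is pushed to make its codes look like samples from the learned prior.
    reg_loss = bce(code_disc(z_enc), ones)
    ae_loss = recon_loss + reg_weight * reg_loss
    opt_ae.zero_grad(); ae_loss.backward(); opt_ae.step()

    # Stage 2: code-generator update. The learned prior is pulled toward the
    # codes the (frozen) encoder actually produces, so the prior adapts to the
    # data instead of staying a fixed hand-picked distribution.
    z_prior = code_gen(torch.randn(batch, NOISE_DIM))
    gen_loss = bce(code_disc(z_prior), zeros)
    opt_gen.zero_grad(); gen_loss.backward(); opt_gen.step()

    return recon_loss.item(), reg_loss.item(), gen_loss.item()

# Hypothetical usage on a flattened 28x28 image batch:
# losses = train_step(torch.rand(64, 784))
```

The actual thesis may place the adversarial game in image space or use different losses; the sketch only mirrors the structure suggested by the abstract and by Section 3.2.3 (Two-Stage Alternating Training) below.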
Table of Contents:
Thesis Oral Examination Committee Certificate (Chinese)
Thesis Certificate
Thesis Copyright Authorization
Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1  Introduction
  1.1 Research Motivation
  1.2 Background
    1.2.1 Estimating Probability Distributions
    1.2.2 Prior Regularized Autoencoder
    1.2.3 Representation Learning
    1.2.4 Conditional Generation and Applications
  1.3 Research Questions
  1.4 Significance of the Research
Chapter 2  Literature Review
  2.1 Latent Variable Models
    2.1.1 Latent Variables
    2.1.2 Deep Latent Variable Models
    2.1.3 Intractable Posteriors
    2.1.4 Evidence Lower Bound
  2.2 Variational Autoencoder
  2.3 Generative Adversarial Network
  2.4 Adversarial Autoencoder
  2.5 Related Work: Summary and Discussion
    2.5.1 VAE with a VampPrior
    2.5.2 Non-parametric Variational Autoencoders
    2.5.3 Variational Lossy Autoencoder
Chapter 3  Methodology
  3.1 Model Architecture
    3.1.1 Code Generator
    3.1.2 Prior Regularized Autoencoder
  3.2 Training Procedure
    3.2.1 Training the Prior Regularized Autoencoder
    3.2.2 Training the Code Generator
    3.2.3 Two-Stage Alternating Training
  3.3 Conditional Training Framework
    3.3.1 Unsupervised Learning of the Conditional Code Network
    3.3.2 Supervised Learning
  3.4 Model Evaluation Methods
Chapter 4  Experimental Results and Discussion
  4.1 Experimental Environment and Setup
    4.1.1 Experimental Environment
    4.1.2 Training Datasets
    4.1.3 Model Architecture and Hyperparameter Settings
  4.2 Image Generation
  4.3 Disentangled Representations
    4.3.1 Unsupervised Learning
    4.3.2 Supervised Learning
  4.4 Text-to-Image Generation
Chapter 5  Conclusion
  5.1 Discussion of Results
  5.2 Future Work
References