Author: 柯維然
Author (English): Ko, Wei-Jan
Thesis Title: 學習式先驗正規化自編碼器
Thesis Title (English): Learnable Prior Regularized Autoencoder
Advisor: 孫春在
Advisor (English): Sun, Chuen-Tsai
Oral Examination Committee: 孫春在、彭文孝、胡毓志
Oral Examination Committee (English): Sun, Chuen-Tsai; Peng, Wen-Hsiao; Hu, Yuh-Jyh
Oral Defense Date: 2018-07-13
Degree: Master's
Institution: 國立交通大學 (National Chiao Tung University)
Department: 資訊學院資訊學程 (Computer Science Program, College of Computer Science)
Discipline: Computer Science
Subfield: General Computer Science
Thesis Type: Academic thesis
Year of Publication: 2018
Graduation Academic Year: 106 (ROC calendar)
Language: Chinese
Number of Pages: 45
Keywords (Chinese): 深度學習、電腦視覺、深度生成模型
Keywords (English): deep learning, computer vision, generative adversarial networks
Abstract (translated from Chinese): Deep latent generative models usually require choosing a simple, tractable probability distribution as the prior over the latent variables, but recent studies have found that the choice of prior can affect the capability of the generative model. This thesis proposes a method for learning the latent prior distribution from data, so that no particular prior has to be specified by hand, and applies it to the Prior Regularized Autoencoder. A code generator network is introduced to learn the prior distribution, allowing it to better capture the characteristics of the data. Finally, a training framework is proposed that jointly trains the generative model and the prior distribution. Experiments show that the proposed method effectively improves the quality of generated samples and also performs well on representation learning and text-to-image translation tasks.
Abstract (English): Most deep latent factor models choose simple priors for simplicity, for tractability, or for lack of knowing what prior to use. Recent studies show that the choice of the prior may have a profound effect on the expressiveness of the model, especially when its generative network has limited capacity. In this paper, we propose to learn a proper prior from data for the Prior Regularized Autoencoder. We introduce the notion of code generators to transform manually selected simple priors into ones that can better characterize the data distribution. Experimental results show that the proposed model can generate images of better quality and learn better disentangled representations than AAEs in both supervised and unsupervised settings. Lastly, we present its ability to do cross-domain translation in a text-to-image synthesis task.
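The record contains no implementation details beyond the abstract, but the idea it describes (an adversarially regularized autoencoder whose prior is produced by a learnable code generator, with the generative model and the prior trained jointly in alternating stages) can be sketched in code. The PyTorch snippet below is a minimal illustration under assumed architectures, losses, and hyperparameters; every module, dimension, and weight is hypothetical, and the two-stage alternation is only one plausible reading of the procedure named in the table of contents, not the thesis implementation.

```python
# Minimal sketch (assumptions, not the thesis code): a prior-regularized
# autoencoder whose prior over latent codes comes from a learnable code
# generator, trained in two alternating stages.
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT_DIM, NOISE_DIM = 8, 8  # illustrative sizes

def mlp(sizes, final=None):
    """Small helper: a plain fully connected network."""
    layers = []
    for i in range(len(sizes) - 1):
        layers.append(nn.Linear(sizes[i], sizes[i + 1]))
        if i < len(sizes) - 2:
            layers.append(nn.ReLU())
    if final is not None:
        layers.append(final)
    return nn.Sequential(*layers)

encoder = mlp([784, 256, LATENT_DIM])                # x -> latent code
decoder = mlp([LATENT_DIM, 256, 784], nn.Sigmoid())  # latent code -> reconstruction
code_gen = mlp([NOISE_DIM, 64, LATENT_DIM])          # simple noise -> learned prior sample
code_disc = mlp([LATENT_DIM, 64, 1])                 # learned prior codes vs. encoder codes

opt_ae = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
opt_gen = torch.optim.Adam(code_gen.parameters(), lr=1e-3)
opt_disc = torch.optim.Adam(code_disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def train_step(x, reg_weight=0.1):
    batch = x.size(0)
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # Stage 1: prior-regularized autoencoder update.
    z_enc = encoder(x)
    recon_loss = F.mse_loss(decoder(z_enc), x)

    with torch.no_grad():  # samples from the learned prior
        z_prior = code_gen(torch.randn(batch, NOISE_DIM))
    # Critic separates learned-prior codes (label 1) from encoder codes (label 0).
    d_loss = bce(code_disc(z_prior), ones) + bce(code_disc(z_enc.detach()), zeros)
    opt_disc.zero_grad(); d_loss.backward(); opt_disc.step()

    # Encoder is pushed to make its codes look like samples from the learned prior.
    reg_loss = bce(code_disc(z_enc), ones)
    ae_loss = recon_loss + reg_weight * reg_loss
    opt_ae.zero_grad(); ae_loss.backward(); opt_ae.step()

    # Stage 2: code-generator update. The learned prior is pulled toward the
    # codes the (frozen) encoder actually produces, so the prior adapts to the
    # data instead of staying a fixed hand-picked distribution.
    z_prior = code_gen(torch.randn(batch, NOISE_DIM))
    gen_loss = bce(code_disc(z_prior), zeros)
    opt_gen.zero_grad(); gen_loss.backward(); opt_gen.step()

    return recon_loss.item(), reg_loss.item(), gen_loss.item()

# Hypothetical usage on a flattened 28x28 image batch:
# losses = train_step(torch.rand(64, 784))
```

The actual thesis may place the adversarial game in image space or use different losses; the sketch only mirrors the structure suggested by the abstract and by Section 3.2.3 (Two-Stage Alternating Training) below.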
Table of Contents:
Thesis Oral Examination Committee Certificate (Chinese)
Thesis Certificate
Thesis Copyright Authorization
Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1  Introduction
  1.1 Research Motivation
  1.2 Background
    1.2.1 Estimating Probability Distributions
    1.2.2 Prior Regularized Autoencoder
    1.2.3 Representation Learning
    1.2.4 Conditional Generation and Applications
  1.3 Research Questions
  1.4 Significance of the Research
Chapter 2  Literature Review
  2.1 Latent Variable Models
    2.1.1 Latent Variables
    2.1.2 Deep Latent Variable Models
    2.1.3 Intractable Posteriors
    2.1.4 Evidence Lower Bound
  2.2 Variational Autoencoder
  2.3 Generative Adversarial Network
  2.4 Adversarial Autoencoder
  2.5 Related Work: Summary and Discussion
    2.5.1 VAE with a VampPrior
    2.5.2 Non-parametric Variational Autoencoders
    2.5.3 Variational Lossy Autoencoder
Chapter 3  Methodology
  3.1 Model Architecture
    3.1.1 Code Generator
    3.1.2 Prior Regularized Autoencoder
  3.2 Training Procedure
    3.2.1 Training the Prior Regularized Autoencoder
    3.2.2 Training the Code Generator
    3.2.3 Two-Stage Alternating Training
  3.3 Conditional Training Framework
    3.3.1 Unsupervised Learning of the Conditional Code Network
    3.3.2 Supervised Learning
  3.4 Model Evaluation Methods
Chapter 4  Experimental Results and Discussion
  4.1 Experimental Environment and Setup
    4.1.1 Experimental Environment
    4.1.2 Training Datasets
    4.1.3 Model Architecture and Hyperparameter Settings
  4.2 Image Generation
  4.3 Disentangled Representations
    4.3.1 Unsupervised Learning
    4.3.2 Supervised Learning
  4.4 Text-to-Image Generation
Chapter 5  Conclusion
  5.1 Discussion of Results
  5.2 Future Work
References