Taiwan National Digital Library of Theses and Dissertations (臺灣博碩士論文加值系統)

Detailed Record

Author: 韓磊
Author (English): Han, Lei
Title: 多尺度型生成對抗網路於視頻解模糊
Title (English): Multi-scale GAN for Video Deblurring
Advisor: 賴尚宏
Advisor (English): Lai, Shang-Hong
Committee members: 邱瀞德、許秋婷
Committee members (English): Chiu, Ching-Te; Hsu, Chiu-Ting
Oral defense date: 2018-08-24
Degree: Master's
Institution: National Tsing Hua University (國立清華大學)
Department: Computer Science (資訊工程學系所)
Discipline: Engineering
Academic field: Electrical and Computer Engineering
Thesis type: Academic thesis
Publication year: 2018
Graduation academic year: 107 (ROC calendar)
Language: English
Pages: 37
Keywords (Chinese): 生成對抗網路 (generative adversarial network)、解模糊 (deblurring)
Keywords (English): GAN; deblur
Statistics:
  • Cited by: 0
  • Views: 225
  • Rating: (no ratings)
  • Downloads: 0
  • Bookmarked: 1
Abstract (translated from Chinese): With the rapid development of deep learning, more and more vision problems are being solved by deep models, and several deep models have already been proposed for the video deblurring task. At the same time, generative adversarial networks have been widely applied in many scenarios because of their power. In this thesis, we combine the classical multi-scale structure of traditional methods with a generative adversarial network to address video deblurring. The experimental results show that our model is more stable than several other deep-model algorithms, both in quantitative metrics and in visual quality. Finally, we also analyze the advantages of our model from different perspectives.
Abstract (English): With the rapid development of deep learning, more and more vision problems can be solved with deep neural network models, and several deep models have been proposed for the video deblurring task. At the same time, generative adversarial networks (GANs) are widely used across many kinds of problems because of their strength. In this thesis, we combine the classical multi-scale structure of traditional vision methods with a GAN for video deblurring. The quantitative and qualitative results of our experiments demonstrate that the proposed model restores frames more robustly than state-of-the-art deep video deblurring methods. We also justify the superiority of our model from different perspectives.
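The "classical multi-scale structure" the abstract refers to is the coarse-to-fine scheme used by traditional deblurring methods: restore the image at a low resolution first, then use that estimate to guide restoration at progressively finer scales. A minimal NumPy sketch of this control flow is below; the `restore` step is a placeholder standing in for the learned generator sub-network, and all function names are illustrative, not taken from the thesis.

```python
import numpy as np

def downsample(img):
    """Halve resolution by 2x2 average pooling (a simple pyramid stand-in)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    x = img[:h, :w]
    return (x[0::2, 0::2] + x[1::2, 0::2] + x[0::2, 1::2] + x[1::2, 1::2]) / 4.0

def upsample(img, shape):
    """Nearest-neighbour upsampling of a 2-D array to a target (h, w) shape."""
    rows = np.linspace(0, img.shape[0] - 1, shape[0]).round().astype(int)
    cols = np.linspace(0, img.shape[1] - 1, shape[1]).round().astype(int)
    return img[np.ix_(rows, cols)]

def restore(blurry, prev_estimate):
    """Placeholder for the per-scale restoration step; in a learned model this
    would be a generator network conditioned on the coarser estimate."""
    return 0.5 * (blurry + prev_estimate)

def multiscale_deblur(blurry, num_scales=3):
    """Coarse-to-fine pipeline: build a pyramid of the blurry frame, restore
    the coarsest level first, then upsample each estimate to guide the next
    finer level until the original resolution is reached."""
    pyramid = [blurry]
    for _ in range(num_scales - 1):
        pyramid.append(downsample(pyramid[-1]))
    estimate = pyramid[-1]                      # start at the coarsest scale
    for level in reversed(pyramid[:-1]):        # refine from coarse to fine
        estimate = restore(level, upsample(estimate, level.shape))
    return estimate
```

The design point this scheme illustrates is that large blur kernels shrink with the image: at the coarsest scale the blur is small enough to remove easily, and each finer scale only has to correct the residual left by the upsampled coarse estimate.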
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Problem Description . . . . . . . . . . . . . . . . . . . . 2
1.3 Main Contributions . . . . . . . . . . . . . . . . . . . . . 3
2 Related works . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1 Image/Video Deblurring . . . . . . . . . . . . . . . . . . . 5
2.2 Generative Adversarial Networks .. . . . . . . . . . . . . . 6
3 Proposed Model . . . . . . . . . . . . . . . . . . . . . . . . 9
3.1 Network Architecture . . . . . . . . . . . . . . . . . . . . 9
3.1.1 Generator . . . . . . . . . . . . . . . . . . . . . . . . 10
3.1.2 Discriminator . . . . . . . . . . . . . . . . . . . . . . 12
3.2 Loss function . . . . . . . . . . . . . . . . . . . . . . . 14
3.3 Training details . . . . . . . . . . . . . . . . . . . . . 16
4 Experiments . . . . . . . . . . . . . . . . . . . . . . . . 17
4.1 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . 17
4.2 Baselines . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.3 Comparison . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.4 Ablation Study . . . . . . . . . . . . . . . . . . . . . . 24
4.4.1 Loss Function and Hyperparameters . . . . . . . . . . . . 24
4.4.2 Network structure . . . . . . . . . . . . . . . . . . . . 26
5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 34
References . . . . . . . . . . . . . . . . . . . . . . . . . . 35