Author: 吳登鈞
Author (English): Deng-Jyun Wu
Title: 神經網絡風格轉換演算法分析及人像風格應用
Title (English): Neural style transfer algorithm analysis and portrait style application
Advisor: 貝蘇章
Committee members: 鍾國亮, 黃文良, 丁建均, 曾建誠
Oral defense date: 2019-05-21
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Communication Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Document type: Academic thesis
Publication year: 2019
Graduation academic year: 107
Language: English
Pages: 69
Keywords (Chinese): 風格轉換 (style transfer), 影像特徵 (image feature), 頭像轉換 (portrait conversion)
DOI: 10.6342/NTU201900805
This thesis compares neural style transfer algorithms. Feature extraction with a convolutional neural network captures both the style and the structural information of an image, and an output image can be reconstructed from those features. VGG-Net extracts rich features from images, and style transfer is accomplished by manipulating these features; varying the parameters produces different stylistic effects. However, the main problem we identified is distortion of facial detail in portraits, which greatly limits the method's applicability. By processing different regions of the image separately, we resolve the facial distortion that style transfer tends to produce.

The first part of the thesis discusses the creative idea proposed by Gatys in 2015: a style transfer algorithm that reconstructs an image from the features of different convolutional layers of VGG-Net, with style represented by a Gram matrix. The algorithm achieves good results and shows how the image features observed at different network depths differ; features from different layers yield different transfer effects. This demonstrates that convolutional networks capture rich features, and it opened a new direction for style transfer; however, its computational cost limits its application, so many algorithms have since been proposed to improve execution speed.
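As a concrete illustration of the Gram-matrix formulation described above, the following is a minimal PyTorch sketch of the content and style losses, not the thesis's implementation: the layer indices, the alpha/beta weighting, and the use of torchvision's VGG-19 are illustrative assumptions.

import torch
import torch.nn.functional as F
from torchvision import models

# Pretrained VGG-19 feature extractor; frozen, used only to observe features.
vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).features.eval()
for p in vgg.parameters():
    p.requires_grad_(False)

# Illustrative choices: conv4_2 for content, conv1_1..conv5_1 for style.
CONTENT_LAYERS = {21}
STYLE_LAYERS = {0, 5, 10, 19, 28}

def extract_features(x):
    """Run x through VGG-19, collecting activations of the chosen layers."""
    feats = {}
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in CONTENT_LAYERS or i in STYLE_LAYERS:
            feats[i] = x
    return feats

def gram_matrix(feat):
    """Channel-by-channel feature correlations: the style representation."""
    b, c, h, w = feat.shape
    f = feat.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def style_transfer_loss(gen, content_img, style_img, alpha=1.0, beta=1e3):
    gen_f = extract_features(gen)
    content_f = extract_features(content_img)
    style_f = extract_features(style_img)
    content_loss = sum(F.mse_loss(gen_f[i], content_f[i]) for i in CONTENT_LAYERS)
    style_loss = sum(F.mse_loss(gram_matrix(gen_f[i]), gram_matrix(style_f[i]))
                     for i in STYLE_LAYERS)
    return alpha * content_loss + beta * style_loss

Optimizing the generated image's pixels by gradient descent on this loss reproduces Gatys-style transfer; alpha and beta trade content fidelity against stylization strength, which is also why the method is slow, since every output requires an iterative optimization.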
The second part of the thesis covers an encoder-decoder design with Whitening & Coloring transforms and feature mapping, again reconstructing images from the features of different convolutional layers of VGG-Net to obtain different transfer effects. It improves the efficiency of style transfer while making arbitrary-style transfer practical.

The third part addresses the weakness we found: portrait style transfer is prone to distortion. We therefore segment the image so that the neural network can capture the features of the salient regions more accurately, making style transfer easier to use on everyday devices. Our proposed algorithm builds on the existing universal style transfer via feature transforms: the image is pre-processed and segmented, each region is stylized separately, and the stylized regions are recomposed, so that the style transfer algorithm can also be applied to portrait photographs.
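A minimal sketch of the portrait pipeline described above, assuming OpenCV's Viola-Jones face detector and treating the stylization backend (e.g. the universal style transfer of [6]) as a black-box stylize() function; the box margin and the hard paste are illustrative assumptions, not the thesis's exact procedure.

import cv2

def stylize(region_bgr, style_bgr):
    """Placeholder for an arbitrary-style transfer backend, e.g. WCT [6]."""
    raise NotImplementedError

def portrait_style_transfer(content_bgr, style_bgr, margin=0.2):
    """Detect faces, stylize face regions and background separately, recompose."""
    gray = cv2.cvtColor(content_bgr, cv2.COLOR_BGR2GRAY)
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    # Stylize the whole frame once for the background.
    out = stylize(content_bgr, style_bgr)

    h, w = content_bgr.shape[:2]
    for (x, y, fw, fh) in faces:
        # Expand the detected box a little so hair and chin are included.
        dx, dy = int(fw * margin), int(fh * margin)
        x0, y0 = max(0, x - dx), max(0, y - dy)
        x1, y1 = min(w, x + fw + dx), min(h, y + fh + dy)
        # Stylize the face crop on its own so facial detail is preserved,
        # then paste it back over the stylized background.
        out[y0:y1, x0:x1] = stylize(content_bgr[y0:y1, x0:x1], style_bgr)
    return out

Blending the pasted crop back with a feathered mask would hide the seams; the hard paste here keeps the sketch short.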
Oral Examination Committee Certification
Acknowledgements
Chinese Abstract
ABSTRACT
CONTENTS
LIST OF FIGURES
LIST OF TABLES
Chapter 1 Introduction
Chapter 2 Neural style transfer
2.1 Introduction
2.2 Convolutional neural network
2.2.1 Convolution layer
2.2.2 Pooling layer
2.2.3 Flatten layer
2.3 VGG neural network
2.4 Deep feature representations
2.4.1 Content Loss Representation
2.4.2 Style Loss Representation
2.5 Neural style transfer result
2.5.1 Single-layer transfer
2.5.2 Multi-layer transfer
2.6 Advantages and disadvantages
Chapter 3 Fast universal style transfer
3.1 Introduction
3.2 Encoder-Decoder Architecture
3.3 Decoder building
3.4 Whitening & Coloring Transforms
3.4.1 Whitening step
3.4.2 Coloring step
3.5 Style transfer
3.5.1 Single-layer transfer
3.5.2 Multi-layer transfer
3.6 Advantages and disadvantages
Chapter 4 Portrait style transfer
4.1 Introduction
4.1.1 Neural style transfer algorithm
4.1.2 Universal style encoder-decoder transfer
4.2 Portrait style transfer pipeline
4.2.1 Stylized face distortion
4.3 Face detection
4.4 Portrait style transfer
Chapter 5 Conclusion and future application
5.1 Future application
REFERENCES
[1] A. Hertzmann, "Painterly rendering with curved brush strokes of multiple sizes," in Proceedings of the 25th Annual Conference on Computer Graphics and Interactive Techniques, 1998, pp. 453-460.
[2] B. Gooch and A. Gooch, Non-Photorealistic Rendering. AK Peters/CRC Press, 2001.
[3] L. A. Gatys, A. S. Ecker, and M. Bethge, "Image style transfer using convolutional neural networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2414-2423.
[4] Y. LeCun and Y. Bengio, "Convolutional networks for images, speech, and time series," The Handbook of Brain Theory and Neural Networks, vol. 3361, no. 10, 1995.
[5] J. Johnson, A. Alahi, and L. Fei-Fei, "Perceptual losses for real-time style transfer and super-resolution," in European Conference on Computer Vision, 2016, pp. 694-711.
[6] Y. Li, C. Fang, J. Yang, Z. Wang, X. Lu, and M.-H. Yang, "Universal style transfer via feature transforms," in Advances in Neural Information Processing Systems, 2017, pp. 386-396.
[7] A. A. Efros and W. T. Freeman, "Image quilting for texture synthesis and transfer," in Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, 2001, pp. 341-346.
[8] M. Ashikhmin, "Fast texture transfer," IEEE Computer Graphics and Applications, vol. 23, no. 4, pp. 38-43, 2003.
[9] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems, 2012, pp. 1097-1105.
[10] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 248-255.
[11] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[12] A. Mahendran and A. Vedaldi, "Understanding deep image representations by inverting them," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 5188-5196.
[13] K. Cho et al., "Learning phrase representations using RNN encoder-decoder for statistical machine translation," 2014.
[14] F. A. Gers, J. Schmidhuber, and F. Cummins, "Learning to forget: Continual prediction with LSTM," 1999.
[15] I. Sutskever, O. Vinyals, and Q. V. Le, "Sequence to sequence learning with neural networks," in Advances in Neural Information Processing Systems, 2014, pp. 3104-3112.
[16] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 3431-3440.
[17] P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2001, vol. 1, pp. 511-518.