National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

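Graduate student: 李易翰
Graduate student (English): Yi-Han Lee
Thesis title: 多尺度可變形卷積對齊網路應用於影片超解析
Thesis title (English): Multiscale Deformable Convolution Alignment Network for Video Super Resolution
Advisors: 范國清, 高巧汶
Advisors (English): Kuo-Chin Fan, Chiao-Wen Kao
Degree: Master's
Institution: National Central University (國立中央大學)
Department: Department of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2022
Academic year of graduation: 110
Language: Chinese
Number of pages: 72
Keywords (translated from Chinese): video super-resolution; deformable convolution; multiscale model; attention mechanism
Keywords (English): Video super resolution; Multiscale model; Deformable Convolution; Attention mechanism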
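Abstract:
With advances in technology, devices that capture high-resolution images and displays that show them are now readily available. Many videos stored in the past have a resolution far below that of today's displays, so video super-resolution algorithms are needed to raise it. Video super-resolution can also be applied to network transmission: the resolution of a video is reduced before transmission and restored with a video super-resolution algorithm afterwards, which saves bandwidth.
This thesis proposes a video super-resolution algorithm that aligns frames using multiscale deformable convolution. Following the multiscale-model concept, branches at different resolutions predict the offsets of the deformable convolution, which strengthens the alignment module. An SE block integrates the image features produced during the feature propagation stage, helping the model identify the features that matter for reconstructing the image. The model is trained and tested on the REDS and Vimeo-90K datasets. On REDS4, the test set provided with REDS, it exceeds BasicVSR++ by 0.07 dB in PSNR, and visually it generates clearer, sharper textures.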
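The SE block named in the abstract follows the squeeze-and-excitation pattern: pool each channel to a single value, pass the pooled vector through two small fully connected layers, and use the resulting gates to rescale the channels. The following is a minimal pure-Python sketch of that pattern, not the thesis implementation. The function name `se_block`, the weight matrices `w1` and `w2`, and the omission of bias terms are all assumptions made for illustration.

```python
import math


def se_block(features, w1, w2):
    """Reweight the channels of `features`, a C x H x W nested list.

    w1: (C/r) x C weights of the reduction layer (hypothetical values).
    w2: C x (C/r) weights of the expansion layer (hypothetical values).
    Bias terms are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]
    # Excitation, step 1: fully connected layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1
    ]
    # Excitation, step 2: fully connected layer followed by a sigmoid gate.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(features, gates)
    ]
    return scaled, gates
```

Each gate lies strictly between 0 and 1, so the block can only attenuate a channel relative to the others; in a trained network the weights would be learned so that informative channels keep gates near 1.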
Abstract (English):
Thanks to advances in technology, high-resolution capture devices and displays are now easy to obtain. Older videos, however, were recorded at low resolution and look degraded on current monitors. Video super-resolution techniques can reconstruct such low-resolution video into high-resolution video. This thesis therefore proposes a video super-resolution algorithm that adopts multiscale deformable convolution for image alignment in order to improve the visual quality of the generated video. To enhance the alignment module, a multiscale model with branches at different resolutions predicts the offsets of the deformable convolution. An SE block then integrates the image features generated in the feature propagation stage, helping the model find the features that matter most for reconstructing the video.
Experiments on the REDS and Vimeo-90K datasets show that the proposed algorithm produces high-resolution video with better visual quality. It achieves the best performance among the compared methods, with a PSNR 0.07 dB higher than that of BasicVSR++ on the REDS4 test set, and the comparison and ablation experiments support the effectiveness of the proposed algorithm.
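Because the headline result is stated as a PSNR difference, it may help to recall how PSNR is computed and what a 0.07 dB gain means in terms of mean squared error. The sketch below uses the standard definition, PSNR = 10 log10(MAX^2 / MSE). The function names `psnr` and `mse_ratio_for_gain`, the flat-list frame representation, and the 8-bit peak value of 255 are assumptions for illustration; the evaluation protocol actually used in the thesis (for example, which color channels are measured) is not given in this record.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    Frames are flat sequences of pixel intensities (an assumed format).
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("frames must be non-empty and of equal size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE changes when PSNR rises by `gain_db` decibels."""
    return 10.0 ** (-gain_db / 10.0)
```

Under this definition, a 0.07 dB improvement corresponds to `mse_ratio_for_gain(0.07)`, roughly 0.984, that is, a mean squared error about 1.6% lower than the baseline's.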
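Chinese Abstract
English Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Thesis Organization
Chapter 2 Related Work
2.1 Video Super-Resolution
2.2 Deep-Learning-Based Upsampling Methods
2.2.1 Transposed Convolution
2.2.2 Pixel Shuffle
2.3 Deformable Convolution
2.4 BasicVSR
2.5 BasicVSR++
2.6 Multiscale Module
2.6.1 Parallel Multi-Branch Structure
2.6.2 Serial Multi-Branch Structure
2.7 Attention Mechanism
2.7.1 Channel Attention
2.8 Spatial Attention
Chapter 3 Method
3.1 System Architecture
3.2 Symbol Definitions
3.3 Feature Extraction
3.4 Bidirectional Recurrent Feature Propagation
3.5 Optical Flow Prediction Module
3.6 Flow-Guided Multiscale Deformable Convolution Alignment
3.7 Image Reconstruction and Upsampling
3.8 Model Training Details
Chapter 4 Experimental Results
4.1 Datasets
4.1.1 REalistic and Dynamic Scenes dataset
4.1.2 Vid4
4.1.3 Vimeo-90K
4.2 Evaluation Metrics
4.2.1 PSNR
4.2.2 SSIM
4.3 Experimental Environment and Settings
4.4 Effect of Different Multiscale Offset Prediction Methods on the Model
4.5 Effect of the Number of FMSD Branches on Performance
4.6 Effect of the Attention Mechanism on the Model
4.7 Performance Comparison at Different Object Motion Speeds
4.8 Feature Map Visualization Comparison of FMSD and BasicVSR++
4.9 Discussion of Results
4.10 Failure Cases
Chapter 5 Conclusion and Future Work
References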
Electronic full text (public release date on the Internet: 2025-07-06)