National Digital Library of Theses and Dissertations in Taiwan

Student: 張佳豪
Student (English): CHANG, CHIA-HAO
Title: 運用結構保存細節強化網路於基於參考的影像超解析度
Title (English): Structure Preservation and Detail Enhancement for Reference-based Image Super-Resolution
Advisor: 許巍嚴
Advisor (English): HSU, WEI-YEN
Committee members: 戴顯權、劉偉名、許巍嚴
Committee members (English): TI, SHEN-CHUAN; LIU, WEI-MIN; HSU, WEI-YEN
Oral defense date: 2023-07-11
Degree: Master's
Institution: National Chung Cheng University (國立中正大學)
Department: Graduate Institute of Information Management (資訊管理系研究所)
Discipline: Computer Science
Field: General Computer Science
Document type: Academic thesis
Publication year: 2023
Graduation academic year: 111 (2022-2023)
Language: Chinese
Pages: 70
Keywords (Chinese): 基於參考的影像超解析度、對應配對、結構保存、細節保真度、紋理轉移、改良版三重注意力
Keywords (English): Reference-based Image Super-Resolution; Correspondence Matching; Structure Preservation; Detail Fidelity; Texture Transfer; Improved Triplet Attention
Usage statistics:
  • Cited by: 0
  • Views: 80
  • Downloads: 0
  • Bookmarked: 0
Reference-based image super-resolution (RefSR) is a promising approach that exploits a high-resolution (HR) reference image to enhance the quality and details of a low-resolution (LR) input image. Compared with single image super-resolution (SISR), RefSR can draw on external reference images to produce more realistic and richer textures, and the task has recently attracted considerable attention. However, two key challenges remain: 1) Most existing RefSR methods concentrate on computing correspondences between the reference and LR images while neglecting the feature extraction stage. We argue that operating directly on raw image features easily introduces irrelevant and redundant information; even with good correspondences, the reconstructed image can still deteriorate, leading to unreliable correspondence matching, poor texture transfer, and degraded perceptual quality, especially in structure and details. 2) Extracting relevant textures from the reference image to supplement the details of the LR image is itself highly challenging. To address these issues, we propose a novel Structure Preservation and Detail Enhancement Super-Resolution (SPDE-SR) network for reference-based image super-resolution. SPDE-SR comprises three primary modules: a structure-preserving feature extraction (SPFE) module, a dynamic feature aggregation (DFA) module, and a detail fidelity restoration (DFR) module. Specifically, the SPFE module first extracts relevant features from the LR and reference images, with pre-training via contrastive learning and knowledge distillation to obtain robust correspondences. The DFA module then aggregates features of the reference image to robustly transfer its textures to the LR image. Finally, the DFR module reconstructs the SR image with greater detail fidelity and enhancement through the proposed improved triplet attention (ITA) mechanism.
Extensive experimental results demonstrate that the proposed SPDE-SR method outperforms state-of-the-art approaches on four benchmark datasets, especially in structure preservation and detail enhancement.
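The three-module pipeline described in the abstract can be sketched as a simple composition. The module bodies below are placeholders, purely to illustrate the SPFE → DFA → DFR data flow (nearest-neighbor upscaling and clipping stand in for the real networks); the ×4 scale factor is an assumption, not a detail from the thesis.

```python
import numpy as np

SCALE = 4  # assumed upscaling factor, common in RefSR benchmarks

def spfe(lr, ref):
    """Structure-Preserving Feature Extraction (placeholder):
    the real module extracts robust features from LR and reference images,
    pre-trained with contrastive learning and knowledge distillation."""
    return lr.astype(np.float32), ref.astype(np.float32)

def dfa(lr_feat, ref_feat):
    """Dynamic Feature Aggregation (placeholder):
    the real module matches correspondences and transfers reference
    textures; here a naive nearest-neighbor upscale stands in."""
    return np.repeat(np.repeat(lr_feat, SCALE, axis=0), SCALE, axis=1)

def dfr(aggregated):
    """Detail Fidelity Restoration (placeholder for the ITA-based
    reconstruction); here it only clips to the valid pixel range."""
    return np.clip(aggregated, 0.0, 255.0)

def spde_sr(lr, ref):
    lr_feat, ref_feat = spfe(lr, ref)
    return dfr(dfa(lr_feat, ref_feat))

lr = np.random.randint(0, 256, (40, 40, 3))
ref = np.random.randint(0, 256, (160, 160, 3))
sr = spde_sr(lr, ref)
print(sr.shape)  # (160, 160, 3)
```

The point of the sketch is only the interface: each stage consumes the previous stage's output, and the SR result has the reference image's spatial resolution.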
Acknowledgements i
Abstract (Chinese) ii
Abstract iii
Table of Contents iv
List of Figures vii
List of Tables ix
Chapter 1: Introduction 1
1.1 Background 1
1.2 Motivation 3
1.3 Research Questions and Objectives 3
1.4 Contributions 6
Chapter 2: Literature Review 7
2.1 Single Image Super-Resolution 7
2.2 Deep-Learning-Based Single Image Super-Resolution 9
2.2.1 SRCNN 10
2.2.2 SRGAN 11
2.2.3 EDSR 13
2.2.4 RCAN 15
2.2.5 ESRGAN 17
2.2.6 RankSRGAN 18
2.2.7 SwinIR 19
2.3 Reference-based Image Super-Resolution 22
2.4 Deep-Learning-Based Reference-based Image Super-Resolution 23
2.4.1 CrossNet 24
2.4.2 SRNTT 25
2.4.3 TTSR 26
2.4.4 SSEN 27
2.4.5 E2ENT2 28
2.4.6 MASA 29
2.4.7 WTRN 30
2.4.8 C2-Matching 31
2.4.9 AMSA 33
Chapter 3: Materials and Methods 35
3.1 Materials 35
3.2 Research Framework and Procedure 38
3.3 Method Steps 39
3.3.1 Structure-Preserving Feature Extraction Module 40
3.3.2 Dynamic Feature Aggregation Module 41
3.3.3 Detail Fidelity Restoration Module 42
3.3.4 Loss Functions 42
Chapter 4: Experimental Evaluation 44
4.1 Experimental Environment 44
4.2 Evaluation Metrics 44
4.2.1 Expert Judgment 44
4.2.2 Peak Signal-to-Noise Ratio (PSNR) 45
4.2.3 Structural Similarity (SSIM) 45
4.2.4 Learned Perceptual Image Patch Similarity (LPIPS) 46
4.3 Experimental Results 46
4.3.1 Quantitative Results 47
4.3.2 Qualitative Results 48
4.3.3 Additional Analysis 50
4.4 Ablation Studies 52
4.4.1 Structure-Preserving Feature Extraction Module 53
4.4.2 Dynamic Feature Aggregation Module 53
4.4.3 Detail Fidelity Restoration Module 53
Chapter 5: Conclusion and Future Work 54
5.1 Conclusion 54
5.2 Future Work 54
References 56
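Section 4.2 of the table of contents lists PSNR (alongside SSIM and LPIPS) as an evaluation metric. A minimal NumPy implementation of PSNR for 8-bit images, as an illustrative helper rather than the thesis's evaluation code:

```python
import numpy as np

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio between two same-shaped images:
    10 * log10(peak^2 / MSE). Higher is better; identical images
    give infinity."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

a = np.zeros((8, 8), dtype=np.uint8)
b = np.full((8, 8), 10, dtype=np.uint8)  # every pixel differs by 10, so MSE = 100
print(round(psnr(a, b), 2))  # 10 * log10(65025 / 100) ≈ 28.13
```

SSIM and LPIPS are more involved (local-window statistics and a learned network, respectively), which is why published implementations such as those in scikit-image are typically used instead of hand-rolled versions.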
Arjovsky, M., Chintala, S., & Bottou, L. (2017). Wasserstein generative adversarial networks. International Conference on Machine Learning.
Boominathan, V., Mitra, K., & Veeraraghavan, A. (2014). Improving resolution and depth-of-field of light field cameras using a hybrid imaging system. 2014 IEEE International Conference on Computational Photography (ICCP).
Cherukuri, V., Guo, T., Schiff, S. J., & Monga, V. (2019). Deep MR brain image super-resolution using spatio-structural priors. IEEE Transactions on Image Processing, 29, 1368-1383.
Dong, C., Loy, C. C., He, K., & Tang, X. (2015). Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(2), 295-307.
Geng, T., Liu, X.-Y., Wang, X., & Sun, G. (2021). Deep shearlet residual learning network for single image super-resolution. IEEE Transactions on Image Processing, 30, 4129-4142.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014). Generative adversarial networks. Advances in Neural Information Processing Systems, 2672-2680.
Gu, J., Cai, H., Chen, H., Ye, X., Ren, J., & Dong, C. (2020). Image quality assessment for perceptual image restoration: A new dataset, benchmark and metric. arXiv preprint arXiv:2011.15002.
Gu, J., Cai, H., Chen, H., Ye, X., Ren, J. S., & Dong, C. (2020). PIPAL: A large-scale image quality assessment dataset for perceptual image restoration. Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XI.
He, K., Fan, H., Wu, Y., Xie, S., & Girshick, R. (2020). Momentum contrast for unsupervised visual representation learning. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
Hinton, G., Vinyals, O., & Dean, J. (2015). Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531.
Hore, A., & Ziou, D. (2010). Image quality metrics: PSNR vs. SSIM. 2010 20th International Conference on Pattern Recognition.
Huang, J.-B., Singh, A., & Ahuja, N. (2015). Single image super-resolution from transformed self-exemplars. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Huang, S.-C., Hoang, Q.-V., & Jaw, D.-W. (2022). Self-adaptive feature transformation networks for object detection in low luminance images. ACM Transactions on Intelligent Systems and Technology, 13(1), Article 13. https://doi.org/10.1145/3480973
Hwang, S., Park, J., Kim, N., Choi, Y., & So Kweon, I. (2015). Multispectral pedestrian detection: Benchmark dataset and baseline. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Irani, M., & Peleg, S. (1991). Improving resolution by image registration. CVGIP: Graphical Models and Image Processing, 53(3), 231-239.
Jaderberg, M., Simonyan, K., & Zisserman, A. (2015). Spatial transformer networks. Advances in Neural Information Processing Systems, 28.
Jiang, Y., Chan, K. C., Wang, X., Loy, C. C., & Liu, Z. (2022). Reference-based image and video super-resolution via C2-Matching. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Keys, R. (1981). Cubic convolution interpolation for digital image processing. IEEE Transactions on Acoustics, Speech, and Signal Processing, 29(6), 1153-1160.
Ledig, C., Theis, L., Huszár, F., Caballero, J., Cunningham, A., Acosta, A., Aitken, A., Tejani, A., Totz, J., & Wang, Z. (2017). Photo-realistic single image super-resolution using a generative adversarial network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Li, Z., Kuang, Z.-S., Zhu, Z.-L., Wang, H.-P., & Shao, X.-L. (2022). Wavelet-based texture reformation network for image super-resolution. IEEE Transactions on Image Processing, 31, 2647-2660.
Liang, J., Cao, J., Sun, G., Zhang, K., Van Gool, L., & Timofte, R. (2021). SwinIR: Image restoration using Swin Transformer. Proceedings of the IEEE/CVF International Conference on Computer Vision.
Lim, B., Son, S., Kim, H., Nah, S., & Mu Lee, K. (2017). Enhanced deep residual networks for single image super-resolution. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops.
Liu, Z., Mao, H., Wu, C.-Y., Feichtenhofer, C., Darrell, T., & Xie, S. (2022). A ConvNet for the 2020s. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
Lu, L., Li, W., Tao, X., Lu, J., & Jia, J. (2021). MASA-SR: Matching acceleration and spatial adaptation for reference-based image super-resolution. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
Lucas, A., Lopez-Tapia, S., Molina, R., & Katsaggelos, A. K. (2019). Generative adversarial networks and perceptual losses for video super-resolution. IEEE Transactions on Image Processing, 28(7), 3312-3327.
Lugmayr, A., Danelljan, M., & Timofte, R. (2020). NTIRE 2020 challenge on real-world image super-resolution: Methods and results. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops.
Matsui, Y., Ito, K., Aramaki, Y., Fujimoto, A., Ogawa, T., Yamasaki, T., & Aizawa, K. (2017). Sketch-based manga retrieval using Manga109 dataset. Multimedia Tools and Applications, 76(20), 21811-21838.
Misra, D., Nalamada, T., Arasanipalai, A. U., & Hou, Q. (2021). Rotate to attend: Convolutional triplet attention module. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision.
Radenović, F., Tolias, G., & Chum, O. (2018). Fine-tuning CNN image retrieval with no human annotation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(7), 1655-1668.
Sajjadi, M. S., Scholkopf, B., & Hirsch, M. (2017). EnhanceNet: Single image super-resolution through automated texture synthesis. Proceedings of the IEEE International Conference on Computer Vision.
Shi, W., Caballero, J., Huszár, F., Totz, J., Aitken, A. P., Bishop, R., Rueckert, D., & Wang, Z. (2016). Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Shim, G., Park, J., & Kweon, I. S. (2020). Robust reference-based super-resolution with similarity-aware deformable convolution. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. International Conference on Learning Representations.
Sun, L., & Hays, J. (2012). Super-resolution from internet-scale scene matching. 2012 IEEE International Conference on Computational Photography (ICCP).
Tatem, A. J., Lewis, H. G., Atkinson, P. M., & Nixon, M. S. (2001). Super-resolution target identification from remotely sensed images using a Hopfield neural network. IEEE Transactions on Geoscience and Remote Sensing, 39(4), 781-796.
Wang, X., Girshick, R., Gupta, A., & He, K. (2018). Non-local neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Wang, X., Yu, K., Wu, S., Gu, J., Liu, Y., Dong, C., Qiao, Y., & Change Loy, C. (2018). ESRGAN: Enhanced super-resolution generative adversarial networks. Proceedings of the European Conference on Computer Vision (ECCV) Workshops.
Wang, Y., Liu, Y., Heidrich, W., & Dai, Q. (2016). The light field attachment: Turning a DSLR into a light field camera using a low budget camera ring. IEEE Transactions on Visualization and Computer Graphics, 23(10), 2357-2364.
Wu, J., Wang, H., Wang, X., & Zhang, Y. (2015). A novel light field super-resolution framework based on hybrid imaging system. 2015 Visual Communications and Image Processing (VCIP).
Xia, B., Tian, Y., Hang, Y., Yang, W., Liao, Q., & Zhou, J. (2022). Coarse-to-fine embedded PatchMatch and multi-scale dynamic aggregation for reference-based super-resolution. Proceedings of the AAAI Conference on Artificial Intelligence.
Xie, Y., Xiao, J., Sun, M., Yao, C., & Huang, K. (2020). Feature representation matters: End-to-end learning for reference-based image super-resolution. Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part IV.
Yang, F., Yang, H., Fu, J., Lu, H., & Guo, B. (2020). Learning texture transformer network for image super-resolution. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
Yang, J., Wright, J., Huang, T., & Ma, Y. (2008). Image super-resolution as sparse representation of raw image patches. 2008 IEEE Conference on Computer Vision and Pattern Recognition.
Yang, J., Wright, J., Huang, T. S., & Ma, Y. (2010). Image super-resolution via sparse representation. IEEE Transactions on Image Processing, 19(11), 2861-2873.
Zhang, M., & Ling, Q. (2020). Supervised pixel-wise GAN for face super-resolution. IEEE Transactions on Multimedia, 23, 1938-1950.
Zhang, R., Isola, P., Efros, A. A., Shechtman, E., & Wang, O. (2018). The unreasonable effectiveness of deep features as a perceptual metric. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Zhang, W., Liu, Y., Dong, C., & Qiao, Y. (2019). RankSRGAN: Generative adversarial networks with ranker for image super-resolution. Proceedings of the IEEE/CVF International Conference on Computer Vision.
Zhang, Y., Li, K., Li, K., Wang, L., Zhong, B., & Fu, Y. (2018). Image super-resolution using very deep residual channel attention networks. Proceedings of the European Conference on Computer Vision (ECCV).
Zhang, Z., Wang, Z., Lin, Z., & Qi, H. (2019). Image super-resolution by neural texture transfer. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
Zheng, H., Guo, M., Wang, H., Liu, Y., & Fang, L. (2017). Combining exemplar-based approach and learning-based approach for light field super-resolution using a hybrid imaging system. Proceedings of the IEEE International Conference on Computer Vision Workshops.
Zheng, H., Ji, M., Han, L., Xu, Z., Wang, H., Liu, Y., & Fang, L. (2017). Learning cross-scale correspondence and patch-based synthesis for reference-based super-resolution. BMVC.
Zheng, H., Ji, M., Wang, H., Liu, Y., & Fang, L. (2018). CrossNet: An end-to-end reference-based super resolution network using cross-scale warping. Proceedings of the European Conference on Computer Vision (ECCV).

Electronic full text (publicly available online from 2028-07-16).