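Graduate student: 江博任
Graduate student (English name): Paul-Jein Chiang
Thesis title: 基於卷積神經網路之盲蔽型立體圖像品質估測
Thesis title (English): Blind Stereoscopic Image Quality Assessment By Convolutional Neural Network
Advisor: 劉宗榮
Advisor (English name): Tsung-Jung Liu
Oral defense committee: 吳國光, 劉冠顯
Oral defense committee (English names): Kuo-Guan Wu
Oral defense date: 2017-07-28
Degree: Master's
Institution: National Chung Hsing University (國立中興大學)
Department: Department of Electrical Engineering (電機工程學系所)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2017
Graduation academic year: 105 (ROC calendar)
Language: Chinese
Pages: 53
Chinese keywords: 品質評估, 立體圖像, 卷積神經網路, 失真型態
English keywords: quality assessment, stereoscopic image, Convolutional Neural Network, distortion type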
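In this thesis, we propose a blind (no-reference) objective stereoscopic image quality estimator built on a convolutional neural network (CNN). The image is first cut into smaller patches, which serve as input to an architecture that interleaves several convolutional and max-pooling layers, producing feature maps that carry information from low to high levels. A multilayer perceptron (MLP) layer then summarizes these features into a predicted quality score for the stereoscopic image. Trained this way, the model becomes a quality estimator that behaves much like the human visual system, that is, its predictions correlate highly with human subjective quality scores, so it can replace human raters and reduce the cost of evaluating images.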
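The patch-to-score pipeline described above (convolution, max-pooling, then an MLP that outputs one number) can be sketched as a plain NumPy forward pass. This is only an illustrative sketch: the 32x32 patch size, the filter counts, the ReLU activation, and the names `conv2d_valid`, `max_pool2x2`, `init_params`, and `predict_patch_score` are assumptions for the example, not the architecture used in the thesis, and the weights are random rather than trained.

```python
import numpy as np

def conv2d_valid(x, kernels):
    """Cross-correlate x (H, W, C_in) with kernels (kh, kw, C_in, C_out), no padding."""
    kh, kw = kernels.shape[:2]
    # windows has shape (H-kh+1, W-kw+1, C_in, kh, kw)
    windows = np.lib.stride_tricks.sliding_window_view(x, (kh, kw), axis=(0, 1))
    return np.einsum("ijckl,klco->ijo", windows, kernels)

def max_pool2x2(x):
    """Non-overlapping 2x2 max pooling over the spatial axes of x (H, W, C)."""
    h2, w2 = x.shape[0] // 2, x.shape[1] // 2
    x = x[: h2 * 2, : w2 * 2]
    return x.reshape(h2, 2, w2, 2, x.shape[2]).max(axis=(1, 3))

def relu(x):
    # ReLU is an assumed activation; the abstract does not name one.
    return np.maximum(x, 0.0)

def init_params(rng, in_channels=1):
    """Random weights for an illustrative 32x32 patch network (all sizes are assumptions)."""
    return {
        "k1": rng.normal(0, 0.1, (5, 5, in_channels, 8)), "b1": np.zeros(8),
        "k2": rng.normal(0, 0.1, (3, 3, 8, 16)), "b2": np.zeros(16),
        "w3": rng.normal(0, 0.1, (6 * 6 * 16, 32)), "b3": np.zeros(32),
        "w4": rng.normal(0, 0.1, 32), "b4": 0.0,
    }

def predict_patch_score(patch, p):
    """Map one (32, 32, C) patch to a scalar quality score."""
    h = max_pool2x2(relu(conv2d_valid(patch, p["k1"]) + p["b1"]))  # 32 -> 28 -> 14
    h = max_pool2x2(relu(conv2d_valid(h, p["k2"]) + p["b2"]))      # 14 -> 12 -> 6
    h = relu(h.reshape(-1) @ p["w3"] + p["b3"])                    # MLP hidden layer
    return float(h @ p["w4"] + p["b4"])                            # MLP output: one score

rng = np.random.default_rng(0)
params = init_params(rng)
score = predict_patch_score(rng.random((32, 32, 1)), params)
```

A common choice in patch-based quality models is to average the per-patch scores into one score per image; whether the thesis does this is not stated in the abstract.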
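To keep pace with rapidly advancing technology, we design the estimator for stereoscopic images rather than conventional planar images. The main difference between 3D and 2D image quality estimators is that a 3D estimator must account for stereo perception information, while a 2D estimator need not. We therefore subtract the right view from the left view to obtain a simple pseudo-disparity map and use it as the input that represents stereo perception information. In our experiments, we assess quality on the LIVE stereoscopic image database Phase-I and Phase-II and on the public MCL stereoscopic image database. Compared with other neural networks, a CNN avoids complicated feature extraction and data reconstruction and has fewer parameters to compute, which makes it an excellent deep learning architecture. In this architecture, patches are fed to the first layer and passed through successive layers, each of which uses filters to pick out the most salient features. After this layer-by-layer processing, the model captures features from basic to higher levels, and a multilayer perceptron (MLP) turns them into the final predicted score.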
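The preprocessing described above, a left-minus-right pseudo-disparity map cut into patches, can be sketched in a few lines of NumPy. The function names `pseudo_disparity` and `extract_patches`, the 32-pixel patch size, and the non-overlapping stride are assumptions made for illustration; the abstract does not give the values used in the thesis.

```python
import numpy as np

def pseudo_disparity(left, right):
    """Left view minus right view, computed in float so uint8 inputs do not wrap around."""
    left = np.asarray(left, dtype=np.float32)
    right = np.asarray(right, dtype=np.float32)
    if left.shape != right.shape:
        raise ValueError("left and right views must have the same shape")
    return left - right

def extract_patches(image, size=32, stride=32):
    """Cut an (H, W) or (H, W, C) image into size x size patches, row by row."""
    height, width = image.shape[:2]
    if height < size or width < size:
        raise ValueError("image is smaller than one patch")
    patches = [
        image[r : r + size, c : c + size]
        for r in range(0, height - size + 1, stride)
        for c in range(0, width - size + 1, stride)
    ]
    return np.stack(patches)

# Toy grayscale stereo pair standing in for a real left/right image pair.
left = np.full((64, 96), 3, dtype=np.uint8)
right = np.full((64, 96), 10, dtype=np.uint8)
disparity = pseudo_disparity(left, right)
patches = extract_patches(disparity)
```

Converting to floating point before subtracting matters for 8-bit images, because the difference can be negative wherever the right view is brighter than the left.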
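Beyond the conventional CNN, we propose a new binary loss function as the error-computation layer of a single-stage CNN. This loss function speeds up model convergence and lowers prediction complexity. To accommodate its characteristics, we further develop a cascade CNN that first classifies images into groups, which narrows the range of subjective quality scores, and then predicts the final quality score. In the group-classification stage, the distorted left and right views are added as inputs so that the model learns from the original images as well as the disparity information, which improves accuracy. In the prediction stage, the Otsu algorithm first splits the pseudo-disparity map into foreground, medium shot, and background, adding partial depth information. The model thus learns not only disparity and original-image information but also more detailed stereo perception cues such as depth, and this richer image information improves its performance.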
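One way to split a pseudo-disparity map into three layers with Otsu's criterion is to search for the pair of thresholds that maximizes the between-class variance of the histogram. The sketch below assumes this two-threshold (multi-level) variant; the abstract does not say whether the thesis uses it or applies the two-class method twice. The names `otsu_two_thresholds` and `split_layers`, the 64-bin histogram, and the neutral low/mid/high labels are illustrative assumptions, and the mapping of those bands to background, medium shot, and foreground is left open because the abstract does not specify it.

```python
import numpy as np

def otsu_two_thresholds(values, bins=64):
    """Return (t1, t2) maximizing the three-class between-class variance."""
    hist, edges = np.histogram(values, bins=bins)
    prob = hist.astype(np.float64) / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    cum_w = np.cumsum(prob)            # cumulative class probability
    cum_m = np.cumsum(prob * centers)  # cumulative first moment
    mean = cum_m[-1]
    best_var, best_pair = -1.0, (0, 1)
    for i in range(bins - 2):          # last bin of the low class
        for j in range(i + 1, bins - 1):  # last bin of the middle class
            w0, w1, w2 = cum_w[i], cum_w[j] - cum_w[i], 1.0 - cum_w[j]
            if min(w0, w1, w2) < 1e-12:
                continue  # skip splits that leave a class empty
            m0 = cum_m[i] / w0
            m1 = (cum_m[j] - cum_m[i]) / w1
            m2 = (mean - cum_m[j]) / w2
            var = w0 * (m0 - mean) ** 2 + w1 * (m1 - mean) ** 2 + w2 * (m2 - mean) ** 2
            if var > best_var:
                best_var, best_pair = var, (i, j)
    i, j = best_pair
    return edges[i + 1], edges[j + 1]

def split_layers(values, t1, t2):
    """Boolean masks for the low, middle, and high bands of the map."""
    low = values < t1
    high = values >= t2
    return low, ~(low | high), high

# Toy map with three well-separated value clusters.
toy = np.concatenate([np.zeros(100), np.full(100, 90.0), np.full(100, 200.0)])
t1, t2 = otsu_two_thresholds(toy)
low, mid, high = split_layers(toy, t1, t2)
```

The exhaustive search costs on the order of `bins ** 2` iterations, which is why the sketch uses a coarse histogram rather than one bin per gray level.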
In this thesis, we propose a no-reference (NR) stereoscopic image quality assessment framework based on a convolutional neural network (CNN). Taking small patches of the stereoscopic image as input, the model can be trained to perform comparably to the human visual system (HVS). By combining multiple convolutional and max-pooling layers, the model learns image information from low to high levels in detail. A multilayer perceptron (MLP) layer then summarizes the learned representation into a final value that indicates the perceptual quality of the stereoscopic image patch.
To represent stereo perception information, we subtract the right-view image from the left-view image and use the result as the CNN input. We conduct extensive experiments on the public LIVE 3D Phase-I, LIVE 3D Phase-II, and MCL stereoscopic image databases. Moreover, we propose a new binary loss function for the single-stage neural network. It makes the model converge faster, but the range of subjective scores in the databases is too large for the binary loss function. We propose a cascade CNN to address this problem by splitting the model into two parts. First, we classify the image into a group so that the range of subjective scores becomes much smaller. Second, we predict the final score with a five-input-channel model whose inputs are the left-view and right-view maps and the background, medium-shot, and foreground layers of the pseudo-disparity map. This enriches the stereo perceptual information so that the model can learn more from the image and achieve better performance.
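Abstract (Chinese)
Abstract
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Preface
1.2 Research Motivation
1.3 Thesis Organization
Chapter 2 Related Work
Chapter 3 Background Knowledge
3.1 Introduction to Stereo Perception Information
3.2 Introduction to Data Normalization
3.3 Introduction to Convolutional Neural Networks (CNN)
3.3-1 Introduction to the Convolutional Layer
3.3-2 Introduction to the Pooling Layer
3.3-3 Introduction to the Multilayer Perceptron (MLP) Layer
3.3-4 Introduction to Activation Functions
3.3-5 Introduction to Loss Functions
3.4 Introduction to the Otsu Algorithm
3.5 Introduction to the Binary System
3.6 Introduction to the Databases
Chapter 4 Proposed Method
4.1 Preface
4.2 Image Preprocessing
4.3 Single-Stage CNN
4.3-1 Binary Loss Function
4.3-2 CNN Model
4.4 Cascade CNN
Chapter 5 Results and Discussion
References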