
Taiwan National Digital Library of Theses and Dissertations (臺灣博碩士論文加值系統)


Detailed Record

Author: 黃玟勝 (HUANG, WEN-SHENG)
Title: Hash code generation based on deep learning and visual attention model for image retrieval (應用於影像檢索之基於深度學習與視覺關注度之哈希碼生成)
Advisors: 沈岱範 (SHEN, DAY-FANN); 林國祥 (LIN, GUO-SHIANG)
Oral examination committee: 賴文能 (LIE, WEN-NUNG); 林春宏 (LIN, CHUEN-HORNG)
Oral defense date: 2018-07-18
Degree: Master's
Institution: National Yunlin University of Science and Technology
Department: Electrical Engineering
Discipline: Engineering
Field: Electrical and Information Engineering
Thesis type: Academic thesis
Publication year: 2018
Graduation academic year: 106 (2017–2018)
Language: Chinese
Pages: 102
Keywords (Chinese): 物件偵測, 關注度, 深度哈希
Keywords (English): object detection, visual attention, hash code
Statistics:
  • Cited: 0
  • Views: 329
  • Downloads: 78
  • Bookmarked: 1
This thesis develops a hash code generation system for image retrieval based on deep learning and a visual attention model. The system consists of several parts: object detection, saliency computation, and hash code generation. For object detection, the deep learning technique Faster R-CNN is used to detect and recognize objects in an image; with a pre-trained model, the system can detect and recognize 20 object classes. For saliency computation, a pre-trained static saliency model [26] is used to compute a saliency map of the input image. Combining the detection results with the saliency map, the system obtains an attention value for each object in the image, and from each object's class information and attention value it generates a hash code for the input image.
To evaluate system performance, experiments are conducted on the PASCAL VOC image database, which contains 27,088 images in 20 classes. For image retrieval, the system's nDCG (72.5%) exceeds that of [29] (66.2%); the experimental results show that the proposed system outperforms [29] in retrieval performance.

Keywords: object detection, visual attention, deep hashing
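The attention step described in the abstract combines detection boxes with a saliency map to score each object. The following is a minimal sketch of that idea, assuming a (x1, y1, x2, y2) box format and scoring each object by the mean saliency inside its box; the names and data are illustrative, and the thesis's actual models (Faster R-CNN, the saliency net of [26]) are not reproduced here.

```python
def object_attention(saliency, boxes):
    """Score each detected object by the mean saliency inside its box.

    saliency: 2-D list of floats in [0, 1] (rows indexed by y, columns by x).
    boxes: list of (x1, y1, x2, y2) tuples with exclusive right/bottom edges.
    """
    scores = []
    for (x1, y1, x2, y2) in boxes:
        region = [saliency[y][x] for y in range(y1, y2) for x in range(x1, x2)]
        scores.append(sum(region) / len(region) if region else 0.0)
    return scores

# Toy 3x4 saliency map: the right side is highly salient.
saliency = [
    [0.1, 0.1, 0.8, 0.9],
    [0.1, 0.2, 0.9, 0.9],
    [0.0, 0.1, 0.2, 0.3],
]
# Two hypothetical detections: a left box (low saliency) and a right box (high).
boxes = [(0, 0, 2, 3), (2, 0, 4, 2)]
print(object_attention(saliency, boxes))  # right box scores much higher
```

Ranking objects by these scores gives the per-image attention ordering that, together with class labels, drives hash code generation.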

In this thesis, we develop a hash code generation system based on deep learning and a visual attention model for image retrieval. The system is composed of several parts: object detection, saliency computation, and hash code generation. In the object detection part, a deep learning technique, Faster R-CNN, is used to detect and classify objects in images; the pre-trained model can recognize 20 object categories. In the saliency computation part, the pre-trained model proposed in [26] is used to compute a saliency map, from which the saliency value of each object is obtained. According to the category information and saliency value of each object, the proposed system generates the corresponding hash code.
To evaluate the performance of the proposed system, the PASCAL VOC image set, which contains 27,088 images, is used. For image retrieval, the nDCG value of the proposed system (72.5%) is higher than that of [29] (66.2%). Experimental results show that the proposed system is superior to the existing method [29].

Keywords: object detection, visual attention, hash code
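The retrieval comparison above is reported in nDCG. One common formulation (DCG with a log2(rank + 1) discount, normalized by the DCG of the ideal ordering) can be sketched as follows; function names are illustrative:

```python
import math

def dcg(rels):
    """Discounted cumulative gain of a ranked list of relevance grades.

    Uses the common rel_i / log2(i + 1) discount with 1-indexed ranks,
    so the top result is undiscounted (log2(2) = 1).
    """
    return sum(rel / math.log2(i + 1) for i, rel in enumerate(rels, start=1))

def ndcg(rels):
    """Normalize DCG by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal > 0 else 0.0

# A perfectly ordered result list scores 1.0; misordering lowers the score.
print(ndcg([3, 2, 1, 0]))  # 1.0
print(ndcg([0, 1, 2, 3]))  # < 1.0
```

Because of the normalization, nDCG is comparable across queries with different numbers of relevant results, which is why it is a standard retrieval metric.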

Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1  Introduction
1.1 Research Background
1.2 Research Motivation and Goals
1.3 Chapter Overview
Chapter 2  Background and Literature Review
2.1 CNN Fundamentals
2.1.1 Convolution Layer
2.1.2 Rectified Linear Unit (ReLU)
2.1.3 Pooling Layer
2.1.4 Fully Connected Layer
2.2 Well-Known CNN Architectures
2.2.1 AlexNet
2.2.2 VGGNet
2.3 Object Detection Literature
2.3.1 YOLO
2.3.2 Faster R-CNN (Faster Region-based Convolutional Neural Network)
2.4 Visual Attention Research
2.5 Image Retrieval
Chapter 3  Overall System Architecture
Chapter 4  Class Attention
4.1 Saliency Net
4.1.1 Network Architecture
4.1.2 Loss Function
4.2 Attention Ranking
4.3 Chapter Summary
Chapter 5  Hash Code Generation and Retrieval
5.1 Hashing Net
5.1.1 Network Architecture
5.2 Hash Code Generation
5.2.1 Class Labels
5.2.2 Similarity Matrix
5.2.3 Loss Function
5.2.4 Parameter Learning
5.3 Image Retrieval
5.4 Chapter Summary
Chapter 6  Experimental Results and Performance Evaluation
6.1 Experimental Platform
6.2 Datasets
6.3 Retrieval Performance Metrics
6.3.1 Recall Rate and Precision Rate
6.3.2 AP and mAP (Mean Average Precision)
6.3.3 CG, DCG, and nDCG (Discounted Cumulative Gain)
6.4 Visual Attention
6.5 System Analysis
6.6 Image Retrieval Performance Tests
6.6.1 Subjective Evaluation
6.6.2 Objective Evaluation
6.6.3 Retrieval Speed
6.7 Chapter Summary
Chapter 7  Conclusions and Future Work
7.1 Conclusions
7.2 Future Research Directions
References
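The retrieval step outlined above (Sections 5.3 and 6.6) ranks database images by the Hamming distance between binary hash codes. A generic sketch with made-up codes and names; how the thesis derives each code from object classes and attention is not reproduced here:

```python
def hamming(a, b):
    """Hamming distance between two equal-length bit strings such as '10110'."""
    return sum(x != y for x, y in zip(a, b))

def retrieve(query, database, top_k=3):
    """Return the top_k image ids ranked by ascending Hamming distance.

    database: {image_id: hash_code}. Ties keep insertion order (sorted is stable).
    """
    ranked = sorted(database, key=lambda img: hamming(query, database[img]))
    return ranked[:top_k]

# Hypothetical 4-bit codes for four database images.
db = {"img1": "1010", "img2": "0101", "img3": "1011", "img4": "1110"}
print(retrieve("1010", db))  # img1 first: it is an exact match
```

Because Hamming distance on short binary codes reduces to XOR-and-popcount, this ranking is far cheaper than comparing real-valued feature vectors, which is the usual motivation for hashing-based retrieval.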


References
[1] S. Y. Bao, Y. Xiang, and S. Savarese, "Object co-detection," in Proc. European Conference on Computer Vision (ECCV), pp. 86-101, 2012.
[2] X. Guo, D. Liu, B. Jou, M. Zhu, A. Cai, and S.-F. Chang, "Robust object co-detection," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3206-3213, 2013.
[3] K. He, X. Zhang, S. Ren, and J. Sun, "Spatial pyramid pooling in deep convolutional networks for visual recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 37, no. 9, pp. 1904-1916, 2015.
[4] J. R. Uijlings, K. E. van de Sande, T. Gevers, and A. W. Smeulders, "Segmentation as selective search for object recognition," in Proc. International Conference on Computer Vision (ICCV), pp. 1879-1886, 2011.
[5] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 580-587, 2014.
[6] R. Girshick, "Fast R-CNN," in Proc. IEEE International Conference on Computer Vision (ICCV), pp. 1440-1448, 2015.
[7] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, 2017.
[8] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779-788, 2016.
[9] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," in Proc. IEEE International Conference on Computer Vision (ICCV), 2017.
[10] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single shot multibox detector," in Proc. European Conference on Computer Vision (ECCV), pp. 21-37, 2016.
[11] 劉翃睿, "A deep-learning-based image object detection system for marine fishery monitoring" (基於深度學習之影像物件偵測系統 - 海洋漁業監測), 2017.
[12] W. Wang, J. Shen, L. Shao, and F. Porikli, "Correspondence driven saliency transfer," IEEE Transactions on Image Processing, vol. 25, no. 11, pp. 5025-5034, 2016.
[13] Y. Wei, F. Wen, W. Zhu, and J. Sun, "Geodesic saliency using background priors," in Proc. European Conference on Computer Vision (ECCV), pp. 29-42, 2012.
[14] W. Zhu, S. Liang, Y. Wei, and J. Sun, "Saliency optimization from robust background detection," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2814-2821, 2014.
[15] R. Zhao, W. Ouyang, H. Li, and X. Wang, "Saliency detection by multi-context deep learning," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1265-1274, 2015.
[16] G. Li and Y. Yu, "Visual saliency based on multiscale deep features," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5455-5463, 2015.
[17] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3431-3440, 2015.
[18] Y. Tang and X. Wu, "Saliency detection via combining region-level and pixel-level predictions with CNNs," in Proc. European Conference on Computer Vision (ECCV), pp. 809-825, 2016.
[19] L. Wang, L. Wang, H. Lu, P. Zhang, and X. Ruan, "Saliency detection with recurrent fully convolutional networks," in Proc. European Conference on Computer Vision (ECCV), pp. 825-841, 2016.
[20] N. Liu and J. Han, "DHSNet: Deep hierarchical saliency network for salient object detection," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 678-686, 2016.
[21] X. Li, L. Zhao, L. Wei, M.-H. Yang, F. Wu, Y. Zhuang, H. Ling, and J. Wang, "DeepSaliency: Multi-task deep neural network model for salient object detection," IEEE Transactions on Image Processing, vol. 25, no. 8, pp. 3919-3930, 2016.
[22] R. Xia, Y. Pan, H. Lai, C. Liu, and S. Yan, "Supervised hashing for image retrieval via image representation learning," in Proc. Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014.
[23] H. Lai, Y. Pan, Y. Liu, and S. Yan, "Simultaneous feature learning and hash coding with deep neural networks," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[24] G. Lin, C. Shen, Q. Shi, A. van den Hengel, and D. Suter, "Fast supervised hashing with decision trees for high-dimensional data," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1971-1978, 2014.
[25] P. Zhang, W. Zhang, W.-J. Li, and M. Guo, "Supervised hashing with latent factor models," in Proc. SIGIR, pp. 173-182, 2014.
[26] W. Wang, J. Shen, and L. Shao, "Video salient object detection via fully convolutional networks," IEEE Transactions on Image Processing, vol. 27, pp. 38-49, 2018.
[27] K. Lin, H.-F. Yang, J.-H. Hsiao, and C.-S. Chen, "Deep learning of binary hash codes for fast image retrieval," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[28] H. Liu, R. Wang, S. Shan, and X. Chen, "Deep supervised hashing for fast image retrieval," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[29] W.-J. Li, S. Wang, and W.-C. Kang, "Feature learning based deep supervised hashing with pairwise labels," in Proc. International Joint Conference on Artificial Intelligence (IJCAI), 2016.
[30] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proc. Advances in Neural Information Processing Systems (NIPS), vol. 1, pp. 1097-1105, 2012.
[31] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," in Proc. International Conference on Learning Representations (ICLR), 2015.
[32] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single shot multibox detector," in Proc. European Conference on Computer Vision (ECCV), pp. 21-37, 2016.
[33] J. Wang, S. Kumar, and S.-F. Chang, "Sequential projection learning for hashing with compact codes," in Proc. ICML, pp. 1127-1134, 2010.
[34] G. Lin, C. Shen, D. Suter, and A. van den Hengel, "A general two-step approach to learning-based hashing," in Proc. ICCV, pp. 2552-2559, 2013.
[35] X. Zhang, S. Zhou, J. Feng, H. Lai, B. Li, Y. Pan, J. Yin, and S. Yan, "HashGAN: Attention-aware deep adversarial hashing for cross modal retrieval," arXiv preprint, 2017.
