臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)
Author: Yu-Ting Huang (黃裕庭)
Title: 最近特徵線嵌入網路之影像物件辨識系統
Title (English): Image Object Recognition System Based on Nearest Feature Line Embedding Network
Advisors: Kuo-Chin Fan (范國清), Chin-Chuan Han (韓欽銓)
Degree: Master's
Institution: National Central University
Department: Department of Computer Science and Information Engineering
Discipline: Engineering
Academic field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2019
Graduating academic year: 108 (2019-2020)
Language: Chinese
Number of pages: 51
Keywords (Chinese): image object recognition, feature extraction, nearest feature line embedding
Cited by: 0
Views: 110
Downloads: 0
Abstract (Chinese, translated)
Technology advances rapidly: as computer hardware keeps improving, image recognition techniques have steadily improved as well, and for a computer, deciding whether a picture shows a dog or a cat is now a very simple task. However, achieving high recognition accuracy requires many conditions: a GPU with strong computing power, datasets of thousands to tens of thousands of samples, and time for training and parameter tuning. Now that artificial intelligence is becoming widespread, many different fields need machine learning and deep learning techniques to reach their goals, but when AI must be applied outside academia or in less popular domains, insufficient data is the first problem to surface. In addition, the machines owned by many industries are not equipped with GPUs whose computing performance can assist training.
The current mainstream high-accuracy image recognition methods are still mostly CNN-based architectures, which require strong GPU computing power and considerable training time to train successfully. Although PCANet combines traditional feature extraction with a neural-network-like architecture, there is still considerable room for improvement. This thesis adopts an architecture similar to PCANet, likewise designing the filters with traditional (non-learned) methods, but replaces the PCA step with the nearest feature line (NFL) strategy, whose key property is that it maintains very good accuracy when the amount of data is small. Using a PCANet-like architecture to analyze and process images, extracting the necessary features with NFL, and finally classifying the images with an SVM: these form the core of this thesis.
Analysis of the experimental results shows that when training with a small amount of data, roughly 500 to 1,000 samples, NFLENet achieves 5%-10% higher recognition accuracy than PCANet, and because less data is used, training time is also greatly reduced.
Abstract (English)
With the continuous improvement of computer hardware, image recognition technology is also constantly improving; for a computer, distinguishing whether a picture shows a dog or a cat is now a very simple task. However, achieving high-accuracy recognition requires many conditions: GPUs with strong computing power, thousands to tens of thousands of training samples, and time for training and parameter tuning. Nowadays, with the growing adoption of artificial intelligence, industries in many fields rely on machine learning and deep learning to reach their targets. However, when artificial intelligence must be applied outside academia or in relatively unpopular areas, the available data is often insufficient. In addition, the machines many industries own do not have GPUs with the computing performance needed to assist training.
At present, the mainstream high-accuracy image recognition methods are still CNN-based architectures, which require strong GPU computing power and considerable training time to train successfully. Although PCANet combines traditional feature extraction with a neural-network-like structure, there is still much room for improvement. This thesis uses an architecture similar to PCANet but replaces the PCA step with nearest feature line embedding (NFL), which maintains very good accuracy when the amount of data is small. The core of this thesis is to analyze and process images with a PCANet-like architecture, extract the necessary features with NFL, and classify the images with an SVM.
According to the experimental results, NFLENet obtains 5%-10% higher recognition accuracy than PCANet when the amount of training data is small (roughly 500 to 1,000 samples), and training time is greatly reduced because less data is used.
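The abstract only names the building blocks; the actual NFL formulation lives in Chapter 3 of the thesis and is not reproduced in this record. As a rough orientation, the sketch below shows the PCANet-style pipeline the thesis builds on (patch extraction with mean removal, filter-bank convolution, binary hashing, block-wise histogram), using PCA-derived filters as a stand-in for the NFL encoding step that NFLENet substitutes. All function names and the toy data are illustrative, not taken from the thesis.

```python
import numpy as np

def extract_patches(img, k):
    """Collect every overlapping k x k patch of a grayscale image as a
    flattened column vector, removing each patch's mean (as in PCANet)."""
    h, w = img.shape
    cols = []
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            p = img[i:i + k, j:j + k].ravel()
            cols.append(p - p.mean())
    return np.stack(cols, axis=1)          # shape: (k*k, number of patches)

def learn_pca_filters(images, k, n_filters):
    """PCANet-style filter learning: the leading eigenvectors of the patch
    covariance matrix become k x k convolution filters.  NFLENet replaces
    exactly this eigen-decomposition step with an NFL-based projection."""
    X = np.concatenate([extract_patches(im, k) for im in images], axis=1)
    cov = X @ X.T / X.shape[1]
    vals, vecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:n_filters]
    return [vecs[:, i].reshape(k, k) for i in order]

def convolve_valid(img, filt):
    """Plain 'valid'-mode 2-D correlation with one k x k filter."""
    k = filt.shape[0]
    h, w = img.shape
    out = np.empty((h - k + 1, w - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + k, j:j + k] * filt)
    return out

def binary_hash(maps):
    """Binary hashing: threshold each response map at zero and pack the
    bits into one integer-valued code map (values in [0, 2^L - 1])."""
    code = np.zeros_like(maps[0])
    for b, m in enumerate(maps):
        code += (m > 0) * (2 ** b)
    return code

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train = [rng.standard_normal((12, 12)) for _ in range(3)]  # toy "images"
    filters = learn_pca_filters(train, k=3, n_filters=4)
    maps = [convolve_valid(train[0], f) for f in filters]
    code = binary_hash(maps)               # one 10 x 10 integer code map
    # The histogram of code values (computed block-wise in PCANet) is the
    # feature vector that would finally be fed to an SVM classifier.
    feat, _ = np.histogram(code, bins=np.arange(2 ** len(filters) + 1))
    print(code.shape, feat.sum())          # (10, 10) 100
```

A second stage, as in both PCANet and NFLENet, simply repeats the filter learning and convolution on the first-stage response maps before hashing.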
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1  Introduction
 1.1  Research Motivation
 1.2  Research Objectives
 1.3  Thesis Organization
Chapter 2  Related Work
 2.1  Related Research
 2.2  Principal Component Analysis
 2.3  Convolution
 2.4  PCANet
 2.5  Linear Discriminant Analysis
 2.6  LDANet & RandNet
 2.7  Support Vector Machine
 2.8  Binary Hashing Encoding
Chapter 3  Research Method
 3.1  NFLENet Architecture
 3.2  Input Layer
 3.3  First Stage: First NFL Encoding Convolution
  3.3.1  Image Patch Extraction
  3.3.2  Nearest Feature Line (NFL) Encoding
  3.3.3  Convolution
 3.4  Second Stage: Second NFL Encoding Convolution
 3.5  Output Layer
  3.5.1  Binary Hashing Encoding & Block-wise Histogram
  3.5.2  Support Vector Machine (SVM)
Chapter 4  Experimental Results
 4.1  Experimental Environment and Datasets
  4.1.1  MNIST Handwritten Digit Dataset
  4.1.2  CIFAR-10 Object Recognition Dataset
  4.1.3  Extended Yale B Face Dataset
  4.1.4  PubFig Face Dataset
 4.2  Experiment Description
 4.3  Experimental Data
 4.4  Experimental Conclusions
Chapter 5  Conclusions and Future Work
References
[1] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei, "ImageNet Large Scale Visual Recognition Challenge," arXiv:1409.0575, 2015.
[2] A. Krizhevsky, I. Sutskever, and G. Hinton, "ImageNet classification with deep convolutional neural networks," in Proc. of the 25th International Conference on Neural Information Processing Systems (NIPS), vol. 1, 2012, pp. 1097-1105.
[3] Y. Guo, Y. Liu, A. Oerlemans, S. Lao, S. Wu, and M. S. Lew, "Deep Learning for Visual Understanding: A Review," Neurocomputing, 2015.
[4] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, "Mastering the game of Go with deep neural networks and tree search," Nature, vol. 529, pp. 484-489, 2016.
[5] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, "Caffe: Convolutional Architecture for Fast Feature Embedding," arXiv:1408.5093, 2014.
[6] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A Large-Scale Hierarchical Image Database," in IEEE Computer Vision and Pattern Recognition, 2009.
[7] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going Deeper with Convolutions," in IEEE Computer Vision and Pattern Recognition, 2015.
[8] K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," in Proc. of the International Conference on Learning Representations (ICLR), 2015.
[9] J. Redmon and A. Farhadi, "YOLO9000: Better, Faster, Stronger," in Proc. of the 30th IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[10] K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," in Proc. of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[11] T.-H. Chan, K. Jia, S. Gao, J. Lu, Z. Zeng, and Y. Ma, "PCANet: A Simple Deep Learning Baseline for Image Classification?," IEEE Trans. on Image Processing, vol. 24, no. 12, pp. 5017-5032, 2015.
[12] J. Shlens, "A Tutorial on Principal Component Analysis," arXiv:1404.1100, 2014.
[13] C.-W. Hsu, C.-C. Chang, and C.-J. Lin, "A Practical Guide to Support Vector Classification," Technical report, Department of Computer Science, National Taiwan University, July 2003.
[14] K. Lin, H.-F. Yang, J.-H. Hsiao, and C.-S. Chen, "Deep Learning of Binary Hash Codes for Fast Image Retrieval," in IEEE Computer Vision and Pattern Recognition, 2015.
[15] N. Dalal and B. Triggs, "Histograms of Oriented Gradients for Human Detection," in IEEE Computer Vision and Pattern Recognition, 2005.
[16] Y.-K. Ge, J.-N. Hu, and W.-H. Deng, "PCA-LDANet: A Simple Feature Learning Method for Image Classification," in Proc. of the Asian Conference on Pattern Recognition (ACPR), 2017, doi:10.1109/ACPR.2017.36.
[17] Z.-Y. Feng, L.-W. Jin, D.-P. Tao, and S.-P. Huang, "DLANet: A manifold-learning-based discriminative feature learning network for scene classification," Neurocomputing, 2015.
[18] Y.-N. Chen, C.-T. Hsieh, M.-G. Wen, C.-C. Han, and K.-C. Fan, "Hyperspectral Image Classification Using a General NFLE Transformation with Kernelization and Fuzzification," Remote Sensing, vol. 7, no. 11, pp. 14292-14326, 2015.
[19] A. Tharwat, A. Ibrahim, T. Gaber, and A. E. Hassanien, "Linear discriminant analysis: A detailed tutorial," AI Communications, vol. 30, no. 2, pp. 169-190, May 2017.
[20] Y. Sun, X. Wang, and X. Tang, "Hybrid Deep Learning for Face Verification," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 38, no. 10, pp. 1997-2009, 2016.
[21] X. Yin and X. Liu, "Multi-Task Convolutional Neural Network for Pose-Invariant Face Recognition," IEEE Trans. on Image Processing, October 2017.
[22] G. B. de Souza, D. F. da Silva Santos, R. G. Pires, A. N. Marana, and J. P. Papa, "Deep Texture Features for Robust Face Spoofing Detection," IEEE Trans. on Circuits and Systems II: Express Briefs, October 2017.
[23] N. Y. Almudhahka, M. S. Nixon, and J. S. Hare, "Semantic Face Signatures: Recognizing and Retrieving Faces by Verbal Descriptions," IEEE Trans. on Information Forensics and Security, October 2017.
[24] A. S. Georghiades, P. N. Belhumeur, and D. J. Kriegman, "From Few to Many: Illumination Cone Models for Face Recognition under Variable Lighting and Pose," IEEE Trans. on Pattern Analysis and Machine Intelligence, 2001.
[25] K.-C. Lee, J. Ho, and D. J. Kriegman, "Acquiring Linear Subspaces for Face Recognition under Variable Lighting," IEEE Trans. on Pattern Analysis and Machine Intelligence, 2005.
[26] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-Based Learning Applied to Document Recognition," Proc. of the IEEE, November 1998.
[27] A. Krizhevsky, "Convolutional Deep Belief Networks on CIFAR-10," August 2010. Retrieved from https://www.cs.toronto.edu/~kriz/conv-cifar10-aug2010.pdf
[28] K. Jarrett, K. Kavukcuoglu, M. Ranzato, and Y. LeCun, "What is the best multi-stage architecture for object recognition?," in ICCV, 2009.
[29] J. Wu, S. Qiu, R. Zeng, Y. Kong, L. Senhadji, and H. Shu, "Multilinear Principal Component Analysis Network for Tensor Object Classification," 2017.
[30] N. Kumar, A. C. Berg, P. N. Belhumeur, and S. K. Nayar, "Attribute and Simile Classifiers for Face Verification," in International Conference on Computer Vision (ICCV), 2009.
[31] C.-T. Hsieh (謝正達), "應用特徵線為基礎之度量學習框架於身份識別" (A feature-line-based metric learning framework for person identification), Ph.D. dissertation, Department of Computer Science and Information Engineering, National Central University, 2017.
Electronic full text (Internet public release date: 2024-12-23)