National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

Author: 孫梓鈞 (Tzu-Chun Sun)
Title: 使用深度CNN辨識蔬果 (Fruit Recognition Using Deep Convolution Neural Network)
Advisor: 石勝文 (Sheng-Wen Shin)
Committee members: 洪政欣 (Cheng-Hsin Hung), 王家輝 (Chia-Hui Wang), 顧金福 (Chin-Fu Ku)
Oral defense date: 2014-07-25
Degree: Master's
Institution: National Chi Nan University (國立暨南國際大學)
Department: Department of Computer Science and Information Engineering (資訊工程學系)
Discipline: Engineering
Field: Electrical and Computer Engineering
Document type: Academic thesis
Year of publication: 2014
Academic year of graduation: 102
Language: Chinese
Pages: 42
Chinese keywords: 蔬果辨識 (fruit and vegetable recognition), 深度學習 (deep learning), 神經網路 (neural network), 卷積 (convolution)
English keywords: Fruit Recognition, Deep Learning, Neural Network, Convolution
Usage counts:
  • Cited: 10
  • Views: 3558
  • Rating: (none)
  • Downloads: 502
  • Bookmarked: 1
This thesis develops a fruit recognition method that can shorten supermarket checkout times and thus improve everyday convenience. Existing fruit recognition methods rely on handcrafted image features, such as the texture, color, and shape of a fruit. However, features extracted with specific handcrafted algorithms do not necessarily provide enough information for reliable recognition. In this work, we use a deep convolution neural network (DCNN) to learn discriminative fruit features automatically. To achieve high recognition accuracy, we tested many DCNN configurations: networks of different depths and with different numbers of nodes per layer were trained and tested to determine the best structure. To evaluate the DCNN method, we built a fruit imaging device and collected an image database of 15 fruit classes: big Fuji apples, small Fuji apples, Washington apples, Granny Smith apples, papayas, Hami melons, muskmelons, guavas, bananas, Sunkist oranges, grapefruits, wax apples, lemons, peaches, and kiwis. The database is divided into five equal parts; four are used for training and the remaining one for testing. We also implemented two existing fruit recognition methods as baselines. Experimental results show that the DCNN method outperforms both baselines, reaching a recognition accuracy of 92.91%.
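The five-part split protocol described in the abstract can be sketched as follows. This is a minimal illustration only, not the author's actual code (the thesis used Caffe [24]); the sample IDs are hypothetical placeholders standing in for fruit images, and the seed and fold-assignment scheme are assumptions.

```python
import random

def five_fold_split(samples, test_fold, seed=0):
    """Shuffle the samples and return (train, test), holding out fold `test_fold`.

    Four of the five folds form the training set; the remaining fold is the
    test set, matching the protocol described in the abstract.
    """
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = samples[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::5] for i in range(5)]   # five equal (or near-equal) parts
    test = folds[test_fold]
    train = [s for i, fold in enumerate(folds) if i != test_fold for s in fold]
    return train, test

# Placeholder sample IDs; real samples would be the captured fruit images.
samples = [f"img_{i:03d}" for i in range(100)]
train, test = five_fold_split(samples, test_fold=0)
print(len(train), len(test))  # 80 20
```

Rotating `test_fold` through 0 to 4 would reproduce the five runs whose accuracies are averaged in Table 3.4.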
Acknowledgements .................................................................. i
Abstract (Chinese) ................................................................ ii
Abstract ......................................................................... iii
Contents .......................................................................... iv
List of Figures ................................................................... vi
1.1 The supermarket scanner recognition machine (image source: [1]) ........ 2
1.2 Flowchart integrating existing methods ................................. 3
1.3 Image acquisition setups in the literature (image sources: [2], [3]) ... 4
1.4 The Supermarket Produce database (image source: [4]) ................... 5
1.5 LBP feature maps (image source: [2]) ................................... 7
2.1 A single neuron ........................................................ 12
2.2 The sigmoid function, whose output range is [0, 1] ..................... 13
2.3 Neural network model. The leftmost layer is the input layer and the rightmost layer is the output layer. The middle layer of nodes is called the hidden layer because its output values cannot be observed directly. This example has 3 input units (the bias unit is not counted), 3 hidden units, and 1 output unit ... 14
2.4 The neural network training process .................................... 17
2.5 Architecture mimicking the organization of the cerebral cortex (image source: [5]) ... 19
2.6 Results of Gabor filtering (image source: [5]) ......................... 21
2.7 Stacking features of four different orientations to form the output (image source: [5]) ... 21
2.8 Illustration of max pooling, which provides translation invariance, preserves the original image information, and improves scale invariance (image source: [5]) ... 22
3.1 Exterior of the fruit imaging device ................................... 24
3.2 Interior of the fruit imaging device ................................... 24
3.3 The 15 fruit classes in the test database .............................. 25
3.4 20 views of one big Fuji apple in the database ......................... 26
3.5 20 views of one Hami melon in the database ............................. 26
3.6 Input image ............................................................ 29
3.7 First-layer filters .................................................... 30
3.8 First-layer output features ............................................ 30
3.9 Second-layer filters ................................................... 31
3.10 Second-layer output features .......................................... 31
3.11 Third-layer output features ........................................... 32
3.12 Fourth-layer output features .......................................... 32
3.13 Fifth-layer output features ........................................... 33
3.14 Big Fuji apples misclassified as small Fuji apples; the top three images are big Fuji apples ... 35
3.15 Bananas misclassified as lemons ....................................... 36
List of Tables .................................................................. viii
1.1 Image acquisition in related fruit and vegetable recognition studies ... 4
1.2 Feature extraction in related studies .................................. 8
1.3 Classification techniques in related studies ........................... 9
3.1 Numbers of nodes per layer for CNNs of different depths ................ 27
3.2 Effect of CNN depth on test accuracy ................................... 27
3.3 Number of output feature nodes in each layer ........................... 28
3.4 Test accuracies of five experimental runs and their average ............ 28
3.5 Fruit class represented by each numeric label .......................... 34
3.6 Confusion matrix; numeric labels are as defined in Table 3.5 ........... 34
3.7 Comparison of the deep CNN with other methods .......................... 37
Chapter 1 Introduction ............................................................ 1
1.1 Motivation ............................................................. 1
1.2 Related Work ........................................................... 3
1.2.1 Image Acquisition .................................................... 3
1.2.2 Image Segmentation ................................................... 4
1.2.3 Feature Extraction ................................................... 6
1.2.4 Image Recognition .................................................... 8
1.3 Objectives ............................................................. 9
1.4 Thesis Organization .................................................... 10
Chapter 2 Methods ................................................................. 11
2.1 Deep Learning .......................................................... 11
2.1.1 On the "Depth" in Deep Learning ...................................... 11
2.2 Neural Networks ........................................................ 12
2.2.1 Neural Network Model ................................................. 14
2.2.2 Backpropagation Algorithm ............................................ 16
2.3 Differences Between Traditional and Convolutional Neural Networks ...... 18
2.4 Cortex-like Mechanisms in Convolutional Neural Networks ................ 19
Chapter 3 Experimental Results .................................................... 23
3.1 Experimental Platform .................................................. 23
3.2 Database ............................................................... 25
3.3 Experiments ............................................................ 27
3.3.1 Optimal Caffe Parameter Settings ..................................... 27
3.3.2 Features Extracted by the Deep CNN ................................... 29
3.3.3 Deep CNN Recognition Results ......................................... 34
3.3.4 Comparison of the Deep CNN with Other Methods ........................ 37
Chapter 4 Conclusions and Future Work ............................................. 38
4.1 Conclusions ............................................................ 38
4.2 Future Work ............................................................ 39
References ........................................................................ 40

[1] http://www.diginfo.tv/v/12-0033-r-en.php.
[2] H.-D. Chang and Y.-H. Wu, “An implementation of a real-time fruit recognition system,” in Proceedings of the International Conference on Advanced Information Technologies, pp. 1–8, 2009.
[3] Y. Zhang and L. Wu, “Classification of fruits using computer vision and a multiclass support vector machine,” Sensors, pp. 12489–12505, 2012.
[4] A. Rocha, D. C. Hauagge, J. Wainer, and S. Goldenstein, “Automatic produce classification from images using color, texture and appearance cues,” in Proceedings of the XXI Brazilian Symposium on Computer Graphics and Image Processing, pp. 3–10, 2008.
[5] T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber, and T. Poggio, “Robust object recognition with cortex-like mechanisms,” IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 411–426, 2007.
[6] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision. Cambridge University Press, ISBN: 0521540518, second ed., 2004.
[7] S. Arivazhagan, R. N. Shebiah, S. S. Nidhyanandhan, and L. Ganesan, “Fruit recognition using color and texture features,” Journal of Emerging Trends in Computing and Information Sciences, pp. 90–94, 2010.
[8] C. S. Woo and H. M. Seyed, “A new method for fruits recognition system,” in Proceedings of the ICEEI International Conference on Electrical Engineering and Informatics, pp. 130–134, 2009.
[9] C. Pornpanomchai, K. Srikeaw, V. Harnprasert, and K. Promnurakkit, “Thai fruit recognition system (TFRS),” in Proceedings of the First International Conference on Internet Multimedia Computing and Service, pp. 108–112, 2009.
[10] R. M. Bolle, J. Connell, N. Haas, R. Mohan, and G. Taubin, “VeggieVision: A produce recognition system,” in Proceedings of the IEEE Workshop on Automatic Identification Advanced Technologies, pp. 35–38, 1997.
[11] J. Zhao, J. Tow, and J. Katupitiya, “On-tree fruit recognition using texture properties and color data,” in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3993–3998, 2005.
[12] M. Lak, S. Minaei, J. Amiriparian, and B. Beheshti, “Apple fruits recognition under natural luminance using machine vision,” Advance Journal of Food Science and Technology, pp. 325–327, 2010.
[13] H. N. Patel, R. Jain, and M. Joshi, “Fruit detection using improved multiple features based algorithm,” International Journal of Computer Applications, pp. 1–5, 2011.
[14] A. Rocha, D. C. Hauagge, J. Wainer, and S. Goldenstein, “Automatic fruit and vegetable classification from images,” Computers and Electronics in Agriculture, pp. 96–104, 2010.
[15] R. C. Gonzalez and R. E. Woods, Digital Image Processing. Prentice Hall, 2002.
[16] T. Ojala, M. Pietikainen, and D. Harwood, “Performance evaluation of texture measures with classification based on Kullback discrimination of distributions,” in Proceedings of the 12th IAPR International Conference on Computer Vision and Pattern Recognition, pp. 582–585, 1994.
[17] M. Unser, “Texture classification and segmentation using wavelet frames,” IEEE Transactions on Image Processing, pp. 1549–1560, 1995.
[18] H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng, “Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations,” in Proceedings of the 26th Annual International Conference on Machine Learning, pp. 609–616, 2009.
[19] K. Fukushima and S. Miyake, “Neocognitron: A new algorithm for pattern recognition tolerant of deformations and shifts in position,” Pattern Recognition, pp. 455–469, 1982.
[20] S. Lawrence, C. L. Giles, A. C. Tsoi, and A. D. Back, “Face recognition: A convolutional neural-network approach,” IEEE Transactions on Neural Networks, pp. 98–113, 1997.
[21] D. Gabor, “Theory of communication,” Journal of the Institution of Electrical Engineers, pp. 429–459, 1946.
[22] J. Jones and L. Palmer, “An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex,” Journal of Neurophysiology, pp. 1233–1258, 1987.
[23] D. Hubel and T. Wiesel, “Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex,” Journal of Physiology, pp. 106–154, 1962.
[24] Y. Jia, “Caffe: An open source convolutional architecture for fast feature embedding.” http://caffe.berkeleyvision.org/, 2013.
[25] N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 886–893, 2005.