

Author (English): Yi-Ting (Evelyn) Tsai
Title (English): Cloud-Based Mobile Platform for Chinese Food Identification and Menu Character Recognition
Keywords (English): food recognition; feature descriptors; classification; OCR; cloud computing; menu character recognition
  • Cited by: 2
  • Views: 413
  • Downloads: 0
  • Bookmarked: 0
外國旅客來到亞洲地區國家時,第一重要的事情就是「吃」。外國旅客無法馬上辨認出食物,也看不懂菜單上的文字,而這會使旅客感到困惑。近年來因健康意識的抬頭以及飲食與營養的密切關係,使得食物辨識技術受到越來越多的關注。而隨著智慧型手機或平板電腦等個人手持裝置的普及,將食物辨識系統放入手機應用程式中是解決上述問題的方案。本論文設計與開發出一套可透過擷取特徵點辨識食物及辨別中文菜單文字的系統。食物辨識利用Gabor濾波器、顏色、尺度不變特徵轉換(SIFT)與局部二元模式(LBP)結合稀疏編碼(Sparse Coding)四種特徵法來描述一種特定的食物,再對每種特徵建立支持向量機分類器(SVM classifier),並結合自適應增強技術(AdaBoost)整合所有弱分類器,形成強分類器。我們建立了一個含有67種食物的資料庫,每種食物有100張從網路與相簿收集的圖片。另外,我們利用Google的Tesseract光學字元辨識(OCR)來整合菜單辨識。由於中文菜名主要以烹飪方式、主料、配料和刀法來命名,主要的成分通常放在名稱後面,我們建立了一個由繁體中文食物名稱構成的語義網絡語言模型。為了提升準確性,我們依中文菜單命名法給予菜名中每個字元不同的評分比重,使結果能顯示相同類型的菜色。因此我們建立了一個含有123種以上菜名的資料庫,記錄每個字元的比重與字元之間的關聯性。除此之外,系統會提供中英文的菜色介紹,包括起源、食材、烹調方式及營養成分分析。使用者測試結果顯示,食物辨識系統的運算速度比Google Image快兩倍,菜單辨識系統比Google Translate快一倍(t = 2.45)。食物辨識與菜單辨識的使用者滿意度分別為80.45%與83.41%。最後,本文利用雲端運算與平行運算來加速食物辨識部分的計算時間。

Foreigners in the Chinese-speaking countries of Asia often face a problem when eating out: they can neither recognize the foods nor read the menus that describe them, leaving first-time visitors confused about what to eat and what they are eating. Food recognition is a research topic that has received increasing attention due to rising health awareness and the close relationship between diet and nutrition. A computer-aided tool for food recognition that lets people know what they are eating is a proper solution to this issue. Here we propose a system that identifies a food item by its characteristic features and can also recognize Chinese words on menus. It uses SIFT and local binary patterns with sparse coding, together with Gabor and color features, as descriptors for a particular food item. An SVM classifier is trained for each feature, and the AdaBoost algorithm evaluates each feature descriptor of a food item and assigns it a weight, combining the weak classifiers into a strong one. A data bank of pictures of 67 food items, with at least 100 images per item collected from internet searches or direct photographs of our meals, is constructed for system testing. Menu character recognition is achieved with Google's Tesseract optical character recognition (OCR) engine, which extracts text from images and compares it against the names of food items in the data bank to find the best match. Chinese dishes are often named according to the style of cooking, the primary and secondary ingredients, and other descriptive words; in addition, the main ingredient is often placed at the end of a name. Therefore, we also incorporate a language model composed of a semantic network of food names in Traditional Chinese. To increase accuracy, we assign differing weights to each character in a dish name according to the naming pattern of Chinese dishes.
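The AdaBoost fusion of per-feature weak classifiers described above can be sketched as follows. This is a minimal illustration of the standard AdaBoost voting rule, not the thesis's actual implementation; the example decisions and error rates are placeholders, and the four votes are imagined as coming from the SIFT, LBP, color, and Gabor SVMs.

```python
import math

def adaboost_alpha(error):
    """Standard AdaBoost weight for a weak classifier with the given
    training error in (0, 0.5]: lower error yields a larger vote."""
    return 0.5 * math.log((1.0 - error) / error)

def strong_classify(weak_decisions, errors):
    """Combine per-feature weak decisions (+1/-1) into a strong decision.

    weak_decisions: outputs of the per-feature SVMs for one image,
                    e.g. the [SIFT, LBP, color, Gabor] votes.
    errors: training error rate of each weak classifier.
    """
    total = sum(adaboost_alpha(e) * h
                for h, e in zip(weak_decisions, errors))
    return 1 if total >= 0 else -1
```

For instance, `strong_classify([1, -1, 1, 1], [0.2, 0.4, 0.3, 0.35])` yields `1`: the three more accurate classifiers outvote the dissenting one because their alphas are larger.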
Our data bank contains more than 123 dish names, with differing weights and semantic relations between the characters in each name. Along with the recognized food image or dish name, the system also shows the food's ingredients, nutrition information (calories, vitamins, lipids), and cooking style, in both English and Chinese. The results show that computation time for both food and menu recognition is two to three times shorter with our system than with Google Image and Google Translate (t = 2.45). Overall user satisfaction is 80.45% for the food recognition system and 83.41% for the menu recognition system. Finally, the system uses cloud computing and parallel computing to accelerate computation on a mobile platform.
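The weighted matching of OCR output against the dish-name data bank can be sketched as below. The dish names, the per-character weights, and the two-entry data bank are purely illustrative (the real data bank holds over 123 names); the only idea taken from the text is that trailing characters, which usually name the main ingredient, may carry more weight than leading ones.

```python
def match_score(ocr_text, dish_name, weights):
    """Weighted fraction of a dish name's characters found in the OCR text.

    weights[i] is the importance of dish_name[i]; characters near the end
    of a Chinese dish name (often the main ingredient) can be weighted
    more heavily than the leading cooking-style characters.
    """
    hit = sum(w for ch, w in zip(dish_name, weights) if ch in ocr_text)
    return hit / sum(weights)

def best_match(ocr_text, data_bank):
    """Return the dish name in the data bank that best matches the OCR text.

    data_bank is a list of (dish_name, weights) pairs.
    """
    return max(data_bank, key=lambda entry: match_score(ocr_text, *entry))[0]
```

With a toy data bank `[("宮保雞丁", [1, 1, 2, 2]), ("糖醋排骨", [1, 1, 2, 2])]`, an OCR reading of "宮保鷄丁" (a variant glyph for 雞) still selects 宮保雞丁, because three of its four weighted characters are found in the text.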

Acknowledgements 3
Chinese Abstract 4
Abstract 5
List of Figures 9
List of Tables 10
Chapter 1. Introduction 11
1.1 Background and Motivation 11
1.2 Proposed Solution 12
1.3 Contribution 13
1.4 Thesis Organization 14
Chapter 2. Related Work 15
2.1 Food Recognition 15
2.2 Character Recognition 18
2.2.1. Rules of Naming Chinese Menu 18
2.2.2. Optical Character Recognition 18
2.2.3. Tesseract OCR 19
Chapter 3. Framework 23
Chapter 4. Food Identification 25
4.1 Data Collection 25
4.2 Feature Extraction 27
4.2.1. SIFT with Sparse Coding 27
4.2.2. Local Binary Patterns with Multi-resolution Sparse Coding 29
4.2.3. Color Histograms 30
4.2.4. Gabor Texture 31
4.3 Multi-class Classification 32
Chapter 5. Supplementation 35
5.1 Chinese Character Recognition 35
5.1.1. Image Pre-Processing 35
5.1.2. Optical Character Recognition 36
5.2 Cloud Computing 37
5.2.1. System Architecture 37
Chapter 6. Experiment 38
6.1 User Interface 38
6.2 Food Recognition 42
User Study 1: Evaluation of System Performance 44
User Study 2: A Comparison of Food Search by Brute Force and by Our System 47
6.3 Menu Recognition 49
User Study 1: Evaluation of System Performance 50
User Study 2: A Comparison of Food Search by Brute Force and by Our System 52
6.4 Parallel Computing 55
Chapter 7. Conclusion 56
Chapter 8. Bibliography 58

U.S. DEPARTMENT OF AGRICULTURE, 2012. USDA ChooseMyPlate.gov.
BISSACCO, A., CUMMINS, M., NETZER, Y. AND NEVEN, H., 2013. PhotoOCR: Reading Text in Uncontrolled Conditions. 2013 IEEE International Conference on Computer Vision, pp.785–792.
BOSCH, M., ZHU, F., KHANNA, N., BOUSHEY, C.J. AND DELP, E.J., 2011. Combining Global And Local Features For Food Identification In Dietary Assessment. IEEE, pp.1789–1792.
BREUEL, T.M., 2008. The OCRopus Open Source OCR System. International Society for Optics and Photonics, 6815, p.68150F–68150F.
CHAKRAVARTI, R. AND MENG, X., 2009. A Study of Color Histogram Based Image Retrieval. 2009 Sixth International Conference on Information Technology: New Generations, pp.1323–1328.
CHEN, M.-Y. ET AL., 2012. Automatic Chinese food identification and quantity estimation. SIGGRAPH Asia 2012 Technical Briefs on - SA ’12, pp.1–4.
FAHMY, M., 2009. Travel Picks: 10 of world’s most unusual foods. Reuters.
FREUND, Y. AND SCHAPIRE, R.E., 1995. A Decision-Theoretic Generalization of on-Line Learning and an Application to Boosting. Computational learning theory, pp.23–37.
HART, S.G. AND STAVELAND, L.E., 1988. Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. Human Mental Workload, pp.139–183.
HOASHI, H., JOUTOU, T. AND YANAI, K., 2010. Image Recognition of 85 Food Categories by Feature Fusion. 2010 IEEE International Symposium on Multimedia, pp.296–301.
JAIN, A.K. AND VAILAYA, A., 1995. Image Retrieval using Color and Shape. Pattern Recognition, 29(8), pp.1233–1244.
KITAMURA, K., SILVA, C. DE, YAMASAKI, T. AND AIZAWA, K., 2010. Image Processing Based Approach To Food Balance Analysis For Personal Food Logging. IEEE, pp.625–630.
KITAMURA, K., YAMASAKI, T. AND AIZAWA, K., 2008. Food log by analyzing food images. Proceeding of the 16th ACM international conference on Multimedia - MM ’08, p.999.
LIKERT, R., 1932. A Technique for the Measurement of Attitudes. Archives of Psychology, 140, pp.1–55.
LOWE, D.G., 2004. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 60(2), pp.91–110.
OJALA, T., PIETIKAINEN, M. AND MAENPAA, T., 2002. Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(7), pp.971–987.
RICE, S.V., JENKINS, F.R. AND NARTKER, T., 1995. The Fourth Annual Test of OCR Accuracy. Technical Report, 3.
ROSA, L., 2003. Variance of Vector Elements.
SHAPIRO, S.C. AND RAPAPORT, W.J., 1992. The SNePS family. Computers & Mathematics with Applications, 23, pp.243–275.
SMITH, R., 2007. An Overview of the Tesseract OCR Engine. Ninth International Conference on Document Analysis and Recognition (ICDAR 2007) Vol 2, pp.629–633.
SMITH, R., 1987. Computer Processing of Line Images: A Survey. Pattern Recognition, 20, pp.7–15.
SMITH, R., ANTONOVA, D. AND LEE, D., 2009. Adapting the Tesseract Open Source OCR Engine for Multilingual OCR. MOCR: Proceedings of the International Worksop on Multilingual OCR, pp.1–8.
YANAI, K. AND JOUTOU, T., 2008. A Food Image Recognition System With Multiple Kernel Learning. IEEE Image Processing (ICIP), pp.285–288.
YANG, J., YU, K., GONG, Y. AND HUANG, T., 2009. Linear Spatial Pyramid Matching Using Sparse Coding for Image Classification. 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp.1794–1801.
YANG, S. (LYNN), CHEN, M., POMERLEAU, D. AND SUKTHANKAR, R., 2010. Food recognition using statistics of pairwise local features. 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.2249–2256.
ZHU, F., BOSCH, M., KHANNA, N., BOUSHEY, C.J. AND DELP, E.J., 2011. Multilevel Segmentation for Food Classification in Dietary Assessment. Proc Int Symp Image Signal Process Anal, pp.337–342.
ZHU, J., ZOU, H., ROSSET, S. AND HASTIE, T., 2009. Multi-class AdaBoost. Statistics and Its Interface, 2(3), pp.349–360.
周桂英, 2008. 中国菜的命名理据及翻译策略 [Naming rationale and translation strategies for Chinese dishes]. 郑州航空工业管理学院学报(社会科学版) [Journal of Zhengzhou Institute of Aeronautical Industry Management (Social Science Edition)], pp.117–118.
