Author: 王俊焜
Author (English): Jun-Kun Wang
Title: 應用於影像分類的整體學習與轉移學習之新方法
Title (English): New approaches of ensemble learning and transfer learning for image classification
Advisor: 鄭士康
Committee members: 林守德, 楊奕軒
Oral defense date: 2013-06-06
Degree: Master's
Institution: 國立臺灣大學 (National Taiwan University)
Department: 電信工程學研究所 (Graduate Institute of Communication Engineering)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Document type: Academic thesis
Publication year: 2013
Academic year of graduation: 101 (2012-2013)
Language: English
Pages: 59
Keywords (Chinese): 影像分類, 機器學習, 整體學習, 轉移學習, 字典學習
Keywords (English): image classification, machine learning, ensemble learning, transfer learning, dictionary learning
This master's thesis develops machine learning methods useful for image classification, proposing a new approach in each of two areas: ensemble learning and transfer learning. For ensemble learning, we use a bag of SVMs to recognize indoor scenes; the method is conceptually simple and easy to implement. We show that valuable windows are important to indoor scene recognition, an image cue that, to our knowledge, was previously overlooked. For transfer learning, we open a new research direction: finding an encoding mechanism that all tasks can share. Previous related work concerned how task relatedness can influence the classifier during training; our contribution is the complementary direction of using task relatedness to influence the features. We propose a supervised hierarchical dictionary learning framework that performs transfer learning for multiclass classification. The framework operates by passing encoded features as messages between blocks, and the classifiers and dictionaries communicate through these messages. In our algorithm, the forward and backward updates can be viewed as finding a balance between classification and encoding.

This thesis develops machine learning approaches for image classification. Specifically, we consider two paradigms in machine learning, namely ensemble learning and transfer learning. In ensemble learning, we use a bag of SVMs for indoor scene recognition, a method that is simple and easily implemented. We show that valuable local windows are critical to scene recognition; to our knowledge, this image cue was previously ignored by the computer vision community. In transfer learning, we propose a new research direction: finding an encoding mechanism that all the tasks share. What the related works have in common is that they all deal with how task relatedness can affect the model during training. Our contribution is to provide another direction that deals with how task relatedness can affect the features. We propose a supervised hierarchical dictionary learning structure that works for multiclass classification with transfer learning. The whole architecture operates by using the encoded features of the training data as messages between different blocks; the models and the dictionaries communicate by passing these messages. The forward and backward updates in our algorithm can be viewed as finding a balance between classification and encoding.
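The ensemble-learning idea above (training several classifiers on bootstrap samples and predicting by majority vote) can be sketched in a few lines. The sketch below is illustrative only: a trivial nearest-centroid classifier stands in for the thesis's SVMs, and the 2-D toy data is hypothetical rather than real image-window descriptors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: two Gaussian classes standing in for
# per-window image descriptors.
X0 = rng.normal(loc=-1.0, size=(50, 2))
X1 = rng.normal(loc=+1.0, size=(50, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)

def train_centroid(Xb, yb):
    """Trivial linear classifier (nearest class centroid), a stand-in
    for each SVM in the bag."""
    return Xb[yb == 0].mean(axis=0), Xb[yb == 1].mean(axis=0)

def predict(model, X):
    c0, c1 = model
    d0 = np.linalg.norm(X - c0, axis=1)
    d1 = np.linalg.norm(X - c1, axis=1)
    return (d1 < d0).astype(int)

# Bagging: fit each classifier on a bootstrap sample of the training set.
models = []
for _ in range(11):
    idx = rng.integers(0, len(X), size=len(X))
    models.append(train_centroid(X[idx], y[idx]))

# Prediction is a majority vote over the bag (odd size avoids ties).
votes = np.stack([predict(m, X) for m in models])
y_hat = (votes.mean(axis=0) > 0.5).astype(int)
print("training accuracy:", (y_hat == y).mean())
```

An odd number of voters is used so the majority vote is never tied; in the thesis the voters are SVMs trained at the window level rather than centroid classifiers.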
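The forward/backward alternation described above can likewise be illustrated. The sketch below is a loose, hypothetical analogue rather than the thesis's formulation: ridge-regularized least squares replaces both the actual encoding objective and the actual classifier loss, and the data, dimensions, and regularizer are invented.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy problem: n samples of dimension d, a dictionary with k atoms,
# binary labels that depend on the first input coordinate.
n, d, k = 120, 10, 5
X = rng.normal(size=(n, d))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=n))

D = rng.normal(size=(d, k))   # dictionary (d x k)
lam = 0.1

for _ in range(10):
    # Forward step: encode each sample on the current dictionary
    # (ridge regression: a = (D^T D + lam I)^{-1} D^T x).
    A = X @ D @ np.linalg.inv(D.T @ D + lam * np.eye(k))  # codes (n x k)
    # Backward step: refit a linear classifier on the codes, then update
    # the dictionary to better reconstruct X from the codes.
    w = np.linalg.solve(A.T @ A + lam * np.eye(k), A.T @ y)
    D = np.linalg.solve(A.T @ A + lam * np.eye(k), A.T @ X).T

acc = (np.sign(A @ w) == y).mean()
print("training accuracy:", acc)
```

The point of the sketch is the message-passing pattern: the codes `A` are the only quantity the classifier and the dictionary see of each other, mirroring how encoded features act as messages between blocks in the proposed architecture.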

Thesis Committee Certification i
Acknowledgements ii
Chinese Abstract iii
Abstract iv
Contents v
List of Figures viii
List of Tables x
1 Introduction 1
1.1 Motivations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2 Background and related works for image classification 5
2.1 Background for the general image classification . . . . . . . . . . . . . . 5
2.2 Introduction to SVM and bagging . . . . . . . . . . . . . . . . . . . . . 8
2.2.1 Notation rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.2 Introduction to SVM . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.3 Introduction to bagging . . . . . . . . . . . . . . . . . . . . . . . 12
2.3 Background and Related works for indoor scene recognition . . . . . . . 12
2.3.1 Object detection and deformable part based model . . . . . . . . 12
2.3.2 Objectness measure . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3.3 Related works for indoor scene recognition . . . . . . . . . . . . 14
3 Background and related works for transfer learning 17
3.1 Designing regularization term . . . . . . . . . . . . . . . . . . . . . . . 17
3.1.1 L1,2 norm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.1.2 L1,1 norm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.1.3 More complex regularization term . . . . . . . . . . . . . . . . . 18
3.2 Optimization approach for transfer learning . . . . . . . . . . . . . . . . 19
3.2.1 Additional models from common latent subspace . . . . . . . . . 20
3.2.2 Dirty model in transfer learning . . . . . . . . . . . . . . . . . . 20
3.2.3 Learning task grouping and overlap in multi-task learning . . . . 21
3.3 Bayesian approach for transfer learning . . . . . . . . . . . . . . . . . . 23
4 The window-based voting approach for indoor scene recognition 24
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
4.2 Window-based approach . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.2.1 Training stage at window level . . . . . . . . . . . . . . . . . . . 26
4.2.2 Testing stage at window level . . . . . . . . . . . . . . . . . . . 26
4.2.3 Extracting valuable windows . . . . . . . . . . . . . . . . . . . . 27
4.3 Image-based approach . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.4 Combination of models from window and global level . . . . . . . . . . 29
5 Transfer learning with supervised hierarchical dictionary learning structure 30
5.1 Observations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
5.2 Background of encoding and supervised dictionary learning . . . . . . . . 31
5.2.1 Background of encoding . . . . . . . . . . . . . . . . . . . . . . 31
5.2.2 Background of supervised dictionary learning . . . . . . . . . . . 32
5.3 Our approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.3.1 Whole picture of our approach . . . . . . . . . . . . . . . . . . . 32
5.3.2 Formulation and optimization details of our approach . . . . . . . 33
5.3.3 Initialization and implementation detail . . . . . . . . . . . . . . 38
6 Experiments for indoor scene recognition 40
6.1 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
6.1.1 Number of SVMs in bagging for window-based and image-based modeling . . . 40
6.1.2 Number of testing windows extracted during voting for window-based modeling . . . 42
6.1.3 Combining information from local and global level . . . . . . . . 42
6.1.4 Importance of extracting valuable windows . . . . . . . . . . . . 45
6.2 Discussions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
7 Experiments for transfer learning 49
7.1 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
7.2 Baselines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
7.3 Experiment Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
7.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
8 Conclusion 53
Bibliography 55

