臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)


Detailed Record

Author: 楊凱崴 (Yang, Kai-Wei)
Title: 基於度量之元學習於少樣本問題之研究
Title (English): Novel Metric-based Meta Learning Algorithms for Few-shot Learning
Advisor: 劉建良 (Liu, Chien-Liang)
Committee members: 陳勝一 (Chen, Sheng-I); 巫木誠 (Wu, Muh-Cherng)
Oral defense date: 2019-08-05
Degree: Master's
Institution: 國立交通大學 (National Chiao Tung University)
Department: 工業工程與管理系所 (Industrial Engineering and Management)
Field of study: Engineering (Industrial Engineering)
Thesis type: Academic thesis
Year of publication: 2019
Graduation academic year: 108 (2019)
Language: English
Number of pages: 45
Keywords (Chinese): 少樣本學習、元學習、度量學習、原型網絡
Keywords (English): few-shot learning, meta learning, metric learning, prototypical networks
Deep learning has been shown to substantially outperform traditional machine learning in prediction accuracy in fields such as text mining and computer vision, because it integrates feature learning and model prediction in a single network architecture: a deep neural network learns abstract feature representations directly from the data, which in turn improves the accuracy of the downstream classifier. Although deep learning is widely applied across domains, it requires large amounts of labeled data to train a model. Since deep learning typically learns feature representations through deep neural network architectures with an enormous number of parameters, it is difficult to train an accurate model when labeled samples are scarce; the model overfits, fitting the training data well but failing to generalize to the test data. Transfer learning can partially alleviate this problem, but it requires a certain degree of relatedness between the source domain and the target domain, as well as sufficient training data in the source domain, so that a pre-trained model can be fine-tuned with a small amount of target-domain training data; its applicability therefore remains quite limited. Few-shot learning has attracted considerable attention in the deep learning community, because in many practical scenarios labeled data are extremely scarce and hard to obtain, in the extreme case with only one sample per class (one-shot learning), and ordinary models are unsuitable for such problems. Metric learning and meta learning are the approaches most widely applied to few-shot problems. This thesis proposes two novel metric-based meta-learning algorithms for few-shot problems, inspired by Prototypical Networks. Prototypical Networks extract features from the data, project them into a suitable vector space, and average the vectors of each class to obtain the class mean, which serves as that class's "prototype" for subsequent training and classification. One of the proposed methods does not represent each class by its class mean; instead, it computes similarity between the query point and every individual data point. The other method combines the "prototype" and "data point" views when computing similarity, and then performs training and prediction accordingly.
Deep learning technology has been proven to significantly improve prediction accuracy in areas such as text mining, computer vision, and signal processing. Compared to traditional machine learning methods, deep learning integrates feature learning and model prediction into the same network architecture, making it possible to learn abstract feature representations directly from data. However, deep learning requires a large amount of training data, since it relies on deep neural networks to learn feature representations and the number of model parameters is enormous. Although transfer learning can alleviate this problem to some extent, it requires a certain degree of relatedness between the source domain and the target domain, and also requires sufficient training data from the source domain to learn the pre-trained model. Few-shot learning is another approach to the aforementioned problem, and it has attracted much attention in recent years. This thesis proposes two novel metric-based meta-learning algorithms for the few-shot learning problem. The proposed methods combine meta learning and metric learning, providing a basis for learning a model that can handle problems in which only a few data samples are available for each class and can classify data coming from unseen classes. We conduct experiments against several alternatives to evaluate the proposed methods.
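The contrast the abstract draws between the class-mean ("prototype") view and the per-data-point view can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function names, the use of negative squared Euclidean distance as similarity, the mean aggregation of per-point similarities, and the blending weight `alpha` are all assumptions made here for clarity.

```python
import numpy as np

def prototype_scores(support, support_labels, query, n_classes):
    """Prototypical-Networks-style scoring: average each class's support
    embeddings into one prototype, then score queries by negative
    squared Euclidean distance to each prototype."""
    protos = np.stack([support[support_labels == c].mean(axis=0)
                       for c in range(n_classes)])          # (n_classes, d)
    dist = ((query[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
    return -dist                                            # (n_query, n_classes)

def point_scores(support, support_labels, query, n_classes):
    """Point-based variant: compare each query against every individual
    support point, then aggregate similarities per class (mean here)."""
    dist = ((query[:, None, :] - support[None, :, :]) ** 2).sum(-1)
    sims = -dist                                            # (n_query, n_support)
    return np.stack([sims[:, support_labels == c].mean(axis=1)
                     for c in range(n_classes)], axis=1)    # (n_query, n_classes)

def combined_scores(support, support_labels, query, n_classes, alpha=0.5):
    """Blend of the two views; the weighting scheme is hypothetical."""
    return (alpha * prototype_scores(support, support_labels, query, n_classes)
            + (1 - alpha) * point_scores(support, support_labels, query, n_classes))
```

In an episodic few-shot setting, `support` would hold the embedded N-way K-shot support set of one episode, and the per-class scores would feed a softmax cross-entropy loss over the query set.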
1 Introduction
1.1 Background and Motivation
1.2 Research Aims
2 Related Work
2.1 Introduction of Few-shot Classification
2.2 Deep Metric Learning
2.2.1 Siamese Network
2.2.2 Triplet Network
2.3 Meta Learning
2.3.1 General Architecture of Meta Learning
2.3.2 Metric-based Meta Learning Algorithms
3 Methodology
3.1 Specification of Meta Learning
3.2 Specification of Metric-based Meta Learning
3.3 The Proposed Methods
3.3.1 Algorithm
3.3.2 Connection to Prototypical Networks
3.4 Model Design
3.4.1 Episode Composition
3.4.2 Model Architecture
4 Experiments
4.1 Dataset
4.2 Evaluation Metric
4.3 Experimental Settings
4.4 Experimental Procedure
4.5 Experimental Results
5 Discussion
5.1 Comparison of ProtoNet, PointNet and PointProtoNet
5.2 Comparison of Different Prediction Methods
5.3 Effectiveness of Increasing the Number of Shot
6 Conclusions and Future Work
References
[1] Wei-Yu Chen et al. "A Closer Look at Few-shot Classification". In: International Conference on Learning Representations. 2019.
[2] Gregory Koch, Richard Zemel, and Ruslan Salakhutdinov. "Siamese neural networks for one-shot image recognition". In: ICML Deep Learning Workshop. Vol. 2. 2015.
[3] Sumit Chopra, Raia Hadsell, and Yann LeCun. "Learning a similarity metric discriminatively, with application to face verification". In: IEEE Conference on Computer Vision and Pattern Recognition. IEEE. 2005, pp. 539–546.
[4] Elad Hoffer and Nir Ailon. "Deep metric learning using triplet network". In: International Workshop on Similarity-Based Pattern Recognition. Springer. 2015, pp. 84–92.
[5] Oriol Vinyals et al. "Matching networks for one shot learning". In: Advances in Neural Information Processing Systems. 2016, pp. 3630–3638.
[6] Jake Snell, Kevin Swersky, and Richard Zemel. "Prototypical networks for few-shot learning". In: Advances in Neural Information Processing Systems. 2017, pp. 4077–4087.
[7] Flood Sung et al. "Learning to compare: Relation network for few-shot learning". In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018, pp. 1199–1208.
[8] Adam Santoro et al. "Meta-learning with memory-augmented neural networks". In: International Conference on Machine Learning. 2016, pp. 1842–1850.
[9] Tsendsuren Munkhdalai and Hong Yu. "Meta networks". In: Proceedings of the 34th International Conference on Machine Learning, Volume 70. JMLR.org. 2017, pp. 2554–2563.
[10] Sachin Ravi and Hugo Larochelle. "Optimization as a model for few-shot learning". In: International Conference on Learning Representations. 2017.
[11] Chelsea Finn, Pieter Abbeel, and Sergey Levine. "Model-agnostic meta-learning for fast adaptation of deep networks". In: Proceedings of the 34th International Conference on Machine Learning, Volume 70. JMLR.org. 2017, pp. 1126–1135.
[12] Lilian Weng. "Meta-Learning: Learning to Learn Fast". 2018. url: https://lilianweng.github.io/lil-log/2018/11/30/meta-learning.html.
[13] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks". In: Advances in Neural Information Processing Systems. 2012, pp. 1097–1105.
[14] Sergey Ioffe and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift". In: arXiv preprint arXiv:1502.03167 (2015).
[15] Ronald A. Fisher. "The use of multiple measurements in taxonomic problems". In: Annals of Eugenics 7.2 (1936), pp. 179–188.
[16] Olga Russakovsky et al. "Imagenet large scale visual recognition challenge". In: International Journal of Computer Vision 115.3 (2015), pp. 211–252.