Author: 高聿緯
Author (English): Yu-Wei Kao
Title (Chinese): 關聯式學習:利用自動編碼器與目標傳遞法分解端到端倒傳遞演算法
Title (English): Associated Learning: Decomposing End-to-end Backpropagation based on Auto-encoders and Target Propagation
Advisor: 陳弘軒
Advisor (English): Hung-Hsuan Chen
Degree: Master's
Institution: National Central University (國立中央大學)
Department: Department of Computer Science and Information Engineering (資訊工程學系)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Document type: Academic thesis
Year of publication: 2019
Graduation academic year: 108 (ROC calendar)
Language: English
Number of pages: 48
Keywords (Chinese): 生物合理性演算法、深度學習、平行運算、模組化
Keywords (English): Biologically plausible algorithm; Deep learning; Parallel computing; Modularization
Record statistics:
  • Times cited: 0
  • Views: 99
  • Downloads: 19
  • Bookmarked: 0
Backpropagation has been widely used in deep learning, but because of backward locking and vanishing/exploding gradients it is neither efficient nor stable, and these problems are especially pronounced in deeper network architectures. Moreover, updating all of the parameters in a neural network according to a single objective is not biologically plausible.

In this thesis, we propose a novel, biologically inspired learning framework called Associated Learning. This training scheme modularizes the original network into small components, each with its own local objective. Because these components are mutually independent, Associated Learning can train their parameters independently and simultaneously.

Surprisingly, models trained with Associated Learning achieve accuracies comparable to those of conventional backpropagation, which fits the target directly. Furthermore, probably because the gradient flow within each component is short, Associated Learning can also train deep networks that use sigmoid as the activation function, whereas training such networks with backpropagation easily leads to vanishing gradients.

We also present quantitative and qualitative results by examining the inter-class and intra-class distances in the hidden layers and by visualizing them with t-SNE, and we find that Associated Learning produces better metafeatures.
Backpropagation has been widely used in deep learning, but it is inefficient and sometimes unstable because of backward locking and the vanishing/exploding gradient problems, especially when the gradient flow is long. Additionally, updating all edge weights based on a single objective seems biologically implausible. In this paper, we introduce a novel, biologically motivated learning structure called Associated Learning, which modularizes the network into smaller components, each of which has a local objective. Because the objectives are mutually independent, Associated Learning can learn the parameters independently and simultaneously when these parameters belong to different components. Surprisingly, training deep models by Associated Learning yields accuracies comparable to those of models trained using typical backpropagation, which aims at fitting the target variable directly. Moreover, probably because the gradient flow of each component is short, deep networks can still be trained with Associated Learning even when some of the activation functions are sigmoid, a situation that usually results in the vanishing gradient problem under typical backpropagation. We also found that Associated Learning generates better metafeatures, which we demonstrate both quantitatively (via inter-class and intra-class distance comparisons in the hidden layers) and qualitatively (by visualizing the hidden layers using t-SNE).
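The abstract describes the training scheme only at a high level. The sketch below is a minimal, hypothetical illustration of the general idea of training mutually independent components, each with its own local objective; it is written in PyTorch for concreteness, and the module names, layer sizes, and exact local loss are assumptions for illustration only, not the thesis's reference implementation (the official code is linked in Appendix A).

```python
# Minimal sketch of local-objective, component-wise training in PyTorch.
# NOT the thesis's reference implementation; names, dimensions, and the
# exact local loss are illustrative assumptions only.
import torch
import torch.nn as nn

class LocalComponent(nn.Module):
    """One component with its own local objective and its own optimizer."""
    def __init__(self, in_dim, out_dim, target_dim):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(in_dim, out_dim), nn.Sigmoid())  # input-side transform
        self.g = nn.Sequential(nn.Linear(target_dim, out_dim))            # target-side transform
        self.opt = torch.optim.Adam(self.parameters(), lr=1e-3)

    def local_step(self, x, t):
        """Update only this component's parameters with a local association loss."""
        s, u = self.f(x), self.g(t)
        loss = ((s - u) ** 2).mean()   # local objective: associate the two streams
        self.opt.zero_grad()
        loss.backward()                # gradient stays inside this component
        self.opt.step()
        # detach so no gradient ever crosses component boundaries
        return s.detach(), u.detach(), loss.item()

# Toy usage on an MNIST-sized batch: three components, each trained locally.
components = [LocalComponent(784, 256, 10),
              LocalComponent(256, 128, 256),
              LocalComponent(128, 64, 128)]
x = torch.randn(32, 784)                          # fake input batch
t = torch.eye(10)[torch.randint(0, 10, (32,))]    # one-hot labels as the target-side signal
for comp in components:
    x, t, loss = comp.local_step(x, t)            # each component sees only detached tensors
```

Because each component's backward pass stops at its own boundary, the parameter updates of different components could in principle be computed in parallel, which is the parallelism claim made in the abstract.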
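The inter-class and intra-class distance comparison mentioned above can likewise be illustrated with a small sketch. The function below shows one common way to compute such distances from hidden-layer activations; it is only an assumption about the flavor of the measurement, and the exact formula used in Section 3.3 may differ.

```python
# Hedged sketch of an intra-/inter-class distance measurement for hidden-layer
# activations; the exact quantity used in Section 3.3 may differ.
import torch

def intra_inter_distances(hidden, labels):
    """hidden: (N, D) hidden-layer activations; labels: (N,) integer class ids.

    Returns the mean distance of samples to their own-class centroid (intra)
    and the mean pairwise distance between class centroids (inter).
    A smaller intra/inter ratio indicates better-separated metafeatures.
    """
    classes = labels.unique()
    centroids = torch.stack([hidden[labels == c].mean(dim=0) for c in classes])
    intra = torch.stack([
        (hidden[labels == c] - centroids[i]).norm(dim=1).mean()
        for i, c in enumerate(classes)
    ]).mean()
    inter = torch.pdist(centroids).mean()   # mean distance between class centroids
    return intra.item(), inter.item()

# Toy usage on random activations.
h = torch.randn(100, 64)                    # pretend hidden-layer activations
y = torch.randint(0, 10, (100,))
intra, inter = intra_inter_distances(h, y)
print(f"intra={intra:.3f}, inter={inter:.3f}, ratio={intra / inter:.3f}")
```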
Abstract (Chinese) iv
Abstract v
Contents vii
List of Figures ix
List of Tables xii
1 Introduction 1
2 Methodology 4
2.1 Preliminaries....................................4
2.1.1 Artificial Neural Network..................... 4
2.1.2 Backpropagation............................... 5
2.1.3 Models........................................ 5
2.2 Motivation...................................... 8
2.3 Associated Loss of Associated Learning.......... 9
2.4 Inverse Loss of Associated Learning............. 10
2.5 Bridge of Associated Learning................... 11
2.6 Effective Parameters and Hypothesis Space....... 11
3 Experiments 13
3.1 Datasets........................................ 14
3.1.1 MNIST......................................... 14
3.1.2 CIFAR ........................................ 16
3.2 Testing Accuracy ............................... 17
3.2.1 MNIST......................................... 17
3.2.2 CIFAR-10 ..................................... 19
3.2.3 CIFAR-100..................................... 20
3.3 Metafeature Visualization and Quantification ... 21
4 Related Work 25
5 Discussion and future works 29
Bibliography 31
A Source Code 34
A.1 Code link ...................................... 34
A.2 Usage........................................... 34