
National Digital Library of Theses and Dissertations in Taiwan


Detailed Record

Author: 簡嘉緯
Author (English): JIA-WEI JIEN
Title: 深度學習的均場退火與梯度遞減的混合模式
Title (English): Deep learning of mean field annealing and gradient descent methods
Advisor: 吳建銘
Advisor (English): Jiann-Ming Wu
Degree: Master's
Institution: National Dong Hwa University
Department: Department of Applied Mathematics
Discipline: Mathematics and Statistics
Field: Mathematics
Thesis type: Academic thesis
Year of publication: 2017
Graduation academic year: 105 (2016-2017)
Pages: 23
Keywords: deep neural networks; image recognition; mean field annealing; gradient descent learning theory
Keywords (English): deep neural network; image recognition; mean field theory; gradient descent
Usage statistics:
  • Cited by: 0
  • Views: 185
  • Downloads: 18
  • Bookmarked: 0
Deep neural networks are now widely used in machine learning. A deep neural network contains multiple hidden layers and has a powerful capacity for mapping inputs to outputs, while its adjustable interconnections keep it highly flexible. Within different artificial intelligence frameworks, deep neural networks play different roles, including feature extraction, dimensionality reduction, and function approximation. Existing deep learning practice already combines a variety of techniques, such as gradient descent, mini-batch training, dropout, and momentum. Hinton's two-stage deep learning method has been successfully applied to image and speech recognition: the first stage trains restricted Boltzmann machines (RBMs) [1], and the second stage applies back-propagation learning [2]. The first stage gives the deep neural network good initial parameters; the second stage then optimizes the network's built-in parameters starting from those initial values.
This thesis proposes a new deep learning method whose aim is to replace Hinton's two-stage method with a single-stage method that does not rely on restricted Boltzmann machines, while still effectively reducing the mean-square error and the training error. The new method combines mean field annealing with gradient descent learning: a parameter β, representing the reciprocal of the annealing temperature, is added to the sigmoid (S-type) activation function. A small β corresponds to high entropy and a large β to low entropy. Letting β grow gradually from small to large is called the annealing process, and the network parameters that achieve the best learning performance during annealing are memorized. The new method strengthens the learning ability of deep neural networks and can be combined with existing techniques such as mini-batch training, dropout, and momentum to improve learning efficiency and effectiveness. Handwritten digit recognition is used as an example to demonstrate the effectiveness of the new method.
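To make the single-stage scheme concrete, the sketch below illustrates the core idea under stated assumptions, not the thesis's actual implementation: the sigmoid activation gains an inverse-temperature parameter β, σ_β(x) = 1/(1 + e^(−βx)); β is raised gradually from small to large while ordinary gradient descent runs at each temperature, and the parameters with the lowest mean-square error seen so far are memorized. The toy data, network size, β schedule, and learning rate are all illustrative guesses.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid_beta(x, beta):
    # S-type activation with inverse annealing temperature beta:
    # small beta gives a soft, high-entropy response; large beta a sharp, low-entropy one.
    return 1.0 / (1.0 + np.exp(-beta * x))

# Toy regression data standing in for real features (e.g. handwritten digits).
X = rng.standard_normal((200, 8))
y = (np.sin(X.sum(axis=1, keepdims=True)) + 1.0) / 2.0  # targets in [0, 1]

# One hidden layer; sizes are arbitrary for the sketch.
W1 = 0.1 * rng.standard_normal((8, 16)); b1 = np.zeros(16)
W2 = 0.1 * rng.standard_normal((16, 1)); b2 = np.zeros(1)
lr = 0.2
best_mse, best_params = np.inf, None

# Annealing process: let beta grow from small to large.
for beta in np.geomspace(0.1, 5.0, 20):
    # Plain gradient descent at the current temperature.
    for _ in range(100):
        z1 = X @ W1 + b1
        h = sigmoid_beta(z1, beta)
        z2 = h @ W2 + b2
        out = sigmoid_beta(z2, beta)
        err = out - y
        mse = float(np.mean(err ** 2))
        # Back-propagation; note d/dx sigmoid_beta = beta * s * (1 - s).
        d2 = (2.0 / len(X)) * err * beta * out * (1.0 - out)
        dW2, db2 = h.T @ d2, d2.sum(axis=0)
        d1 = (d2 @ W2.T) * beta * h * (1.0 - h)
        dW1, db1 = X.T @ d1, d1.sum(axis=0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
        # Memorize the best-performing parameters seen during annealing.
        if mse < best_mse:
            best_mse = mse
            best_params = (W1.copy(), b1.copy(), W2.copy(), b2.copy())

print(f"best mean-square error during annealing: {best_mse:.5f}")

The mini-batch updates, dropout, and momentum mentioned in the abstract would slot into the inner loop; they are omitted here to keep the annealing structure visible.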
I. Introduction ..... 1
II. Deep learning ..... 5
III. Expectation maximization ..... 11
IV. Hybrid mean field annealing and gradient descent methods ..... 13
V. Numerical simulation ..... 17
VI. Conclusion ..... 21
References
[1] Hinton, G. E., Osindero, S., and Teh, Y.-W. A fast learning algorithm for deep belief nets. Neural Computation, 18, pp. 1527-1554. (2006)
[2] Rumelhart, D. E., Hinton, G. E., and Williams, R. J. Learning representations by back-propagating errors. Nature, 323, pp. 533-536. (1986)
[3] LeCun, Y., Bengio, Y., and Hinton, G. E. Deep learning. Nature, 521, pp. 436-444. (2015)
[4] Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1), pp. 1929-1958. (2014)
[5] Moon, T. K. The expectation-maximization algorithm. IEEE Signal Processing Magazine, 13(6), pp. 47-60. (1996)
[6] Wu, J.-M., Chen, M.-H., and Lin, Z.-H. Independent component analysis based on marginal density estimation using weighted Parzen windows. Neural Networks, 2008.
[7] Wu, J.-M. Annealing by two sets of interactive dynamics. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 34(3), pp. 1519-1525. (2004)