National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

Author: 王聖淵 (Sheng-Yuan Wang)
Title: 加強式學習之經驗分享於分散式代理人之應用
Title (English): Knowledge Sharing Approaches Based on Reinforcement Learning for Distributed Agents System
Advisor: 黃國勝 (Kao-Shing Hwang)
Degree: Master's
Institution: National Sun Yat-sen University
Department: Department of Electrical Engineering
Discipline: Engineering
Field: Electrical and Information Engineering
Document type: Academic thesis
Publication year: 2017
Graduation academic year: 105 (2016-2017)
Language: Chinese
Pages: 62
Keywords (Chinese): 分散式運算、蟻群演算法、經驗融合、經驗分享、增強式學習
Keywords (English): Ant colony algorithm, Distributed computing, Reinforcement learning, Knowledge sharing, Knowledge merging
Statistics:
  • Cited by: 0
  • Views: 286
  • Downloads: 29
  • Added to bookshelf lists: 0
To eliminate the complicated, chaotic knowledge-exchange behavior that arises when a large group of learning agents shares experiences, while still letting each agent use experience sharing to obtain useful environmental information quickly and compensate for its own insufficient learning experience, this thesis proposes a cloud-based information-integration mechanism. Each learning agent communicates only with a cloud server, which removes the complex pairwise knowledge exchange; the server collects the learning experiences of all agents, merges them, and shares the result with agents that lack experience. Borrowing the pheromone concept from the ant colony algorithm, each agent evaluates the importance of its experience before uploading it; this evaluation becomes a weight that the server uses when merging the learning experiences of multiple agents. To cope with the large volume of experience data, the cloud server adopts a distributed storage architecture, and the massive data is processed with the Apache Hadoop software framework, whose MapReduce processing model is a distributed computing architecture that handles large amounts of data quickly and effectively. Each learning agent then requests the merged learning experience from the cloud server and integrates it again with its own experience, achieving the goal of experience sharing. Finally, the proposed method is implemented on a small self-built server, with a total of 360 learning agents simulated on multiple PCs, randomly distributed in the environment and learning simultaneously in the same environment; the results show that the method effectively improves learning performance.
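The pheromone-based weighting and server-side merging described above can be sketched as follows. This is a minimal illustration under assumptions, not the thesis's implementation: the evaporation factor, the deposit rule, and the names `deposit_pheromone` and `merge_q_tables` are invented for the sketch, and the merge is rendered as a simple pheromone-weighted average of Q-values.

```python
# Hypothetical sketch of pheromone-weighted experience merging.
# The constant, deposit rule, and function names are illustrative
# assumptions; the thesis's exact equations are not reproduced here.
from collections import defaultdict

PHEROMONE_DECAY = 0.9  # assumed evaporation factor per update


def deposit_pheromone(pheromone, trace):
    """Evaporate existing pheromone, then reinforce the states on the
    trace an agent followed, so frequently visited states gain weight."""
    for s in pheromone:
        pheromone[s] *= PHEROMONE_DECAY
    for s in trace:
        pheromone[s] = pheromone.get(s, 0.0) + 1.0
    return pheromone


def merge_q_tables(uploads):
    """Merge (q_table, pheromone) pairs uploaded by many agents: for
    each state-action pair, take a pheromone-weighted average of the
    agents' Q-values."""
    num = defaultdict(float)  # weighted sum of Q-values
    den = defaultdict(float)  # sum of weights
    for q_table, pheromone in uploads:
        for (s, a), q in q_table.items():
            w = pheromone.get(s, 0.0)
            num[(s, a)] += w * q
            den[(s, a)] += w
    return {sa: num[sa] / den[sa] for sa in num if den[sa] > 0}
```

With equal pheromone weights the merge reduces to a plain average; states visited more often by one agent pull the merged Q-value toward that agent's estimate.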
Consider a multi-agent system in which a tremendous number of agents share knowledge with one another: the resulting exchange activity is complicated and hard to manage. This thesis proposes a method in which every agent connects only to a server, alleviating the complexity of the experience-exchange activity. The server collects the learning knowledge uploaded by all agents, merges it, and shares it with agents that lack similar experience. The agents use the pheromone mechanism of the ant colony algorithm to evaluate whether an experience is worth uploading to the server. The pheromone remaining along the trace of visited states becomes a weight for combining the collected experiences on the server. Meanwhile, to deal with massive data processing, the thesis uses the open-source Apache Hadoop framework together with the MapReduce programming model. Agents integrate the shared experiences with their own knowledge, achieving knowledge sharing and significantly increasing efficiency. The proposed approach was implemented with a self-built server and personal computers; simulation results with 360 learning agents demonstrate its performance.
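The MapReduce step can be illustrated with a minimal map/reduce pair composed locally in plain Python. The key scheme (state-action pairs as keys carrying weighted Q-value contributions) and the function names are assumptions for illustration; the actual system runs these phases on an Apache Hadoop cluster.

```python
# Assumed illustration of the MapReduce experience-merging step.
# On Hadoop the map and reduce functions run across cluster nodes;
# here they are chained locally so the data flow is visible.
from itertools import groupby
from operator import itemgetter


def map_phase(record):
    """Emit ((state, action), (weight * q, weight)) pairs from one
    agent's uploaded experience record."""
    agent_id, experiences = record
    for (state, action), (q_value, weight) in experiences.items():
        yield (state, action), (weight * q_value, weight)


def reduce_phase(key, values):
    """Combine all weighted contributions for one state-action key
    into a single pheromone-weighted average."""
    num = sum(v[0] for v in values)
    den = sum(v[1] for v in values)
    return key, (num / den if den else 0.0)


def run_mapreduce(records):
    pairs = [kv for r in records for kv in map_phase(r)]
    pairs.sort(key=itemgetter(0))  # stands in for the shuffle/sort stage
    return dict(reduce_phase(k, [v for _, v in grp])
                for k, grp in groupby(pairs, key=itemgetter(0)))
```

Because the reduce step only sums numerators and denominators, it is associative and can be distributed across reducers, which is what makes the merge scale to many agents.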
Thesis Approval Form i
Abstract (Chinese) iii
Abstract (English) iv
List of Figures ix
List of Tables xi
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Thesis Organization 2
Chapter 2 Literature Review 3
2.1 Markov Decision Processes 3
2.1.1 Reinforcement Learning 4
2.1.2 Q-Learning 5
2.2 Ant Colony Algorithm 7
2.2.1 Ant Colony Algorithm 7
2.2.2 Pheromone Update in the Ant Colony Algorithm 8
2.3 Distributed Systems 10
2.3.1 Distributed Storage Systems 11
2.3.2 Distributed Computing 12
Chapter 3 Methodology 14
3.1 Experience-Sharing Mechanism for Multiple Learning Agents 14
3.2 Weighting Function Design Using Ant Colony Theory 16
3.2.1 Pheromone Mechanism 17
3.2.2 Weighting Function 18
3.3 Distributed-System Application 20
3.3.1 Data Storage Structure 20
3.3.2 MapReduce Data Processing and Experience Merging 22
3.4 Overall Flow and Algorithms 26
3.4.1 Individual Learning (Upload Mode) 27
3.4.2 Experience-Merging Mode 29
3.4.3 Individual Learning (Download Mode) 30
Chapter 4 Simulation Experiments and Implementation Results 33
4.1 Maze Simulation Experiments 33
4.2 Implementation Results 41
Chapter 5 Conclusion and Future Work 47
5.1 Conclusion 47
5.2 Future Work 47
References 48