Taiwan National Digital Library of Theses and Dissertations (臺灣博碩士論文加值系統)

Detailed Record

Author: Jia-Lin Hsu (許嘉麟)
Thesis title: Using Reinforcement Learning and Case-Based Reasoning in Multi-Agent Pursuit-Evasion Game (運用強效式學習與案例式推論於多代理人合作的追捕競賽)
Advisor: Jong-Yih Kuo (郭忠義)
Committee members: Shang-Pin Ma (馬尚彬), Yung-Pin Cheng (鄭永斌), Jonathan Lee (李允中)
Oral defense date: 2011-07-18
Degree: Master's
Institution: National Taipei University of Technology (國立臺北科技大學)
Department: Graduate Institute of Computer Science and Information Engineering (資訊工程系研究所)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2011
Graduating academic year: 99 (2010-2011)
Language: Chinese
Number of pages: 47
Keywords (Chinese): 代理人 (agent), 強效式學習 (reinforcement learning), 案例式推論 (case-based reasoning), 追捕遊戲 (pursuit game)
Keywords (English): Agent, Reinforcement Learning, Q-Learning, Case-based Reasoning, Pursuit-evasion Game
Record statistics:
  • Cited by: 0
  • Views: 153
  • Downloads: 0
  • Bookmarked: 0
Abstract

In a multi-agent pursuit-evasion game, multiple pursuers must coordinate their behavior to achieve a common goal: catching the evader. This thesis proposes a learning mechanism for pursuit in a dynamic pursuit-evasion environment. Environmental uncertainty is handled through training, and two learning methods are applied depending on whether the agents cooperate: reinforcement learning for individual learning, and case-based reasoning for cooperative learning. The agents thereby gain memory and learning ability, allowing them to catch the evader more quickly. The approach is demonstrated on a simulated pursuit-evasion game, implemented as a multi-agent system on Repast (The Recursive Porous Agent Simulation Toolkit).
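The individual-learning side described in the abstract (the keywords name Q-Learning) can be sketched as a tabular Q-learning update for a single pursuer. This is a minimal illustration, not the thesis's actual design: the state encoding, action set, and learning parameters below are assumptions.

```python
import random
from collections import defaultdict

# Assumed action set and parameters; the thesis's own state/action/reward
# design is described in its Sections 3.1.1-3.1.3.
ACTIONS = ["up", "down", "left", "right"]
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

Q = defaultdict(float)  # Q[(state, action)] -> estimated value, default 0.0

def choose_action(state):
    """Epsilon-greedy policy: mostly exploit the Q-table, sometimes explore."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """One Q-learning backup: Q(s,a) += alpha*(r + gamma*max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```

During training, each pursuer would repeatedly call `choose_action`, observe the reward (e.g. for closing in on or catching the evader), and call `update`, so that good pursuit moves accumulate value in the table.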
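The cooperative side uses case-based reasoning: past pursuit situations are stored as cases and the most similar one is retrieved and reused. The following is a hedged sketch of the retrieval step only; the case fields (relative positions paired with a joint action) and the Manhattan-distance similarity are illustrative assumptions, not the thesis's actual case representation.

```python
# A case pairs a situation (here: a flattened vector of relative
# positions of pursuers and evader) with the joint action that worked.

def similarity(a, b):
    """Higher is more similar: negative Manhattan distance between vectors."""
    return -sum(abs(x - y) for x, y in zip(a, b))

def retrieve(case_base, query):
    """Return the stored case whose situation is most similar to the query."""
    return max(case_base, key=lambda case: similarity(case["situation"], query))

case_base = [
    {"situation": (2, 1, -1, 3), "actions": ("up", "left")},
    {"situation": (0, 4, 2, -2), "actions": ("right", "down")},
]

best = retrieve(case_base, (2, 0, -1, 2))  # closest match is the first case
```

In a full CBR cycle (as in the thesis's Sections 3.2.1-3.2.5) the retrieved joint action would then be reused, and the case revised and stored back if the outcome differs.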

Abstract (Chinese) ii
Abstract (English) iii
Acknowledgements iv
Table of Contents v
List of Tables vii
List of Figures viii
Chapter 1 Introduction 1
1.1 Preface 1
1.2 Motivation and Objectives 1
1.3 Contributions 3
1.4 Thesis Organization 3
Chapter 2 Related Work 4
2.1 Reinforcement Learning 4
2.2 Case-Based Reasoning 5
2.3 Multi-Agent Systems 7
2.4 Pursuit-Evasion Game 8
Chapter 3 Agent Learning and Cooperation Methods 11
3.1 Individual Agent Learning 11
3.1.1 State 12
3.1.2 Action 14
3.1.3 Reward 15
3.2 Cooperative Agent Learning 16
3.2.1 Case Representation 16
3.2.2 Case Retrieval 17
3.2.3 Action Generation 18
3.2.4 Case Reuse 21
3.2.5 Case Revision 21
3.3 Agent Learning Procedure 24
Chapter 4 Case Study 27
4.1 Problem Description 27
4.2 Evader Movement Strategies 28
4.2.1 Random Movement 28
4.2.2 Clockwise Movement 28
4.2.3 Counterclockwise Movement 29
4.2.4 Intelligent Movement 30
4.2.5 Evasive Movement 30
4.3 System Architecture 31
4.4 System Implementation 33
4.5 Comparison with Related Work 35
4.6 Experimental Results 37
4.6.1 Experiment 1 37
4.6.2 Experiment 2 38
4.6.3 Experiment 3 39
Chapter 5 Conclusions and Future Work 41
References 42



[1]A. Aamodt and E. Plaza, “Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches,” AI Communications, vol. 7(1), pp. 39-59, 1994.
[2]A. Antoniades, H.J. Kim, S. Sastry, “Pursuit-evasion strategies for teams of multiple agents with incomplete information,” IEEE Conference on Decision and Control, vol. 1, pp. 756-761, 2003.
[3]A. Swaminathan, and K. S. Barber, “An Experience-Based Assembly Sequence Planner for Mechanical Assemblies,” IEEE Transactions on Robotics and Automation, vol. 12(2), pp. 252-267, 1996.
[4]Andreas Kolling, Stefano Carpin, “Pursuit-Evasion on Trees by Robot Teams,” IEEE Transactions on Robotics, vol. 26(1), pp. 32-47, 2010.
[5]Chern Han Yong, Risto Miikkulainen, “Coevolution of Role-Based Cooperation in Multiagent Systems,” IEEE Transactions on Autonomous Mental Development, vol. 1(3), pp. 170-186, 2009.
[6]D. Bienstock, P. Seymour, “Monotonicity in Graph Searching,” Journal of Algorithms, vol. 12(2), pp. 239-245, 1991.
[7]D. Hladek, J. Vaščak, P. Sinčak, “Multi-Robot Control System for Pursuit-Evasion Problem,” Journal of Electrical Engineering, vol. 60(3), pp. 143–148, 2009.
[8]Damien Ernst, Mevludin Glavic, and Louis Wehenkel, “Power Systems Stability Control: Reinforcement Learning Framework,” IEEE Transactions on Power Systems, vol. 19(1), pp.427-435, 2004.
[9]Daniel Hennessy and David Hinkle, “Applying case-based reasoning to autoclave loading,” IEEE Expert, vol. 7(5), pp. 21-26, 1992.
[10]Ferber, Jacques, Multi-Agent System: An Introduction to Distributed Artificial Intelligence, Harlow: Addison Wesley Longman, 1999.
[11]G. Gottlob, N. Leone, F. Scarcello, “Robbers, Marshals, and Guards: Game Theoretic and Logical Characterizations of Hypertree Width,” Journal of Computer and System Sciences, vol. 66, pp. 775-808, 2003.
[12]G. Oshanin, O. Vasilyev, P. L. Krapivsky, and J. Klafter, “Survival of an evasive prey,” Proceedings of the National Academy of Sciences of the United States of America, vol. 106(33), pp. 13696-13701, 2009.
[13]H. Ahn and K. Kim, “Global optimization of case-based reasoning for breast cytology diagnosis,” Expert Systems with Applications, vol. 36(1), pp. 252-267, 1996.
[14]J. P. Hespanha, G. J. Pappas, and M. Prandini, “Greedy control for hybrid pursuit-evasion games,” In Proceedings of the European Control Conference, pp. 2621-2626, 2001.
[15]Janet L. Kolodner, “An Introduction to Case-Based Reasoning,” Artificial Intelligence Review, vol. 3, pp. 3-34, 1992.
[16]Jong Yih Kuo, Chien Feng Hsu, “Applying Assimilation and Accommodation for Cooperative Learning of Multi-Agent Pursuit-Evasion Strategies,” 2010 International Conference on Manufacturing and Engineering Systems, 2010.
[17]Jong Yih Kuo, He Zhi Lin, “Cooperative RoboCup Agents Using Genetic Case-Based Reasoning,” IEEE International Conference on Systems, Man and Cybernetics, pp. 613-618, 2008.
[18]K. P. Sycara, “Multiagent Systems”, AI Magazine, vol. 19, pp. 79-92, 1998.
[19]L. J. Guibas, J.-C. Latombe, S. M. LaValle, D. Lin, and R. Motwani, “A visibility-based pursuit-evasion problem,” International Journal of Computational Geometry and Applications, vol.9, pp. 471-493, 1999.
[20]L.M. Kirousis, C.H. Papadimitriou, “Interval Graphs and Searching,” Discrete Mathematics, vol. 55, pp. 181-184, 1985.
[21]L.M. Kirousis, C.H. Papadimitriou, “Searching and Pebbling,” Theoretical Computer Science, vol.47, pp. 205-218, 1986.
[22]Larry M. Stephens, Matthias B. Merx, “The effect of agent control strategy on the performance of a dai pursuit problem,” In Proceedings of the 10th International Workshop on DAI, 1990.
[23]Leslie Pack Kaelbling, Michael L. Littman, and Andrew W. Moore, “Reinforcement Learning: A Survey,” Journal of Artificial Intelligence Research, vol. 4, pp. 237-285, 1996.
[24]M. Aigner, M. Fromme, “A Game of Cops and Robbers”, Discrete Applied Mathematics, vol. 8, pp. 1-11, 1984.
[25]M. Yamashita, H. Umemoto, I. Suzuki, and T. Kameda, “Searching for Mobile Intruders in a Polygonal Region by a Group of Mobile Searchers,” Algorithmica, vol. 31(2), pp. 208-236, 2001.
[26]Michelle McPartland and Marcus Gallagher, “Reinforcement Learning in First Person Shooter Games,” IEEE Transactions on Computational Intelligence and AI in Games, vol. 3(1), pp. 43-56, 2011.
[27]N. Vlassis, “A Concise Introduction to Multiagent Systems and Distributed Artificial Intelligence,” Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan and Claypool, 2007.
[28]N.N. Petrov, M.A. Teteryatnikova, “Some Problems of the Search on Graphs with Retaliation,” Vestnik St. Petersburg University: Mathematics, vol. 37, pp. 37-43, 2004.
[29]Nan-Peng Yu, Chen-Ching Liu, and James Price, “Evaluation of Market Rules Using a Multi-Agent System Method,” IEEE Transactions on Power Systems, vol. 25(1), pp. 470-479, 2010.
[30]P. Dayan and B.W. Balleine, “Reward, Motivation, Review and Reinforcement Learning,” Neuron, vol. 36, pp. 285-298, 2002.
[31]P. Perner, “An architecture for a CBR image segmentation system,” Engineering Applications of Artificial Intelligence, vol. 12(6), pp. 749-759, 1999.
[32]P.G. Balaji and D. Srinivasan, “Multi-Agent System in Urban Traffic Signal Control,” IEEE Computational Intelligence Magazine, vol. 5(4), pp. 43-51, 2010.
[33]R. Nowakowski, P. Winkler, “Vertex-to-Vertex Pursuit in a Graph,” Discrete Mathematics, vol. 43, pp. 235-239, 1983.
[34]R. R. Brooks, Jing-En Pang, and C. Griffin, “Game and Information Theory Analysis of Electronic Countermeasures in Pursuit-Evasion Games,” IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, vol. 38(6), pp. 1281-1294, 2008.
[35]R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA, 1998.
[36]R. Vidal, O. Shakernia, H.J. Kim, D.H. Shim, S. Sastry, “Probabilistic pursuit-evasion games: theory, implementation, and experimental evaluation,” IEEE Transactions on Robotics and Automation, vol. 18, pp. 662-669, 2002.
[37]Robert H. Crites and Andrew G. Barto, “Improving Elevator Performance Using Reinforcement Learning,” Advances in neural information processing systems 8, pp. 1017-1023, 1996.
[38]S.F. Railsback, S.L. Lytinen, S.K. Jackson, “Agent-Based Simulation Platforms: Review and Development Recommendations,” SIMULATION, vol. 82(9), pp. 609-623, 2006.
[39]Shaunak D. Bopardikar, Francesco Bullo, and Joao P. Hespanha, “On Discrete-Time Pursuit-Evasion Games With Sensing Limitations,” IEEE Transactions on Robotics, vol. 24(6), pp. 1429–1439, 2008.
[40]Shin I. Nishimura, Takashi Ikegami, “Emergence of collective strategies in a prey-predator game model,” Artificial Life, vol. 3, pp. 243-260, 1997.
[41]Shotaro Kamio and Hitoshi Iba, “Adaptation Technique for Integrating Genetic Programming and Reinforcement Learning for Real Robots,” IEEE Transactions on Evolutionary Computation, vol. 9(3), pp. 318-333, 2005.
[42]T. Haynes and S. Sen, “Evolving behavioral strategies in predators and prey,” Adaptation and Learning in Multiagent Systems, Lecture Notes in Artificial Intelligence, pp. 113-126, 1995.
[43]T. Mukhopadhyay, S.S. Vicinanza, and M.J. Prietula, “Examining the Feasibility of a Case-Based Reasoning Model for Software Effort Estimation,” MIS Quarterly, vol. 16(2), pp. 155-171, 1992.
[44]T. Parsons, “Pursuit-Evasion in a Graph,” Theory and Applications of Graphs, Lecture Notes in Mathematics, vol. 642, pp. 426-441, 1976.
[45]V.Y. Andrianov, N.N. Petrov, “Graph Searching Problems with the Counteraction, in: Game Theory and Applications,” Game theory and applications, vol.10, pp. 1-12, 2005.
[46]Xu Chu Ding, Amir R. Rahmani and Magnus Egerstedt, “Multi-UAV Convoy Protection: An Optimal Approach to Path Planning and Coordination,” IEEE Transactions on Robotics, vol. 26(2), pp. 256-268, 2010.
[47]http://ascape.sourceforge.net/
[48]http://ccl.northwestern.edu/netlogo
[49]http://education.mit.edu/starlogo
[50]http://repast.sourceforge.net/
[51]http://www.cs.gmu.edu/~eclab/projects/mason/
[52]http://www.swarm.org

