Taiwan National Digital Library of Theses and Dissertations (臺灣博碩士論文加值系統)

Detailed Record

Author: Jia-Lin Hsu (許嘉麟)
Thesis title: Using Reinforcement Learning and Case-Based Reasoning in Multi-Agent Pursuit-Evasion Game (運用強效式學習與案例式推論於多代理人合作的追捕競賽)
Advisor: Jong-Yih Kuo (郭忠義)
Committee members: Shang-Pin Ma (馬尚彬), Yung-Pin Cheng (鄭永斌), Jonathan Lee (李允中)
Oral defense date: 2011-07-18
Degree: Master's
Institution: National Taipei University of Technology (國立臺北科技大學)
Department: Graduate Institute of Computer Science and Information Engineering (資訊工程系研究所)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2011
Graduating academic year: 99 (2010-2011)
Language: Chinese
Number of pages: 47
Keywords (Chinese): 代理人 (agent), 強效式學習 (reinforcement learning), 案例式推論 (case-based reasoning), 追捕遊戲 (pursuit game)
Keywords (English): Agent, Reinforcement Learning, Q-Learning, Case-based Reasoning, Pursuit-evasion Game
Record statistics:
  • Cited by: 0
  • Views: 153
  • Downloads: 0
  • Bookmarked: 0
Abstract

In a multi-agent pursuit-evasion game, multiple pursuers must coordinate their behavior to achieve a common goal: catching the evader. This thesis proposes a learning mechanism for pursuit in a dynamic pursuit-evasion environment. Environmental uncertainty is handled through training, and two learning methods are applied depending on whether the agents cooperate: reinforcement learning for individual learning, and case-based reasoning for cooperative learning. The agents thereby gain memory and learning ability, allowing them to catch the evader more quickly. The approach is demonstrated on a simulated pursuit-evasion game, implemented as a multi-agent system on Repast (The Recursive Porous Agent Simulation Toolkit).
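The individual-learning side described in the abstract (the keywords name Q-Learning) can be sketched as a tabular Q-learning update for a single pursuer. This is a minimal illustration, not the thesis's actual design: the state encoding, action set, and learning parameters below are assumptions.

```python
import random
from collections import defaultdict

# Assumed action set and parameters; the thesis's own state/action/reward
# design is described in its Sections 3.1.1-3.1.3.
ACTIONS = ["up", "down", "left", "right"]
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

Q = defaultdict(float)  # Q[(state, action)] -> estimated value, default 0.0

def choose_action(state):
    """Epsilon-greedy policy: mostly exploit the Q-table, sometimes explore."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """One Q-learning backup: Q(s,a) += alpha*(r + gamma*max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```

During training, each pursuer would repeatedly call `choose_action`, observe the reward (e.g. for closing in on or catching the evader), and call `update`, so that good pursuit moves accumulate value in the table.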
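The cooperative side uses case-based reasoning: past pursuit situations are stored as cases and the most similar one is retrieved and reused. The following is a hedged sketch of the retrieval step only; the case fields (relative positions paired with a joint action) and the Manhattan-distance similarity are illustrative assumptions, not the thesis's actual case representation.

```python
# A case pairs a situation (here: a flattened vector of relative
# positions of pursuers and evader) with the joint action that worked.

def similarity(a, b):
    """Higher is more similar: negative Manhattan distance between vectors."""
    return -sum(abs(x - y) for x, y in zip(a, b))

def retrieve(case_base, query):
    """Return the stored case whose situation is most similar to the query."""
    return max(case_base, key=lambda case: similarity(case["situation"], query))

case_base = [
    {"situation": (2, 1, -1, 3), "actions": ("up", "left")},
    {"situation": (0, 4, 2, -2), "actions": ("right", "down")},
]

best = retrieve(case_base, (2, 0, -1, 2))  # closest match is the first case
```

In a full CBR cycle (as in the thesis's Sections 3.2.1-3.2.5) the retrieved joint action would then be reused, and the case revised and stored back if the outcome differs.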

Abstract (Chinese) ii
Abstract (English) iii
Acknowledgements iv
Table of Contents v
List of Tables vii
List of Figures viii
Chapter 1 Introduction 1
1.1 Preface 1
1.2 Motivation and Objectives 1
1.3 Contributions 3
1.4 Thesis Organization 3
Chapter 2 Related Work 4
2.1 Reinforcement Learning 4
2.2 Case-Based Reasoning 5
2.3 Multi-Agent Systems 7
2.4 Pursuit-Evasion Game 8
Chapter 3 Agent Learning and Cooperation Methods 11
3.1 Individual Agent Learning 11
3.1.1 State 12
3.1.2 Action 14
3.1.3 Reward 15
3.2 Cooperative Agent Learning 16
3.2.1 Case Representation 16
3.2.2 Case Retrieval 17
3.2.3 Action Generation 18
3.2.4 Case Reuse 21
3.2.5 Case Revision 21
3.3 Agent Learning Procedure 24
Chapter 4 Case Study 27
4.1 Problem Description 27
4.2 Evader Movement Strategies 28
4.2.1 Random Movement 28
4.2.2 Clockwise Movement 28
4.2.3 Counterclockwise Movement 29
4.2.4 Intelligent Movement 30
4.2.5 Evasive Movement 30
4.3 System Architecture 31
4.4 System Implementation 33
4.5 Comparison with Related Work 35
4.6 Experimental Results 37
4.6.1 Experiment 1 37
4.6.2 Experiment 2 38
4.6.3 Experiment 3 39
Chapter 5 Conclusions and Future Work 41
References 42



[1]A. Aamodt and E. Plaza, “Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches,” AI Communications, vol. 7(1), pp. 39-59, 1994.
[2]A. Antoniades, H.J. Kim, S. Sastry, “Pursuit-evasion strategies for teams of multiple agents with incomplete information,” IEEE Conference on Decision and Control, vol. 1, pp. 756-761, 2003.
[3]A. Swaminathan, and K. S. Barber, “An Experience-Based Assembly Sequence Planner for Mechanical Assemblies,” IEEE Transactions on Robotics and Automation, vol. 12(2), pp. 252-267, 1996.
[4]Andreas Kolling, Stefano Carpin, “Pursuit-Evasion on Trees by Robot Teams,” IEEE Transactions on Robotics, vol. 26(1), pp. 32-47, 2010.
[5]Chern Han Yong, Risto Miikkulainen, “Coevolution of Role-Based Cooperation in Multiagent Systems,” IEEE Transactions on Autonomous Mental Development, vol. 1(3), pp. 170-186, 2009.
[6]D. Bienstock, P. Seymour, “Monotonicity in Graph Searching,” Journal of Algorithms, vol. 12(2), pp. 239-245, 1991.
[7]D. Hladek, J. Vaščak, P. Sinčak, “Multi-Robot Control System for Pursuit-Evasion Problem,” Journal of Electrical Engineering, vol. 60(3), pp. 143–148, 2009.
[8]Damien Ernst, Mevludin Glavic, and Louis Wehenkel, “Power Systems Stability Control: Reinforcement Learning Framework,” IEEE Transactions on Power Systems, vol. 19(1), pp.427-435, 2004.
[9]Daniel Hennessy and David Hinkle, “Applying case-based reasoning to autoclave loading,” IEEE Expert, vol. 7(5), pp. 21-26, 1992.
[10]Ferber, Jacques, Multi-Agent System: An Introduction to Distributed Artificial Intelligence, Harlow: Addison Wesley Longman, 1999.
[11]G. Gottlob, N. Leone, F. Scarcello, “Robbers, Marshals, and Guards: Game Theoretic and Logical Characterizations of Hypertree Width,” Journal of Computer and System Sciences, vol. 66, pp. 775-808, 2003.
[12]G. Oshanin, O. Vasilyev, P. L. Krapivsky, and J. Klafter, “Survival of an evasive prey,” Proceedings of the National Academy of Sciences of the United States of America, vol. 106(33), pp. 13696-13701, 2009.
[13]H. Ahn and K. Kim, “Global optimization of case-based reasoning for breast cytology diagnosis,” Expert Systems with Applications, vol. 36(1), pp. 252-267, 1996.
[14]J. P. Hespanha, G. J. Pappas, and M. Prandini, “Greedy control for hybrid pursuit-evasion games,” In Proceedings of the European Control Conference, pp. 2621-2626, 2001.
[15]Janet L. Kolodner, “An Introduction to Case-Based Reasoning,” Artificial Intelligence Review, vol. 3, pp. 3-34, 1992.
[16]Jong Yih Kuo, Chien Feng Hsu, “Applying Assimilation and Accommodation for Cooperative Learning of Multi-Agent Pursuit-Evasion Strategies,” 2010 International Conference on Manufacturing and Engineering Systems, 2010.
[17]Jong Yih Kuo, He Zhi Lin, “Cooperative RoboCup Agents Using Genetic Case-Based Reasoning,” IEEE International Conference on Systems, Man and Cybernetics, pp. 613-618, 2008.
[18]K. P. Sycara, “Multiagent Systems”, AI Magazine, vol. 19, pp. 79-92, 1998.
[19]L. J. Guibas, J.-C. Latombe, S. M. LaValle, D. Lin, and R. Motwani, “A visibility-based pursuit-evasion problem,” International Journal of Computational Geometry and Applications, vol.9, pp. 471-493, 1999.
[20]L.M. Kirousis, C.H. Papadimitriou, “Interval Graphs and Searching,” Discrete Mathematics, vol. 55, pp. 181-184, 1985.
[21]L.M. Kirousis, C.H. Papadimitriou, “Searching and Pebbling,” Theoretical Computer Science, vol.47, pp. 205-218, 1986.
[22]Larry M. Stephens, Matthias B. Merx, “The effect of agent control strategy on the performance of a dai pursuit problem,” In Proceedings of the 10th International Workshop on DAI, 1990.
[23]Leslie Pack Kaelbling, Michael L. Littman, and Andrew W. Moore, “Reinforcement Learning: A Survey,” Journal of Artificial Intelligence Research, vol. 4, pp. 237-285, 1996.
[24]M. Aigner, M. Fromme, “A Game of Cops and Robbers”, Discrete Applied Mathematics, vol. 8, pp. 1-11, 1984.
[25]M. Yamashita, H. Umemoto, I. Suzuki, and T. Kameda, “Searching for Mobile Intruders in a Polygonal Region by a Group of Mobile Searchers,” Algorithmica, vol. 31(2), pp. 208-236, 2001.
[26]Michelle McPartland and Marcus Gallagher, “Reinforcement Learning in First Person Shooter Games,” IEEE Transactions on Computational Intelligence and AI in Games, vol. 3(1), pp. 43-56, 2011.
[27]N. Vlassis, “A Concise Introduction to Multiagent Systems and Distributed Artificial Intelligence,” Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan and Claypool, 2007.
[28]N.N. Petrov, M.A. Teteryatnikova, “Some Problems of the Search on Graphs with Retaliation,” Vestnik St. Petersburg University: Mathematics, vol. 37, pp. 37-43, 2004.
[29]Nan-Peng Yu, Chen-Ching Liu, and James Price, “Evaluation of Market Rules Using a Multi-Agent System Method,” IEEE Transactions on Power Systems, vol. 25(1), pp. 470-479, 2010.
[30]P. Dayan and B.W. Balleine, “Reward, Motivation, Review and Reinforcement Learning,” Neuron, vol. 36, pp. 285-298, 2002.
[31]P. Perner, “An architecture for a CBR image segmentation system,” Engineering Applications of Artificial Intelligence, vol. 12(6), pp. 749-759, 1999.
[32]P.G. Balaji and D. Srinivasan, “Multi-Agent System in Urban Traffic Signal Control,” IEEE Computational Intelligence Magazine, vol. 5(4), pp. 43-51, 2010.
[33]R. Nowakowski, P. Winkler, “Vertex-to-Vertex Pursuit in a Graph,” Discrete Mathematics, vol. 43, pp. 235-239, 1983.
[34]R. R. Brooks, Jing-En Pang, and C. Griffin, “Game and Information Theory Analysis of Electronic Countermeasures in Pursuit-Evasion Games,” IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, vol. 38(6), pp. 1281-1294, 2008.
[35]R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA, 1998.
[36]R. Vidal, O. Shakernia, H.J. Kim, D.H. Shim, S. Sastry, “Probabilistic pursuit-evasion games: theory, implementation, and experimental evaluation,” IEEE Transactions on Robotics and Automation, vol. 18, pp. 662-669, 2002.
[37]Robert H. Crites and Andrew G. Barto, “Improving Elevator Performance Using Reinforcement Learning,” Advances in neural information processing systems 8, pp. 1017-1023, 1996.
[38]S.F. Railsback, S.L. Lytinen, S.K. Jackson, “Agent-Based Simulation Platforms: Review and Development Recommendations,” SIMULATION, vol. 82(9), pp. 609-623, 2006.
[39]Shaunak D. Bopardikar, Francesco Bullo, and Joao P. Hespanha, “On Discrete-Time Pursuit-Evasion Games With Sensing Limitations,” IEEE Transactions on Robotics, vol. 24(6), pp. 1429–1439, 2008.
[40]Shin I. Nishimura, Takashi Ikegami, “Emergence of collective strategies in a prey-predator game model,” Artificial Life, vol. 3, pp. 243-260, 1997.
[41]Shotaro Kamio and Hitoshi Iba, “Adaptation Technique for Integrating Genetic Programming and Reinforcement Learning for Real Robots,” IEEE Transactions on Evolutionary Computation, vol. 9(3), pp. 318-333, 2005.
[42]T. Haynes and S. Sen, “Evolving behavioral strategies in predators and prey,” Adaptation and Learning in Multiagent Systems, Lecture Notes in Artificial Intelligence, pp. 113-126, 1995.
[43]T. Mukhopadhyay, S.S. Vicinanza, and M.J. Prietula, “Examining the Feasibility of a Case-Based Reasoning Model for Software Effort Estimation,” MIS Quarterly, vol. 16(2), pp. 155-171, 1992.
[44]T. Parsons, “Pursuit-Evasion in a Graph,” Theory and Applications of Graphs, Lecture Notes in Mathematics, vol. 642, pp. 426-441, 1976.
[45]V.Y. Andrianov, N.N. Petrov, “Graph Searching Problems with the Counteraction, in: Game Theory and Applications,” Game theory and applications, vol.10, pp. 1-12, 2005.
[46]Xu Chu Ding, Amir R. Rahmani and Magnus Egerstedt, “Multi-UAV Convoy Protection: An Optimal Approach to Path Planning and Coordination,” IEEE Transactions on Robotics, vol. 26(2), pp. 256-268, 2010.
[47]http://ascape.sourceforge.net/
[48]http://ccl.northwestern.edu/netlogo
[49]http://education.mit.edu/starlogo
[50]http://repast.sourceforge.net/
[51]http://www.cs.gmu.edu/~eclab/projects/mason/
[52]http://www.swarm.org

