National Digital Library of Theses and Dissertations in Taiwan

Author: Li Han Chang (李漢昌)
Thesis title: Action-Dependent Heuristic Dynamic Programming-Based Self-Learning Balance Control System-The Example of Cart-Pole and Beam-Ball Balance Problems
Thesis title (Chinese): 基於ADHDP架構之自學平衡控制系統-以車桿和樑球平衡問題為例
Advisor: Shiah C. Y. (夏傳儀)
Degree: Master's
Institution: Fo Guang University (佛光大學)
Department: Informatics (資訊學系)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Publication year: 2009
Academic year of graduation: 97 (ROC calendar)
Language: Chinese
Pages: 60
Keywords: Machine Learning; Balance Control; Reinforcement Learning; Adaptive Critic Designs; Cart-Pole Balance Problem; Beam-Ball Balance Problem
Machine learning is one of the important research areas in artificial intelligence, and advances in science and technology have gradually made its realization possible. Aiming at self-learning machine control, this thesis draws on concepts close to reinforcement learning to design a self-learning balance control system. The system adopts ADHDP (Action-Dependent Heuristic Dynamic Programming), one of the adaptive critic designs, as its architecture. In the experiments, the system teaches itself to balance in simulation on two tasks: the cart-pole balance problem and the beam-ball balance problem. Both are classic problems in control theory, and because the structure of the two systems is not complicated, they are widely used to validate control theories and in teaching and research.
In this research, the process by which the system learns to balance is presented through computer simulation, with the simulation interface written in MATLAB. The results show that the ADHDP-based self-learning balance control system can quickly learn to balance both the cart-pole system and the beam-ball system from arbitrary initial system states.
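Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation
1.2 Research Objectives
1.3 Research Scope
1.4 Research Process
1.5 Thesis Organization
Chapter 2 Theoretical Background and the ADHDP Algorithm Architecture
2.1 Development of Adaptive Critic Design Theory
2.2 HDP Algorithm Architecture
2.3 ADHDP Algorithm Architecture
Chapter 3 Problem Study and Design of the Self-Learning Balance Control System
3.1 Cart-Pole Balance Problem
3.1.1 Cart-Pole Balance System Model and Problem Description
3.1.2 Dynamic Equations of the Cart-Pole Balance System
3.1.3 Literature Review of Control Methods for the Cart-Pole Balance System
3.1.4 Research Method for the Self-Learning Cart-Pole Balance Problem
3.2 Beam-Ball Balance Problem
3.2.1 Beam-Ball Balance System Model and Problem Description
3.2.2 Dynamic Equations of the Beam-Ball Balance System
3.2.3 Literature Review of Control Methods for the Beam-Ball Balance System
3.2.4 Research Method for the Self-Learning Beam-Ball Balance Problem
3.3 Design of the Self-Learning Balance Control System
3.3.1 Action Network Design
3.3.2 Critic Network Design
3.3.3 Self-Learning Balance Control System Architecture and Learning Algorithm
Chapter 4 Experimental Results and Analysis
4.1 Self-Learning Control Experiments on the Cart-Pole Balance System
4.1.1 Design and Description of the Self-Learning Cart-Pole Balance Control Experiment
4.1.2 Results and Analysis of the Self-Learning Cart-Pole Balance Control Experiment
4.2 Self-Learning Control Experiments on the Beam-Ball Balance System
4.2.1 Design and Description of the Self-Learning Beam-Ball Balance Control Experiment
4.2.2 Results and Analysis of the Self-Learning Beam-Ball Balance Control Experiment
Chapter 5 Conclusions and Future Research Directions
5.1 Conclusions
5.2 Future Research Directions
References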