
臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)


Detailed Record

Author: 黃俊嵃
Author (English): Chun-Yan Huang
Title (Chinese): 基於強化學習技術之四軸無人機穩定控制的PID參數設計
Title (English): Stable Control for a Quadrotor with PID Coefficients Assigned by the Reinforcement Learning Technique
Advisor: 蔡智強
Oral Defense Committee: 袁世一, 劉建興
Oral Defense Date: 2017-07-27
Degree: Master's
Institution: 國立中興大學 (National Chung Hsing University)
Department: Electrical Engineering
Discipline: Engineering
Academic Field: Electrical and Information Engineering
Thesis Type: Academic thesis
Publication Year: 2017
Graduation Academic Year: 105 (2016-2017)
Language: Chinese
Pages: 40
Keywords (Chinese): 四軸無人機、強化學習、PID控制
Keywords (English): Quadrotor, Reinforcement Learning, PID control
Statistics:
  • Cited by: 1
  • Views: 890
  • Downloads: 0
Abstract (Chinese, translated):
In recent years, multirotor unmanned aerial vehicles have developed rapidly and are used for aerial photography, search and rescue, surveillance, and other tasks; during flight, however, they are easily disturbed by environmental factors. This thesis therefore studies how reinforcement learning can be used to improve the stability of quadrotor flight control. Reinforcement learning is one of the prominent learning methods in machine learning and artificial intelligence in recent years and the algorithm most commonly adopted for agent systems, making it well suited to unknown environments. In particular, the system does not need to build a model during the learning process, which avoids the up-front computation and training that model building would require. Concretely, after the agent takes an action according to its policy, the environment provides feedback, and the agent learns from this experience to optimize its overall performance. This thesis uses PID control parameters tuned by reinforcement learning to maintain a stable attitude in an unknown environment.
Abstract (English):
Due to the rapid development of unmanned aerial vehicles (UAVs), they can be applied in many areas, such as rescue, sports, and entertainment. However, the autonomous control of UAVs remains a difficult problem. The reinforcement learning (RL) method, a kind of unsupervised learning algorithm, is widely used in the motion control and motion learning of robots. Such a learning algorithm can adapt itself to model error even without building the model, and it can gradually approximate the real system through the learning process. Specifically, after taking an action according to the policy, the learning agent receives a numerical reward for every state transition from the environment and then learns from experience to improve its performance, eventually achieving optimal behavior. In this thesis, the reinforcement learning method is applied to the control of a quadrotor. In particular, we apply PID control coefficients tuned by reinforcement learning to a quadrotor so that it maintains a stable flying attitude in unknown environments.
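The abstracts describe the learning loop only at a high level. As a concrete illustration of the general idea, below is a minimal Python sketch in which a Gaussian-policy actor proposes PID gains, each episode on a toy one-axis attitude plant returns a cost, and a scalar baseline serves as the simplest possible critic (a REINFORCE-with-baseline scheme). The plant dynamics, gain ranges, learning rates, and cost function are all assumptions made for illustration; the thesis's actual state, action, and reward design may differ.

```python
import numpy as np

class PID:
    """Discrete PID controller (assumed textbook form, not the thesis's code)."""
    def __init__(self, kp, ki, kd, dt=0.01):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def control(self, error):
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def run_episode(gains, steps=200, dt=0.01):
    """Simulate a toy one-axis attitude plant and return the accumulated cost."""
    pid = PID(*gains, dt=dt)
    angle, rate, cost = 0.3, 0.0, 0.0      # start 0.3 rad away from the setpoint
    for _ in range(steps):
        torque = pid.control(0.0 - angle)  # regulate the angle to zero
        rate += torque * dt                # assumed unit-inertia dynamics
        angle += rate * dt
        cost += angle ** 2 * dt            # quadratic attitude-error cost
    return cost


# Actor: Gaussian policy over the gain vector. Critic: scalar cost baseline.
rng = np.random.default_rng(0)
mean = np.array([2.0, 0.1, 0.5])           # initial (Kp, Ki, Kd) guess (assumed)
sigma = 0.2                                # exploration noise (assumed)
baseline = run_episode(mean)               # critic's initial value estimate
alpha_actor, alpha_critic = 0.05, 0.1      # learning rates (assumed)

for episode in range(300):
    gains = np.maximum(mean + sigma * rng.standard_normal(3), 1e-3)
    cost = run_episode(gains)
    advantage = baseline - cost            # positive when sampled gains do better
    mean += alpha_actor * advantage * (gains - mean) / sigma ** 2
    baseline += alpha_critic * (cost - baseline)

print("learned gains (Kp, Ki, Kd):", np.round(mean, 3))
```

A full actor-critic implementation would replace the constant baseline with a learned value function over the quadrotor's state, but the sample-perturb-update structure sketched here is the same.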
Table of Contents:
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
Chapter 1: Introduction
  1.1 Motivation
  1.2 Related Work and Literature Review
  1.3 Thesis Organization
Chapter 2: System Architecture
  2.1 Quadrotor Hardware
Chapter 3: Reinforcement Learning
  3.1 Reinforcement Learning Framework
  3.2 Markov Decision Process (MDP)
  3.3 Value Functions
  3.4 Policy
  3.5 Temporal Difference Learning
  3.6 Actor-Critic Method
Chapter 4: Quadrotor Model
  4.1 Basic Quadrotor Structure
  4.2 Quadrotor Attitude Definition
  4.3 Three-Dimensional Coordinate Transformations
  4.4 Quaternion Operations
  4.5 Quadrotor Dynamics Model
    4.5.1 Motor Dynamics
  4.6 PID Control System
Chapter 5: PID Control Parameters Tuned by Reinforcement Learning
  5.1 Purpose of Using Reinforcement Learning
  5.2 Implementation of the Actor-Critic Method
Chapter 6: Experimental Results
  6.1 Quadrotor PID Test Platform Experiments
  6.2 Results of the RL-Tuned PID Control Parameters
Chapter 7: Conclusion and Future Work
References