
臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)


Detailed Record

Author: 王鼎鈞 (WANG, TING CHUN)
Title: 基於深度強化學習方法之自駕車坑洞避障
Title (English): Pothole Obstacle Avoidance Based on Deep Reinforcement Learning Method
Advisor: 賴冠廷 (LAI, KUAN-TING)
Committee Members: 黃育賢 (HWANG, YUH-SHYAN), 黃志勝 (HUANG, CHIH-SHENG), 賴冠廷 (LAI, KUAN-TING)
Date of Oral Defense: 2022-07-14
Degree: Master's
Institution: 國立臺北科技大學 (National Taipei University of Technology)
Department: Electronic Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis Type: Academic thesis
Year of Publication: 2022
Graduating Academic Year: 110 (2021-2022)
Language: English
Pages: 44
Keywords (Chinese): 坑洞避障, 強化學習, 自動駕駛
Keywords (English): Pothole Obstacle Avoidance, Reinforcement Learning, Self-drive
Statistics:
  • Cited by: 0
  • Views: 523
  • Downloads: 79
  • Bookmarked: 0
Abstract:
As technology matures, autonomous driving has advanced considerably. Self-driving vehicles can already turn, change lanes, and accelerate on their own, but many situations remain to be handled. Chief among them, a vehicle must monitor the surrounding road conditions at all times while driving to avoid hazards such as striking a pothole. Potholes form as road surfaces age and deteriorate under long-term use, and a vehicle that fails to notice one can hit it, putting the vehicle and its passengers in danger. This thesis therefore proposes training a self-driving car to avoid potholes with deep reinforcement learning; for safety reasons, the avoidance task is simulated in a virtual environment rather than on real roads.

To ensure data diversity and flexibility, the experiments build pothole road models in the virtual environment that match real-world pothole dimensions and use them as the self-driving car's training environment. Three algorithms, DQN, A2C, and PPO, are trained on the same task and compared to find the best-performing approach.
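As a concrete illustration of the training setup the abstract describes, the sketch below wraps a simulated pothole road in a minimal Gym environment (the API that Stable-Baselines3 1.x expects) and trains the three compared algorithms on it. The environment class, observation shape, action set, and reward logic are illustrative placeholders, not the thesis's actual implementation, which streams camera frames from AirSim/Unreal Engine.

```python
import gym
import numpy as np
from stable_baselines3 import A2C, DQN, PPO

class PotholeAvoidanceEnv(gym.Env):
    """Hypothetical stand-in for the thesis's AirSim-based environment."""

    def __init__(self):
        super().__init__()
        # Camera frames as observations, as SB3's CnnPolicy expects.
        self.observation_space = gym.spaces.Box(
            low=0, high=255, shape=(84, 84, 1), dtype=np.uint8)
        # A small discrete action set (e.g., steering choices) is assumed.
        self.action_space = gym.spaces.Discrete(5)

    def reset(self):
        return self._frame()

    def step(self, action):
        # In the real setup: apply the action in the simulator, then
        # reward forward progress and penalize driving into a pothole.
        obs, reward, done = self._frame(), 0.0, False  # placeholders
        return obs, reward, done, {}

    def _frame(self):
        return np.zeros(self.observation_space.shape, dtype=np.uint8)

# Train the three algorithms compared in the thesis on the same task.
for algo in (DQN, A2C, PPO):
    kwargs = {"buffer_size": 50_000} if algo is DQN else {}  # modest replay buffer
    model = algo("CnnPolicy", PotholeAvoidanceEnv(), verbose=1, **kwargs)
    model.learn(total_timesteps=100_000)
    model.save(f"pothole_{algo.__name__.lower()}")
```

Keeping the three models behind one environment interface makes the comparison fair: each algorithm sees the same observations, action set, and reward signal, so differences in the learned avoidance behavior can be attributed to the algorithm itself.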

Table of Contents:
Chinese Abstract (摘要)
ABSTRACT
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1. Introduction
1.1 Research Background
1.2 Research Motivations
1.3 Thesis Contribution
1.4 Thesis Architecture
Chapter 2. Literature Discussion
2.1 Deep Learning
2.2 Convolutional Neural Networks
2.2.1 Convolution Layer
2.2.2 Pooling Layer
2.2.3 Fully Connected Layer
2.3 Reinforcement Learning
2.3.1 Agent
2.3.2 Environment
2.3.3 State
2.3.4 Action
2.3.5 Reward
2.3.6 Value-based Method
2.3.7 Policy-based Method
2.4 Deep Q-Learning
2.5 Proximal Policy Optimization
2.6 Advantage Actor Critic
Chapter 3. Data Construction
3.1 Rhinoceros 3D
3.2 Asphalt Concrete Pavement
3.3 Pothole Measurement
3.4 Single Pothole Simulation Design
3.5 Pothole Pavement Simulation Design
Chapter 4. System Implementation
4.1 System Architecture
4.2 Hardware
4.3 Unreal Engine
4.3.1 Simulated Environment
4.4 AirSim
4.4.1 Image API
4.4.2 FOV Setting
4.4.3 Action Set
4.5 Stable Baseline
4.5.1 CNN Policy
Chapter 5. Experiment Results
5.1 Deep Q-Learning
5.2 Proximal Policy Optimization
5.3 Advantage Actor Critic
Chapter 6. Conclusion and Future Work
References
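Chapter 4 of the outline covers AirSim's Image API (4.4.1), FOV setting (4.4.2), and action set (4.4.3). Below is a minimal sketch of that interaction using AirSim's published Python client; the camera name, throttle value, and steering discretization are assumptions for illustration, not the thesis's actual configuration. (Camera FOV is typically set in AirSim's settings.json via the FOV_Degrees capture setting rather than in code.)

```python
import airsim
import numpy as np

# Connect to an AirSim car simulation running inside Unreal Engine.
client = airsim.CarClient()
client.confirmConnection()
client.enableApiControl(True)

# Image API: fetch one uncompressed frame from front camera "0".
request = airsim.ImageRequest("0", airsim.ImageType.Scene,
                              pixels_as_float=False, compress=False)
response = client.simGetImages([request])[0]
frame = np.frombuffer(response.image_data_uint8, dtype=np.uint8)
frame = frame.reshape(response.height, response.width, 3)  # 3-channel in recent builds

# Action set: map a discrete action index to car controls.
STEERING = [-0.5, -0.25, 0.0, 0.25, 0.5]   # assumed discretization
controls = airsim.CarControls()
controls.throttle = 0.5
controls.steering = STEERING[2]             # e.g., drive straight ahead
client.setCarControls(controls)
```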

