National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

Author: 王亦凡
Author (English): I-Fan Wang
Thesis Title: 深度強化學習於適應性號誌控制之研究
Thesis Title (English): Research on Deep Reinforcement Learning for Adaptive Traffic Signal Control
Advisor: 陳惠國
Advisor (English): Huey-Kuo Chen
Degree: Master's
Institution: 國立中央大學 (National Central University)
Department: Department of Civil Engineering
Discipline: Engineering
Field of Study: Civil Engineering
Thesis Type: Academic thesis
Year of Publication: 2024
Graduation Academic Year: 112 (ROC calendar, i.e., 2023-2024)
Language: Chinese
Number of Pages: 56
Keywords (Chinese): 適應性號誌控制, 深度強化學習, Rainbow DQN, 交通模擬
Keywords (English): adaptive signal control, deep reinforcement learning, Rainbow DQN, traffic simulation
Abstract:
This study explores the application of deep reinforcement learning to adaptive traffic signal control. Using the microscopic traffic simulation software Vissim, we simulate traffic conditions at intersections in Taipei City during peak hours. Taking into account passenger-car-equivalent factors for different vehicle types and the two-stage left-turn design for motorcycles, we construct an adaptive traffic signal control system based on a deep reinforcement learning algorithm to improve current traffic conditions at urban intersections during peak periods.
The framework employs the Rainbow DQN deep reinforcement learning network as the decision model for the signal control system. The state considers movement-based traffic flow conditions together with the current phase state; the actions are switching to the next phase in the timing sequence and extending the current green time; and the reward is designed to minimize the total intersection pressure. Intersection performance is compared against a fixed-time signal plan as the baseline.
In the experimental design, the morning and evening peaks are each split into three different time-period scenarios for training. The results show that deep reinforcement learning applied to adaptive signal control does reduce queue lengths at the intersection, converging within 100 episodes in every experimental scenario and improving performance by 50% during the busiest morning-peak period. The model also adapts to the varying traffic volumes of the different urban peak periods considered in this study, and the flexible state, action, and reward design allows the model to be generalized to other scenarios.
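To make the state, action, and reward design described above more concrete, here is a minimal Python sketch of a pressure-style reward and a movement-based state vector. It is an illustrative reconstruction under the common max-pressure convention (incoming minus outgoing vehicle counts per movement, in passenger-car equivalents), not code from the thesis; names such as MovementState, build_state, EXTEND_GREEN, and SWITCH_PHASE are hypothetical placeholders.

```python
# Minimal sketch (not the thesis implementation) of a movement-based
# "intersection pressure" reward and a flow-plus-phase state vector for
# adaptive signal control. All names and structures are illustrative only.

from dataclasses import dataclass
from typing import List


@dataclass
class MovementState:
    """Vehicle counts for one traffic movement, in passenger-car equivalents."""
    incoming: float  # queued/approaching vehicles on the incoming lane group
    outgoing: float  # vehicles on the downstream (outgoing) lane group


def intersection_pressure(movements: List[MovementState]) -> float:
    """Total pressure = sum over movements of (incoming - outgoing).

    Using the negative of this quantity as the reward pushes the controller
    to discharge the most heavily loaded approaches first.
    """
    return sum(m.incoming - m.outgoing for m in movements)


def build_state(movements: List[MovementState], phase_index: int,
                n_phases: int, elapsed_green: float) -> List[float]:
    """Movement-based flow state plus a one-hot encoding of the current phase."""
    phase_one_hot = [1.0 if i == phase_index else 0.0 for i in range(n_phases)]
    flows = [x for m in movements for x in (m.incoming, m.outgoing)]
    return flows + phase_one_hot + [elapsed_green]


# Two action types, matching the control scheme described in the abstract:
EXTEND_GREEN, SWITCH_PHASE = 0, 1


def reward(movements: List[MovementState]) -> float:
    return -intersection_pressure(movements)
```

In the full design described in the abstract, a state vector of this kind would feed the Rainbow DQN decision model, and at each decision point the chosen action would either extend the current green or switch to the next phase in the timing sequence.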
Abstract (Chinese) i
Abstract (English) ii
Acknowledgements iii
Table of Contents iv
List of Figures vi
List of Tables vii
Chapter 1 Introduction 1
Chapter 2 Literature Review 4
2.1 Applications of Reinforcement Learning in Signal Control 4
2.2 Neural Network Architecture Design 5
2.3 Reinforcement Learning Mechanisms 5
Chapter 3 Research Methodology 8
3.1 Deep Reinforcement Learning Algorithms 8
3.2 Value-Based Rainbow DQN 8
3.2.1 Double DQN 9
3.2.2 Prioritized Experience Replay 10
3.2.3 Dueling Network 11
3.2.4 Distributional DQN 12
3.2.5 Noisy Net 13
3.2.6 n-Step Learning 14
Chapter 4 Model and Experimental Design 16
4.1 Reinforcement Learning Model Design 16
4.1.1 Agent Design 16
4.1.2 Neural Network Architecture 19
4.1.3 Training Procedure 24
4.2 Research Scope 27
4.2.1 Data Used 28
4.2.2 Simulation Software 29
4.3 Experimental Design 29
4.4 Signal Control in the Simulation Scenarios 30
Chapter 5 Experimental Training Results 31
5.1 Training Performance 31
5.1.1 Queue Length and Stopped Delay 31
5.1.2 Vehicle Count Analysis 33
5.1.3 Loss Analysis 35
5.2 Comparison of Vehicle-Equivalent Settings 36
Chapter 6 Conclusions and Recommendations 37
6.1 Conclusions 37
6.2 Recommendations 38
Chapter 7 References 39
Appendix 43