臺灣博碩士論文加值系統

Author: 楊皓宇
Author (English): Hao-Yu Yang
Title: 深度強化學習之高速公路主線與匝道聯合儀控策略—以國道5號為例
Title (English): Developing Freeway Mainline Metering Policy by Deep Reinforcement Learning Combined with Ramp Metering — A Case Study of Freeway No.5
Advisor: 許添本
Advisor (English): Tien-Pen Hsu
Committee Members: 吳健生、胡守任
Committee Members (English): Chien-Sheng Wu, Shou-Ren Hu
Oral Defense Date: 2021-09-10
Degree: Master
Institution: 國立臺灣大學
Department: 土木工程學研究所
Discipline: Engineering
Field: Civil Engineering
Thesis Type: Academic thesis
Publication Year: 2021
Graduation Academic Year: 109 (ROC calendar)
Language: Chinese
Number of Pages: 122
Keywords (Chinese): 深度強化學習、主線儀控、匝道儀控
Keywords (English): Deep reinforcement learning, Mainline metering, Ramp metering
DOI: 10.6342/NTU202103596
Usage statistics:
  • Cited by: 1
  • Views: 327
  • Rating:
  • Downloads: 0
  • Bookmarked: 0
Freeway No. 5 is an important route linking eastern Taiwan. Because it passes directly through the Xueshan Range, its travel time is much shorter than that of other connecting roads, which attracts many tourists who use it to travel to the Yilan, Hualien, and Taitung areas and causes recurrent weekend congestion on Freeway No. 5. This congestion lowers the operating efficiency of the freeway and imposes substantial social costs. The Freeway Bureau has therefore proposed a number of demand management strategies, such as ramp metering, flexibly opening the shoulder as a bus-only lane, mainline metering, and high-occupancy vehicle controls.
The Freeway Bureau currently sets the ramp and mainline metering rates with a dynamic look-up method based on the length of the traffic queue. This study argues that the current management measures can be improved, and therefore uses deep reinforcement learning to construct a joint metering policy that mitigates or prevents congestion inside the Xueshan Tunnel. Deep reinforcement learning removes the need to assume a traffic flow model, and by feeding traffic features with spatial and temporal characteristics into a neural network it can make optimal decisions in a rapidly changing traffic environment.
This study proposes a joint metering model that combines deep reinforcement learning with ALINEA ramp metering and trains the policy in the Vissim traffic simulation software. The mainline metering agent observes the flow, speed, and density of each freeway section and makes optimal decisions in real time, working together with an independently operating ramp metering system to control the freeway's metering strategy. With the minimization of vehicle travel time as the objective, the negative number of vehicles in the network is used as the learning reward, and to prevent mainline metering from causing severe queuing on the freeway mainline, the number of vehicles waiting on the mainline is added as a penalty. The joint metering model converges after 500 training episodes and is then compared with the Freeway Bureau's current strategy and with the MRC PI-ALINEA model. Compared with the current strategy, the proposed joint policy lowers passenger car efficiency by 1.03% on average and raises bus efficiency by 16.53%; compared with the MRC PI-ALINEA model, passenger car efficiency falls by 25.13% on average and bus efficiency rises by 25.09%. This shows that the proposed approach widens the travel time gap between passenger cars and buses. Analysis of the space-time speed diagrams further shows that, within one hour after metering is activated, the proposed policy effectively relieves congestion inside the Xueshan Tunnel and raises the average section speed to 70 kph.
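To make the reward design above concrete, the following is a minimal sketch, not the thesis implementation: the agent is rewarded by the negative number of vehicles in the network (a proxy for total travel time) and penalized by the number of vehicles queued on the mainline. The function name and the penalty weight `w_queue` are assumptions for illustration only.

```python
# Minimal sketch of the reward described above (illustrative, not the thesis code).
# `w_queue` is a hypothetical penalty weight, not a value calibrated in the study.
def metering_reward(vehicles_in_network: int, mainline_queue: int,
                    w_queue: float = 0.5) -> float:
    """Reward: fewer vehicles in the network and a shorter mainline queue are better."""
    return -float(vehicles_in_network) - w_queue * float(mainline_queue)
```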
Freeway No. 5 is the most important road connecting eastern Taiwan. Because it passes directly through the Xueshan Range, travel time to the east is greatly reduced compared with other routes. As a result, many tourists use this freeway to travel to eastern Taiwan, causing recurrent weekend congestion. This congestion reduces the efficiency of freeway operation and brings huge social costs. To alleviate this problem, the Freeway Bureau has proposed many traffic demand management strategies, such as ramp metering, opening the shoulder as a bus lane, mainline metering, and high-occupancy vehicle controls.
However, the Freeway Bureau's current control policy for ramp metering and mainline metering is a dynamic look-up method based on the length of the traffic queue. This research argues that the current management measures can be improved, and therefore develops a metering control policy with a deep reinforcement learning method in order to relieve or avoid congestion in the Xueshan Tunnel. Deep reinforcement learning (DRL) is able to tackle dynamic and complex traffic control problems: it eliminates the need to assume a traffic flow model and can feed traffic data with temporal and spatial characteristics into the model through neural networks.
This research developed a freeway mainline metering policy by deep reinforcement learning combined with ramp metering, in which mainline metering is controlled by a DRL agent and ramp metering is controlled by the ALINEA algorithm. The Vissim traffic simulation software was used for model training and evaluation. The DRL agent makes decisions in real time based on the flow, speed, and density of each freeway section and works alongside the independently operating ramp-metering policy. Taking the minimization of vehicle travel time as the learning objective, the agent's reward is defined as a weighted sum of the negative number of vehicles in the network and the negative number of vehicles waiting on the freeway mainline. After 500 training episodes the model converged, and it was then compared with the current strategy and with the MRC PI-ALINEA model. Relative to the current strategy, the Q-ratio of passenger cars decreases by 1.03% and the Q-ratio of buses increases by 16.53%; relative to the MRC PI-ALINEA model, the Q-ratio of passenger cars decreases by 25.09% and that of buses increases by 25.13%. These results show that the proposed joint metering control policy widens the travel time difference between buses and passenger cars. Furthermore, one hour after metering starts, the average speed in the Xueshan Tunnel reaches 70 kph, showing that the policy effectively alleviates traffic congestion.
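For orientation, the ramp-metering side of the joint policy uses the ALINEA local feedback law (Papageorgiou et al., 1991, listed in the references below). The sketch that follows is a generic rendering of that law under assumed parameters; the gain, target occupancy, and rate bounds are placeholders rather than the values calibrated in the thesis.

```python
# Generic sketch of the ALINEA local feedback law (Papageorgiou et al., 1991).
# All numeric parameters are illustrative placeholders, not thesis calibrations.
def alinea_rate(prev_rate_vph: float, downstream_occupancy: float,
                target_occupancy: float = 0.18, gain_vph: float = 70.0,
                min_rate_vph: float = 200.0, max_rate_vph: float = 1800.0) -> float:
    """Update the metered on-ramp flow (veh/h) from the downstream occupancy error."""
    rate = prev_rate_vph + gain_vph * (target_occupancy - downstream_occupancy)
    return min(max_rate_vph, max(min_rate_vph, rate))
```

In the joint scheme described in the abstract, such a ramp controller runs independently alongside the DRL mainline-metering agent.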
Acknowledgements i
Abstract (Chinese) ii
Abstract iii
Table of Contents v
List of Figures viii
List of Tables xi
Chapter 1 Introduction 1
1.1 Research Background 1
1.2 Research Motivation 3
1.3 Research Objectives 4
1.4 Research Scope 5
1.5 Research Process 6
Chapter 2 Literature Review 8
2.1 Mainline Metering and Joint Metering 8
2.2 Ramp Metering Policies and Algorithms 10
2.2.1 Metering Policies 10
2.2.2 Ramp Metering Algorithms 13
2.3 Multi-Agent Reinforcement Learning 22
2.4 Summary 24
Chapter 3 Traffic Analysis and Network Construction for Freeway No. 5 26
3.1 Analysis of Current Traffic Conditions on Freeway No. 5 26
3.1.1 Data Description 26
3.1.2 Space-Time Speed Diagram Analysis of the Mainline 28
3.1.3 Bottleneck Analysis 32
3.1.4 Traffic Performance Analysis 36
3.2 Construction of the Freeway No. 5 Network 40
3.2.1 Data Collection for the Study Area 42
3.2.2 Vissim Network Construction 44
3.2.3 Calibration of Driving Behavior Parameters 46
3.2.4 Network Validation 52
Chapter 4 Methodology 54
4.1 Reinforcement Learning 54
4.2 Neural Networks 59
4.3 Deep Q-Network 64
4.4 Independent DQN (IDQN) for Multi-Agent Learning 67
4.5 Joint Freeway Mainline and Ramp Metering Model 68
4.5.1 Mainline Metering Model of the Joint Control 69
4.5.2 Ramp Metering Model of the Joint Control 74
4.6 Baseline Model 77
Chapter 5 Training Results and Scenario Analysis of the Joint Metering Model 81
5.1 Training Settings and Results 81
5.1.1 Training Settings 81
5.1.2 Training Results 84
5.2 Performance Comparison with the Current Real-World Strategy 91
5.3 Performance Comparison with the Baseline Model (MRC PI-ALINEA) 96
Chapter 6 Conclusions and Recommendations 101
6.1 Conclusions 101
6.2 Research Limitations 102
6.3 Recommendations 102
References 104
Appendix 1 111
Appendix 2 115
Appendix 3 119
蘇振維、張舜淵、楊幼文、歐陽恬恬(2017)。北宜運輸路廊供需體檢。交通部運輸研究所合作研究計畫(編號:MOTC-IOT-104-PBA036)。交通部運輸研究所。
交通部(2021)。高速公路行經各收費路段之通行量。取自https://stat.motc.gov.tw/mocdb/stmain.jsp?sys=100。
卓明君、陳廷才、張崇智、李興志(2016)。國5北上宜蘭至頭城正式實施大客車通行路肩及主線儀控措施。交通部臺灣區國道高速公路局交通管理組年度工作報告,未出版。
內政部國土測繪圖資服務雲(製圖者) (2021)。國道5號高速公路航照圖。取自https://maps.nlsc.gov.tw/。
交通部高速公路局(2010)。交通工程手冊(號誌、交通安全防護設施及照明篇)。https://www.freeway.gov.tw/Publish.aspx?cnid=3412&p=16578。
Neudorff, L. G., Randall, J., Reiss, R. A., & Gordon, R. L. (2003). Freeway management and operations handbook (No. FHWA-OP-04-003). United States. Federal Highway Administration. Office of Transportation Management.
Papageorgiou, M., Hadj-Salem, H., & Blosseville, J. M. (1991). ALINEA: A local feedback control law for on-ramp metering. Transportation research record, 1320(1), 58-67.
Masher, D. P., Ross, D. W., Wong, P. J., Tuan, P. L., Zeidler, H. M., & Petracek, S. (1975). Guidelines for design and operation of ramp control systems.
Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., ... & Hassabis, D. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484-489.
Vinyals, O., Babuschkin, I., Czarnecki, W. M., Mathieu, M., Dudzik, A., Chung, J., ... & Silver, D. (2019). Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature, 575(7782), 350-354.
Farazi, N. P., Ahamed, T., Barua, L., & Zou, B. (2020). Deep Reinforcement Learning and Transportation Research: A Comprehensive Review. arXiv preprint arXiv:2010.06187.
Jacobson, E. L., & Landsman, J. (1994). Case studies of US freeway-to-freeway ramp and mainline metering and suggested policies for Washington State (No. 1446).
Ghiasi, A., Hale, D., Bared, J., Kondyli, A., & Ma, J. (2018, November). A Dynamic Signal Control Approach for Integrated Ramp and Mainline Metering. In 2018 21st International Conference on Intelligent Transportation Systems (ITSC) (pp. 2892-2898). IEEE.
Jiang, N., & Tang, S. (2009, April). An Approach of Controlling Both Freeway On-ramp and Mainline. In 2009 International Joint Conference on Artificial Intelligence (pp. 683-686). IEEE.
譚偉倫(2018)。大眾運輸優先下之高速公路壅塞區段整合交通控制策略之研究。碩士論文。私立淡江大學運輸管理學系運輸科學碩士班。
Mizuta, A., Roberts, K., Jacobsen, L., Thompson, N., & Colyar, J. (2014). Ramp metering: a proven, cost-effective operational strategy: a primer (No. FHWA-HOP-14-020). United States. Federal Highway Administration.
Papageorgiou, M., & Papamichail, I. (2008). Overview of traffic signal operation policies for ramp metering. Transportation Research Record, 2047(1), 28-36.
Wang, Y., Kosmatopoulos, E. B., Papageorgiou, M., & Papamichail, I. (2014). Local ramp metering in the presence of a distant downstream bottleneck: Theoretical analysis and simulation study. IEEE Transactions on Intelligent Transportation Systems, 15(5), 2024-2039.
Sun, J., Li, T., Yu, M., & Zhang, H. M. (2018). Exploring the congestion pattern at long-queued tunnel sag and increasing the efficiency by control. IEEE Transactions on Intelligent Transportation Systems, 19(12), 3765-3774.
Taylor, C., Meldrum, D., & Jacobson, L. (1998). Fuzzy ramp metering: Design overview and simulation results. Transportation Research Record, 1634(1), 10-18.
鐘仁傑(2007)。模糊邏輯匝道儀控模式-細胞自動機之模擬分析。碩士論文。國立交通大學交通運輸研究所碩士班。
Yu, X. F., Xu, W. L., Alam, F., Potgieter, J., & Fang, C. F. (2012, November). Genetic fuzzy logic approach to local ramp metering control using microscopic traffic simulation. In 2012 19th International Conference on Mechatronics and Machine Vision in Practice (M2VIP) (pp. 290-297). IEEE.
吳榮顯(2003)。連續路口之適應性基因模糊邏輯號誌控制系統。碩士論文。國立交通大學交通運輸研究所碩士班。
Bellemans, T., De Schutter, B., & De Moor, B. (2004). Anticipative ramp metering control for freeway traffic networks. Available online from http://www.mtns2004.be/database/papersubmission/upload/320.pdf. Accessed August.
Hegyi, A., De Schutter, B., & Hellendoorn, H. (2005). Model predictive control for optimal coordination of ramp metering and variable speed limits. Transportation Research Part C: Emerging Technologies, 13(3), 185-209.
張鈞凱(2014)。高速公路可變速限聯合匝道儀控最佳化模式。碩士論文。國立台灣大學土木工程學研究所交通組。
Davarynejad, M., Hegyi, A., Vrancken, J., & van den Berg, J. (2011, October). Motorway ramp-metering control with queuing consideration using Q-learning. In 2011 14th International IEEE Conference on Intelligent Transportation Systems (ITSC) (pp. 1652-1658). IEEE.
Jacob, C., & Abdulhai, B. (2010). Machine learning for multi-jurisdictional optimal traffic corridor control. Transportation Research Part A: Policy and Practice, 44(2), 53-64.
Rezaee, K., Abdulhai, B., & Abdelgawad, H. (2013). Self-learning adaptive ramp metering: Analysis of design parameters on a test case in Toronto, Canada. Transportation research record, 2396(1), 10-18.
Fares, A., & Gomaa, W. (2014, June). Freeway ramp-metering control based on reinforcement learning. In 11th IEEE International Conference on Control & Automation (ICCA) (pp. 1226-1231). IEEE.
Zhou, Y., Ozbay, K., Kachroo, P., & Zuo, F. (2020). Ramp Metering for a Distant Downstream Bottleneck Using Reinforcement Learning with Value Function Approximation. Journal of Advanced Transportation, 2020.
Watkins, C. J. C. H. (1989). Learning from delayed rewards.
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. (2013). Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.
Lu, C., Huang, J., Deng, L., & Gong, J. (2017). Coordinated ramp metering with equity consideration using reinforcement learning. Journal of Transportation Engineering, Part A: Systems, 143(7), 04017028.
Rezaee, K. (2014). Decentralized coordinated optimal ramp metering using multi-agent reinforcement learning. University of Toronto (Canada).
Calvo, J. A., & Dusparic, I. (2018, August). Heterogeneous Multi-Agent Deep Reinforcement Learning for Traffic Lights Control. In AICS (pp. 2-13).
Foerster, J., Nardelli, N., Farquhar, G., Afouras, T., Torr, P. H., Kohli, P., & Whiteson, S. (2017, July). Stabilising experience replay for deep multi-agent reinforcement learning. In International conference on machine learning (pp. 1146-1155). PMLR.
Lin, Y., McPhee, J., & Azad, N. L. (2020). Comparison of deep reinforcement learning and model predictive control for adaptive cruise control. IEEE Transactions on Intelligent Vehicles, 6(2), 221-231.
交通部高速公路局(2021)。「交通資料庫」。取自https://tisvcloud.freeway.gov.tw/。
交通部高速公路局(2021)。交流道服務區里程一覽表。取自https://www.freeway.gov.tw/Publish.aspx?cnid=1906&p=4621。
蘇振維、楊幼文、林邏耀、歐陽恬恬、周諺鴻、李永駿、邱詩純、顏郁航、曾依蘋、翁忠川、高啟涵、蘇怡如、鐘靈、吳中銘(2016)。以大數據技術建置宜蘭地區交通管理預警機制之應用服務。交通部運輸研究所合作研究計畫(編號:IOT-104-PDF006)。交通部運輸研究所。
許添本、吳佳紋、謝宗軒、林育瑞 (2010)。利用流量守恆與密度變化尋找偵測器之錯誤。中華民國運輸學會99年年會暨學術論文國際研討會。
汪進財、邱孟佑(2011)。以交通狀態為基礎之遺漏值補正策略。運輸學刊, 23(2),239-270。
林豐博、蘇振維(2009)。國道 5 號雪山隧道車流特性之研究。運輸計劃季刊, 38(1),85-119。
林豐博、張舜淵、楊幼文、歐陽恬恬、謝秉叡、陳怡妏(2019)。公路交通系統模擬模式調校與新版容量手冊研訂(3/3)。交通部運輸研究所合作研究計畫(編號:MOTC-IOT-107-PEB011)。交通部運輸研究所。
Papageorgiou, M., & Kotsialos, A. (2002). Freeway ramp metering: An overview. IEEE transactions on intelligent transportation systems, 3(4), 271-281.
蘇振維、鄭嘉盈、呂怡青、林豐博、曾平毅、楊信毅、黃昶斌、張筱瑜(2012)。高快速公路收費站、隧道及坡度路段容量及車流特性研究(2/3)。交通部運輸研究所合作研究計畫(編號:MOTC-IOT-100-PEB011)。交通部運輸研究所。
交通部高速公路局(2021)。歷史統計分析軟體總旅行時間指標。取自http://61.60.107.9/TDCS/TravelTimeIndex。
Lee, J., Park, B., Won, J., & Yun, I. (2013). A Simplified Procedure for Calibrating Microscopic Traffic Simulation Models (No. 13-4190).
交通部高速公路局(2021)。交通管理措施。取自https://www.freeway.gov.tw/Publish.aspx?cnid=183。
江宜穎(2013)。高速公路壅塞模擬與主線速率漸變控制模式之研究。碩士論文。國立台灣大學土木工程學研究所交通組。
PTV system Software and Consulting GmbH. (2020). VISSIM-User Manual 2000. Karlsruhe, Germany.
Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction. MIT press.
Lewis, C. D. (1982). Industrial and business forecasting methods: A practical guide to exponential smoothing and curve fitting. Butterworth-Heinemann.
Bhandare, A., Bhide, M., Gokhale, P., & Chandavarkar, R. (2016). Applications of convolutional neural networks. International Journal of Computer Science and Information Technologies, 7(5), 2206-2215.
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., & Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529-533.
Van Hasselt, H., Guez, A., & Silver, D. (2016, March). Deep reinforcement learning with double q-learning. In Proceedings of the AAAI conference on artificial intelligence (Vol. 30, No. 1).
Schaul, T., Quan, J., Antonoglou, I., & Silver, D. (2015). Prioritized experience replay. arXiv preprint arXiv:1511.05952.
Tan, M. (1993). Multi-agent reinforcement learning: Independent vs. cooperative agents. In Proceedings of the tenth international conference on machine learning (pp. 330-337).
Matignon, L., Laurent, G. J., & Le Fort-Piat, N. (2012). Independent reinforcement learners in cooperative markov games: a survey regarding coordination problems. The Knowledge Engineering Review, 27(1), 1-31.
Spiliopoulou, A. D., Papamichail, I., & Papageorgiou, M. (2010). Toll plaza merging traffic control for throughput maximization. Journal of Transportation Engineering, 136(1), 67-76.
蘇振維、歐陽恬恬、林豐博、曾平毅、陳冠男、林佳韻(2016)。公路坡度路段模擬模式之發展及應用(3/3)。交通部運輸研究所合作研究計畫(編號:MOTC-IOT-104-PEB011)。交通部運輸研究所。