National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

Graduate Student: Hsieh, Tsai-Feng (謝采峰)
Thesis Title: Deep Reinforcement Learning for Vehicle-reservation Assignment Problem (運用深度強化學習解決租賃車輛訂單配置問題)
Advisor: Jwo, Jung-Sing (周忠信)
Committee Members: Cheng, Yu-Chin; Lee, Cheng-Hsiung
Oral Defense Date: 2023-12-20
Degree: Master's
Institution: Tunghai University
Department: Master's Program of Digital Innovation
Discipline: Engineering
Academic Field: Other Engineering
Thesis Type: Academic thesis
Year of Publication: 2024
Academic Year of Graduation: 112 (2023-2024)
Language: Chinese
Number of Pages: 38
Keywords: Assignment problem; Empty repositioning; Deep reinforcement learning
Car rental is becoming an increasingly popular choice for young travelers, and surging demand during peak periods creates severe vehicle imbalances across rental stations. Rental companies must urgently reposition vehicles to meet the influx of reservations; these expensive empty repositions drive up operating costs sharply and have become a major pain point for operators. In the car rental industry, this imbalance is formalized as the vehicle-reservation assignment problem, which involves a set of known reservations to be allocated and a heterogeneous fleet. The fleet is partitioned into multiple vehicle groups, and each reservation specifies the pickup and return stations and times, together with the vehicle group it requires. The key objective is to assign reservations to suitable vehicles so as to minimize empty-reposition cost while maximizing overall revenue. This thesis proposes a two-stage optimization framework that combines a first-in-first-out (FIFO) strategy with deep reinforcement learning to progressively refine the reservation allocation. In simulated experiments, the proposed approach achieves 37% higher total reservation revenue than previous studies, and after model training the reservation-assignment step runs 18,421% faster.
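The abstract's first stage is a first-in-first-out strategy, but the record does not include the thesis body, so the sketch below is only an illustrative guess at what such a FIFO baseline could look like. The function name `fifo_assign`, the field names, and the availability rule are all assumptions for illustration, not the author's implementation.

```python
from collections import defaultdict
import heapq


def fifo_assign(reservations, vehicles):
    """Hypothetical FIFO first stage: process reservations in pickup-time
    order and serve each one with the vehicle of the requested group that
    has been idle at the pickup station the longest (first in, first out).

    vehicles: iterable of (vehicle_id, group, station)
    reservations: dicts with id, group, pickup_station, return_station,
                  pickup_time, return_time
    """
    # pools[(group, station)] is a min-heap of (available_time, vehicle_id),
    # so the vehicle idle the longest is always popped first.
    pools = defaultdict(list)
    for vid, group, station in vehicles:
        heapq.heappush(pools[(group, station)], (0, vid))

    assignments, unmatched = {}, []
    for res in sorted(reservations, key=lambda r: r["pickup_time"]):
        pool = pools[(res["group"], res["pickup_station"])]
        if pool and pool[0][0] <= res["pickup_time"]:
            _, vid = heapq.heappop(pool)
            assignments[res["id"]] = vid
            # After the trip, the vehicle waits at the return station
            # and becomes available again at the return time.
            heapq.heappush(pools[(res["group"], res["return_station"])],
                           (res["return_time"], vid))
        else:
            unmatched.append(res["id"])
    return assignments, unmatched
```

In the two-stage framework the abstract describes, reservations left unmatched by such a greedy pass (or assignments that force costly empty repositions) would presumably be what the second-stage deep reinforcement learning component reworks.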
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
Chapter 2 Literature Review
2.1 Related Literature
2.2 Related Techniques
Chapter 3 Reinforcement Learning Model Design
3.1 Problem Definition
3.2 Solution Design
3.2.1 FIFO
3.2.2 Moving Orders Combination
3.3 POMDP Modeling
3.4 RA2C
3.4.1 Policy Network
3.4.2 Value Network
3.4.3 Policy Gradient
Chapter 4 Experimental Analysis
4.1 Experimental Setup
4.2 Comparison of Results
Chapter 5 Conclusion
References
Electronic Full Text (publicly available online from 2029-01-04)