National Digital Library of Theses and Dissertations in Taiwan

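Graduate student: 洪愷尹 (Hong, Kai-Yin)
Thesis title (Chinese): 結合時序性集成與基於學習的軌跡聚合之多軌跡預測
Thesis title: Multi-modal Motion Prediction using Temporal Ensembling with Learning-based Aggregation
Advisor: 王傑智 (Wang, Chieh-Chih)
Committee members: 王傑智 (Wang, Chieh-Chih); 林文杰 (Lin, Wen-Chieh); 邱維辰 (Chiu, Wei-Chen); 孫民 (Sun, Min)
Oral defense date: 2024-04-30
Degree: Master's
Institution: National Yang Ming Chiao Tung University
Department: Department of Electrical Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2024
Graduation academic year: 112 (ROC calendar; 2023–2024)
Language: English
Pages: 38
Keywords (translated from Chinese): autonomous vehicles; multi-trajectory prediction; model ensembling; high-definition maps
Keywords: Autonomous driving; multi-modal motion prediction; DETR; model ensembling; high-definition map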
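Abstract (translated from Chinese): In recent years, deep-learning-based methods have drawn growing attention in trajectory prediction. However, these methods still face challenges in handling uncertainty and capturing multi-modal distributions. This paper introduces a new meta-algorithm, Temporal Ensembling with Learning-based Aggregation for multi-modal motion prediction, which aims to resolve the inconsistency between predictions at consecutive frames caused by missing behaviors. Unlike conventional model ensembling, it uses predictions from neighboring frames to increase the spatial coverage and diversity of the predictions. By integrating predictions from multiple frames, temporal ensembling can compensate for occasional errors in single-frame predictions. In addition, the trajectory-level aggregation used in conventional model ensembling tends to ignore changes in the traffic environment and easily admits incorrect behaviors among the candidate trajectories into the final prediction, which makes it insufficient for temporal ensembling. We therefore further emphasize the importance of learning-based trajectory aggregation, a strategy that builds on the mode queries of the DETR (DEtection TRansformer) architecture and exploits the characteristics of neighboring-frame predictions to strengthen temporal ensembling. Experiments on the Argoverse 2 dataset show that, compared with the current state-of-the-art model QCNet, our method reduces minADE by 4%, minFDE by 5%, and Miss Rate by 1.16%, highlighting its effectiveness and potential for autonomous driving applications.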
Recent years have seen a shift towards learning-based methods for trajectory prediction, yet handling uncertainty and capturing multi-modal distributions remain challenging. This paper introduces Temporal Ensembling with Learning-based Aggregation, a meta-algorithm designed to mitigate missing behaviors in trajectory prediction, which lead to inconsistent predictions across consecutive frames. Unlike conventional model ensembling, temporal ensembling leverages predictions from nearby frames to enhance spatial coverage and prediction diversity. By combining predictions from multiple frames, it compensates for occasional errors in individual frame predictions. Trajectory-level aggregation, often used in model ensembling, is insufficient for temporal ensembling because it does not consider traffic context and tends to carry candidate trajectories with incorrect driving behaviors into the final predictions. We therefore emphasize the necessity of learning-based aggregation, which uses mode queries within a DETR-like architecture and leverages the characteristics of predictions from nearby frames. Validated on the Argoverse 2 dataset, our method achieves a 4% reduction in minADE, a 5% reduction in minFDE, and a 1.16% reduction in miss rate compared to the strongest baseline, QCNet, highlighting its efficacy and potential for autonomous driving.
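The abstract reports results in minADE, minFDE, and miss rate without defining them. As a rough illustration only, the sketch below computes these metrics the way they are commonly defined for multi-modal motion forecasting benchmarks; it is not taken from the thesis. The function names, the toy trajectories, and the 2.0 m miss threshold are assumptions made for this example.

```python
import math


def displacement_errors(pred, gt):
    """Per-timestep Euclidean distance between two equally long (x, y) trajectories."""
    if len(pred) != len(gt):
        raise ValueError("trajectories must have the same number of timesteps")
    return [math.hypot(px - gx, py - gy) for (px, py), (gx, gy) in zip(pred, gt)]


def min_ade(preds, gt):
    """Lowest average displacement error among the K candidate trajectories."""
    per_candidate = (displacement_errors(p, gt) for p in preds)
    return min(sum(errs) / len(errs) for errs in per_candidate)


def min_fde(preds, gt):
    """Lowest final-timestep displacement error among the K candidate trajectories."""
    return min(displacement_errors(p, gt)[-1] for p in preds)


def miss_rate(scenarios, threshold=2.0):
    """Fraction of (preds, gt) scenarios whose minFDE exceeds the threshold.

    The 2.0 m default is an assumption for this sketch, not a value from the thesis.
    """
    misses = sum(1 for preds, gt in scenarios if min_fde(preds, gt) > threshold)
    return misses / len(scenarios)


# Hypothetical toy data: one ground-truth path and two candidate predictions.
gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
close = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]  # constant 1 m lateral offset
far = [(0.0, 0.0), (1.0, 3.0), (2.0, 4.0)]    # drifts away from the ground truth
preds = [close, far]

# Second scenario keeps only the drifting candidate, so it counts as a miss.
scenarios = [(preds, gt), ([far], gt)]

print(min_ade(preds, gt), min_fde(preds, gt), miss_rate(scenarios))
```

Because all three metrics take a minimum over the candidate set, adding a good candidate can only lower them, which is why enlarging and diversifying the candidate pool (as temporal ensembling aims to do) can help before the final aggregation step.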
Abstract (in Chinese)
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
1 Introduction
2 Related Works
2.1 Multi-modal Motion Prediction
2.2 Trajectories Ensembling Techniques
3 Proposed Method
3.1 Base Model
3.1.1 Input Output Formulation
3.1.2 Encoder
3.1.3 Decoder - DETR-like Architecture
3.1.4 Training Objectives
3.2 Temporal Ensembling
3.2.1 Naive Approach with Trajectory-level Aggregation
3.2.2 Learning-based Aggregation
3.2.3 Two Operations within Learning-based Aggregation
3.2.4 Training Pipeline
4 Experimental Results
4.1 Experimental Setup
4.1.1 Dataset and Streaming-Style Formulation
4.1.2 Evaluation Metrics
4.2 Quantitative Results
4.3 Analysis of Computational Overhead
4.4 Qualitative Results
4.5 Ablation Study
4.5.1 Ablation Study I - Different Ensembling and Aggregation
4.5.2 Ablation Study II - Alternative Base Model with DETR-like Decoder
4.5.3 Ablation Study III - The effect of increasing the number of mode queries to expand the pool of prediction candidates
5 Conclusion and Future Work
References
[1] Dean A Pomerleau. “Alvinn: An autonomous land vehicle in a neural network”. In: Advances in neural information processing systems 1 (1988).
[2] Stéphanie Lefèvre, Dizan Vasquez, and Christian Laugier. “A survey on motion prediction and risk assessment for intelligent vehicles”. In: ROBOMECH journal 1.1 (2014), pp. 1–14.
[3] Wei Zhan, Arnaud de La Fortelle, Yi-Ting Chen, Ching-Yao Chan, and Masayoshi Tomizuka. “Probabilistic prediction from planning perspective: Problem formulation, representation simplification and evaluation metric”. In: 2018 IEEE intelligent vehicles symposium (IV). IEEE. 2018, pp. 1150–1156.
[4] Jaume Barceló et al. Fundamentals of traffic simulation. Vol. 145. Springer, 2010.
[5] Jiyang Gao, Chen Sun, Hang Zhao, Yi Shen, Dragomir Anguelov, Congcong Li, and Cordelia Schmid. “Vectornet: Encoding hd maps and agent dynamics from vectorized representation”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020, pp. 11525–11533.
[6] Ming Liang, Bin Yang, Rui Hu, Yun Chen, Renjie Liao, Song Feng, and Raquel Urtasun. “Learning lane graph representations for motion forecasting”. In: Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part II 16. Springer. 2020, pp. 541–556.
[7] Jiquan Ngiam, Benjamin Caine, Vijay Vasudevan, Zhengdong Zhang, Hao-Tien Lewis Chiang, Jeffrey Ling, Rebecca Roelofs, Alex Bewley, Chenxi Liu, Ashish Venugopal, et al. “Scene Transformer: A unified architecture for predicting multiple agent trajectories”. In: arXiv preprint arXiv:2106.08417 (2021).
[8] Balakrishnan Varadarajan, Ahmed Hefny, Avikalp Srivastava, Khaled S Refaat, Nigamaa Nayakanti, Andre Cornman, Kan Chen, Bertrand Douillard, Chi Pang Lam, Dragomir Anguelov, et al. “Multipath++: Efficient information fusion and trajectory aggregation for behavior prediction”. In: 2022 International Conference on Robotics and Automation (ICRA). IEEE. 2022, pp. 7814–7821.
[9] Zikang Zhou, Luyao Ye, Jianping Wang, Kui Wu, and Kejie Lu. “Hivt: Hierarchical vector transformer for multi-agent motion prediction”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022, pp. 8823–8833.
[10] Zikang Zhou, Jianping Wang, Yung-Hui Li, and Yu-Kai Huang. “Query-Centric Trajectory Prediction”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023, pp. 17863–17873.
[11] Xishun Wang, Tong Su, Fang Da, and Xiaodong Yang. “ProphNet: Efficient agent-centric motion forecasting with anchor-informed proposals”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023, pp. 21995–22003.
[12] Shaoshuai Shi, Li Jiang, Dengxin Dai, and Bernt Schiele. “Motion transformer with global intention localization and local movement refinement”. In: Advances in Neural Information Processing Systems 35 (2022), pp. 6531–6543.
[13] Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, and Sergey Zagoruyko. “End-to-end object detection with transformers”. In: European conference on computer vision. Springer. 2020, pp. 213–229.
[14] Benjamin Wilson, William Qi, Tanmay Agarwal, John Lambert, Jagjeet Singh, Siddhesh Khandelwal, Bowen Pan, Ratnesh Kumar, Andrew Hartnett, Jhony Kaesemodel Pontes, Deva Ramanan, Peter Carr, and James Hays. “Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting”. In: Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2). 2021. url: https://openreview.net/forum?id=vKQGe36av4k.
[15] Alexander Barth and Uwe Franke. “Where will the oncoming vehicle be the next second?” In: 2008 IEEE Intelligent Vehicles Symposium. IEEE. 2008, pp. 1068–1073.
[16] Sajjad Mozaffari, Omar Y Al-Jarrah, Mehrdad Dianati, Paul Jennings, and Alexandros Mouzakitis. “Deep learning-based vehicle behavior prediction for autonomous driving applications: A review”. In: IEEE Transactions on Intelligent Transportation Systems 23.1 (2020), pp. 33–47.
[17] Yanjun Huang, Jiatong Du, Ziru Yang, Zewei Zhou, Lin Zhang, and Hong Chen. “A survey on trajectory-prediction methods for autonomous driving”. In: IEEE Transactions on Intelligent Vehicles 7.3 (2022), pp. 652–674.
[18] Long Chen, Yuchen Li, Chao Huang, Bai Li, Yang Xing, Daxin Tian, Li Li, Zhongxu Hu, Xiaoxiang Na, Zixuan Li, et al. “Milestones in Autonomous Driving and Intelligent Vehicles: Survey of Surveys”. In: IEEE Transactions on Intelligent Vehicles 8.2 (2023), pp. 1046–1056.
[19] Henggang Cui, Vladan Radosavljevic, Fang-Chieh Chou, Tsung-Han Lin, Thi Nguyen, Tzu-Kuo Huang, Jeff Schneider, and Nemanja Djuric. “Multimodal trajectory predictions for autonomous driving using deep convolutional networks”. In: 2019 International Conference on Robotics and Automation (ICRA). IEEE. 2019, pp. 2090–2096.
[20] ByeoungDo Kim, Seong Hyeon Park, Seokhwan Lee, Elbek Khoshimjonov, Dongsuk Kum, Junsoo Kim, Jeong Soo Kim, and Jun Won Choi. “Lapred: Lane-aware prediction of multi-modal future trajectories of dynamic agents”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021, pp. 14636–14645.
[21] Zhiyu Huang, Xiaoyu Mo, and Chen Lv. “Multi-modal motion prediction with transformer-based neural network for autonomous driving”. In: 2022 International Conference on Robotics and Automation (ICRA). IEEE. 2022, pp. 2605–2611.
[22] Thomas Gilles, Stefano Sabatini, Dzmitry Tsishkou, Bogdan Stanciulescu, and Fabien Moutarde. “Home: Heatmap output for future motion estimation”. In: 2021 IEEE International Intelligent Transportation Systems Conference (ITSC). IEEE. 2021, pp. 500–507.
[23] Hang Zhao, Jiyang Gao, Tian Lan, Chen Sun, Ben Sapp, Balakrishnan Varadarajan, Yue Shen, Yi Shen, Yuning Chai, Cordelia Schmid, et al. “Tnt: Target-driven trajectory prediction”. In: Conference on Robot Learning. PMLR. 2021, pp. 895–904.
[24] Junru Gu, Chen Sun, and Hang Zhao. “Densetnt: End-to-end trajectory prediction from dense goal sets”. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021, pp. 15303–15312.
[25] Lingyao Zhang, Po-Hsun Su, Jerrick Hoang, Galen Clark Haynes, and Micol Marchetti-Bowick. “Map-adaptive goal-based trajectory prediction”. In: Conference on Robot Learning. PMLR. 2021, pp. 1371–1383.
[26] Nachiket Deo and Mohan M Trivedi. “Convolutional social pooling for vehicle trajectory prediction”. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops. 2018, pp. 1468–1476.
[27] Mohamed Hasan, Evangelos Paschalidis, Albert Solernou, He Wang, Gustav Markkula, and Richard Romano. “Maneuver-based anchor trajectory hypotheses at roundabouts”. In: arXiv preprint arXiv:2104.11180 (2021).
[28] Mudasir A Ganaie, Minghui Hu, AK Malik, M Tanveer, and PN Suganthan. “Ensemble deep learning: A review”. In: Engineering Applications of Artificial Intelligence 115 (2022), p. 105151.
[29] Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. “Nerf: Representing scenes as neural radiance fields for view synthesis”. In: Communications of the ACM 65.1 (2021), pp. 99–106.
[30] Matthew Tancik, Pratul Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan Barron, and Ren Ng. “Fourier features let networks learn high frequency functions in low dimensional domains”. In: Advances in Neural Information Processing Systems 33 (2020), pp. 7537–7547.
[31] Tim Salzmann, Boris Ivanovic, Punarjay Chakravarty, and Marco Pavone. “Trajectron++: Dynamically-feasible trajectory forecasting with heterogeneous data”. In: Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XVIII 16. Springer. 2020, pp. 683–700.
[32] Yicheng Liu, Jinghuai Zhang, Liangji Fang, Qinhong Jiang, and Bolei Zhou. “Multimodal motion prediction with stacked transformers”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021, pp. 7577–7586.
[33] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. “Attention is all you need”. In: Advances in neural information processing systems 30 (2017).
[34] Stefan Lee, Senthil Purushwalkam Shiva Prakash, Michael Cogswell, Viresh Ranjan, David Crandall, and Dhruv Batra. “Stochastic multiple choice learning for training diverse deep ensembles”. In: Advances in Neural Information Processing Systems 29 (2016).
[35] Ilya Loshchilov and Frank Hutter. “Decoupled weight decay regularization”. In: arXiv preprint arXiv:1711.05101 (2017).
[36] Ilya Loshchilov and Frank Hutter. “Sgdr: Stochastic gradient descent with warm restarts”. In: arXiv preprint arXiv:1608.03983 (2016).
[37] Mingkun Wang, Xinge Zhu, Changqian Yu, Wei Li, Yuexin Ma, Ruochun Jin, Xiaoguang Ren, Dongchun Ren, Mingxu Wang, and Wenjing Yang. “Ganet: Goal area network for motion forecasting”. In: 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE. 2023, pp. 1609–1615.