National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

Author: 陳柏維 (Bo-Wei Chen)
Title (Chinese): 惡劣天氣條件下基於關聯注意力機制融合雷達和光達進行物件偵測
Title (English): Fusion of Radar and LiDAR Using Associative Mechanism for Object Detection in Adverse Weather Conditions
Advisor: 李明穗 (Ming-Sui Lee)
Oral Examination Committee: 葉梅珍 (Mei-Chen Yeh), 李界羲 (Li-Jie Xi)
Date of Oral Defense: 2023-07-31
Degree: Master's
University: National Taiwan University
Department: Graduate Institute of Networking and Multimedia
Discipline: Computing
Field: Networking
Thesis Type: Academic thesis
Year of Publication: 2023
Graduation Academic Year: 111
Number of Pages: 36
Keywords (Chinese): 深度學習, 多模態物件偵測, 基於注意力機制進行特徵融合
Keywords (English): deep learning, multimodal object detection, feature fusion based on attention mechanism
DOI: 10.6342/NTU202302360
Usage statistics:
  • Cited: 0
  • Views: 19
  • Rating:
  • Downloads: 4
  • Bookmarked: 0
Abstract:
With the continuous development of deep learning technology, the accuracy of object detection has been steadily improving, and the realization of Level 5 autonomous driving is within reach. In favorable weather conditions, the average accuracy of object detection can exceed 85 percent. However, the weather is not always ideal: rain, fog, and even snow can significantly reduce detection accuracy. Traditional sensors such as cameras and LiDAR are susceptible to harsh weather, so we adopt a fusion of RADAR and LiDAR for object detection. RADAR is unaffected by adverse environmental conditions but introduces many noisy point clouds; we therefore use LiDAR as an auxiliary sensor, because its accurate environmental point clouds help mitigate ghost detections. We employ an attention mechanism to fuse the features from LiDAR and RADAR. Additionally, we propose a Feature Selection Module to address the issue of attention weights in the attention mechanism, and an Associative Feature Fusion Module to fully utilize the features selected by the attention mechanism. Experiments demonstrate that our proposed model outperforms state-of-the-art RADAR and LiDAR models.
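The record above only names the thesis's modules. As a rough illustration of the general idea of attention-based channel and spatial feature selection followed by fusion of radar and LiDAR bird's-eye-view feature maps, here is a minimal PyTorch sketch. All class names, layer choices (squeeze-and-excitation-style channel gating, a 1x1 spatial gate, concatenation plus a 1x1 convolution), and tensor shapes are illustrative assumptions, not the thesis's actual Feature Selection Module or Associative Feature Fusion Module.

# Illustrative sketch only: layer shapes and module names are assumptions,
# not the architecture described in the thesis.
import torch
import torch.nn as nn


class ChannelSpatialGate(nn.Module):
    """Gates a BEV feature map with channel and spatial attention weights."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # Channel selection: global average pooling followed by a small MLP.
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial selection: one attention weight per BEV cell.
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel_mlp(x)      # reweight channels
        return x * self.spatial_conv(x)  # reweight spatial locations


class RadarLidarFusion(nn.Module):
    """Fuses gated radar and LiDAR BEV features by concatenation + 1x1 conv."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.radar_gate = ChannelSpatialGate(channels)
        self.lidar_gate = ChannelSpatialGate(channels)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, radar_bev: torch.Tensor, lidar_bev: torch.Tensor) -> torch.Tensor:
        gated = torch.cat([self.radar_gate(radar_bev), self.lidar_gate(lidar_bev)], dim=1)
        return self.fuse(gated)


if __name__ == "__main__":
    # Toy bird's-eye-view feature maps: batch of 2, 64 channels, 128x128 grid.
    radar_bev = torch.randn(2, 64, 128, 128)
    lidar_bev = torch.randn(2, 64, 128, 128)
    out = RadarLidarFusion(64)(radar_bev, lidar_bev)
    print(out.shape)  # torch.Size([2, 64, 128, 128])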
Table of Contents

Verification Letter from the Oral Examination Committee
Acknowledgements
Abstract (Chinese)
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Related Work
2.1 Unimodal Sensor Detection
2.1.1 Radar-Only Detection
2.1.2 LiDAR-Only Detection
2.2 Multimodal Sensor Fusion Detection
2.2.1 LiDAR and Camera Fusion Detection
2.2.2 LiDAR and Radar Fusion Detection
Chapter 3 Method
3.1 Framework Overview
3.2 Feature Selection Module
3.2.1 Feature Spatial Selection
3.2.2 Feature Channel Selection
3.3 Associative Feature Fusion Module
Chapter 4 Experiments
4.1 Dataset Description
4.2 Implementation Details
4.2.1 Evaluation Metrics
4.3 Comparison with Only Radar
4.4 Comparison with Only LiDAR
4.5 Comparison with Radar and LiDAR
4.6 Ablation Study
4.6.1 Comparison with Other Feature Fusion Operations
4.6.2 Ablation of Model Components
4.7 Discussion
Chapter 5 Conclusion
References