
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)


Detailed Record

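Graduate student: 連浩之 (LIEN, HAU-ZH)
Thesis title: Improving Bidirectional Fusion of 3D Lidar Point Clouds and RGB Image Sensing for 3D Target Detection
Thesis title (original): 基於 3D 光達點雲與 RGB 影像感測整合之雙向傳播框架的 3D 目標檢測技術
Advisor: 賴文能 (LIE, WEN-NUNG)
Oral defense committee: 謝奇文 (HSIEH, CHI-WEN), 林國祥 (LIN, GUO-SHIANG)
Oral defense date: 2023-07-31
Degree: Master's
Institution: National Chung Cheng University (國立中正大學)
Department: Graduate Institute of Electrical Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2023
Graduation academic year: 111 (ROC calendar)
Language: Chinese
Number of pages: 51
Keywords: 3D Object Detection, Computer Vision, Point Cloud, Deep Learning, Multi-Sensors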
3D object detection and localization is a core research topic in computer vision. It helps machines analyze 3D scenes and supports applications such as autonomous driving systems, driver-assistance systems, and robot vision. In a 3D environment, however, 2D information alone cannot meet the needs of 3D object detection, so with the rapid development of self-driving research, 3D object detection has become a popular research focus in recent years.
Current approaches to 3D object detection fall roughly into three categories: image-based methods (mainly using stereo images), methods based on 3D lidar point cloud data, and hybrid methods that use both images and lidar. Of these, the lidar point cloud methods have developed fastest and have shown significant performance gains.
However, because practical self-driving systems usually carry several different sensors, this thesis proposes a new method for 3D object detection and localization. The method takes monocular RGB images and 3D lidar data as input and, by fusing image features with lidar spatial features using deep learning, aims to improve the accuracy and efficiency of 3D object detection.
This thesis proposes a 3D object detection technique built on a bidirectional propagation framework that integrates 3D lidar point clouds and RGB image sensing. The architecture uses KPConv and CBAM to improve the representation of 2D and 3D features, then uses bidirectional feature propagation to relate the 2D and 3D features to each other, and finally uses auxiliary tasks to strengthen spatial information and the subsequent object detection.
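The abstract names bidirectional feature propagation between 2D and 3D features, but this record gives no implementation detail. The following is a minimal sketch of one common way such an exchange can be set up, not the thesis's actual method: it assumes a pinhole camera with zero skew, lidar points already expressed in the camera frame, nearest-pixel lookup, and simple averaging when several points land on one pixel. The function names `project` and `bidirectional_exchange` and all parameters are hypothetical, and learned components such as KPConv or CBAM are left out.

```python
def project(point, K):
    """Pinhole projection of a camera-frame point (x, y, z) to a (row, col) pixel.

    K is a 3x3 intrinsic matrix given as nested lists; zero skew is assumed.
    Returns None for points at or behind the camera plane.
    """
    x, y, z = point
    if z <= 0:
        return None
    u = (K[0][0] * x + K[0][2] * z) / z  # horizontal pixel coordinate
    v = (K[1][1] * y + K[1][2] * z) / z  # vertical pixel coordinate
    return round(v), round(u)


def bidirectional_exchange(points, point_feats, image_feats, K):
    """Exchange features between lidar points and image pixels (illustrative only).

    points      : list of (x, y, z) tuples in the camera frame
    point_feats : list of per-point feature lists, each of length C_pt
    image_feats : nested list indexed [row][col] holding feature lists of length C_img
    Returns (fused_points, fused_image), where each output concatenates the
    element's own features with the features received from the other modality.
    """
    h, w = len(image_feats), len(image_feats[0])
    c_img = len(image_feats[0][0])
    c_pt = len(point_feats[0])
    sums = [[[0.0] * c_pt for _ in range(w)] for _ in range(h)]
    counts = [[0] * w for _ in range(h)]

    fused_points = []
    for p, f in zip(points, point_feats):
        pix = project(p, K)
        if pix is None or not (0 <= pix[0] < h and 0 <= pix[1] < w):
            # Point is not visible in the image: pad with zeros.
            fused_points.append(list(f) + [0.0] * c_img)
            continue
        r, c = pix
        # 2D -> 3D: the point receives the image feature at its pixel.
        fused_points.append(list(f) + list(image_feats[r][c]))
        # 3D -> 2D: accumulate the point feature at that pixel.
        counts[r][c] += 1
        for k in range(c_pt):
            sums[r][c][k] += f[k]

    # Average accumulated point features; pixels with no points stay zero.
    fused_image = [
        [
            list(image_feats[r][c]) + [s / max(counts[r][c], 1) for s in sums[r][c]]
            for c in range(w)
        ]
        for r in range(h)
    ]
    return fused_points, fused_image
```

In this sketch both directions share a single projection, so each point and the pixel it lands on see each other's features. A learned fusion layer would normally replace the plain concatenation and averaging used here.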
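Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Technologies Related to 3D Object Detection
1.3 Architecture Proposed in This Thesis
Chapter 2 Brief Review of Related Literature
2.1 3D Object Detection from Stereo Image Input
2.2 3D Object Detection from Point Cloud Input
2.3 3D Object Detection from Point Cloud and Image Input
2.4 Summary of the Literature
Chapter 3 Bidirectional Propagation Framework Integrating 3D Lidar Point Clouds and RGB Image Sensing
3.1 Bidirectional Feature Propagation
3.2 Auxiliary Tasks
3.3 KPConv (Kernel Point Convolution)
3.4 CBAM (Convolutional Block Attention Module)
3.5 Loss Function
Chapter 4 Experimental Results and Discussion
4.1 Experimental Environment
4.2 Dataset
4.3 Training Procedure
4.4 Evaluation Metrics
4.5 Results and Comparison
Chapter 5 Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
References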

[1]S. Ren, K. He, R. Girshick and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, 1 June 2017, doi: 10.1109/TPAMI.2016.2577031.
[2]P. Li, X. Chen and S. Shen, "Stereo R-CNN Based 3D Object Detection for Autonomous Driving," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 7644-7652, doi: 10.1109/CVPR.2019.00783.
[3]J. Sun, L. Chen, Y. Xie, S. Zhang, Q. Jiang, X. Zhou and H. Bao, "Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation," 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 10548-10557, doi: 10.1109/CVPR42600.2020.01056.
[4]Charles R. Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas, "PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 652-660, doi: 10.1109/CVPR.2017.16.
[5]Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998.
[6]Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas, "PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space," Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS'17), 2017, pp. 5105–5114, doi: 10.5555/3295222.3295263.
[7]S. Shi, X. Wang and H. Li, "PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 770-779, doi: 10.1109/CVPR.2019.00086.
[8]Y. Zhou and O. Tuzel, "VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 4490-4499, doi: 10.1109/CVPR.2018.00472.
[9]Z. Liu, H. Tang, Y. Lin and S. Han, "Point-Voxel CNN for Efficient 3D Deep Learning," Proceedings of the 32nd International Conference on Neural Information Processing Systems (NIPS'19), 2019, pp. 5105–5114, doi: 10.5555/3454287.3454374.
[10]S. Shi, C. Guo, L. Jiang, Z. Wang, J. Shi, X. Wang, and H. Li, "PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection," 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 10529-10538, doi: 10.1109/CVPR42600.2020.01054.
[11]S. Pang, D. Morris and H. Radha, "CLOCs: Camera-LiDAR Object Candidates Fusion for 3D Object Detection," 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020, pp. 10386–10393, doi: 10.1109/IROS45743.2020.9341791.
[12]C. R. Qi, W. Liu, C. Wu, H. Su and L. J. Guibas, "Frustum PointNets for 3D Object Detection From RGB-D Data," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 918-927, doi: 10.1109/CVPR.2018.00102.
[13]Honghui Yang, Zili Liu, Xiaopei Wu, Wenxiao Wang, Wei Qian, Xiaofei He, Deng Cai, "Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph," 2022 European Conference on Computer Vision (ECCV), 2022, pp. 662-679.
[14]Xiaopei Wu, Liang Peng, Honghui Yang, Liang Xie, Chenxi Huang, Chengqi Deng, Haifeng Liu, Deng Cai, "Sparse Fuse Dense: Towards High Quality 3D Detection with Depth Completion," 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 2022, pp. 5408-5417, doi: 10.1109/CVPR52688.2022.00534.
[15]A. Mahmoud, J. S. K. Hu and S. L. Waslander, "Dense Voxel Fusion for 3D Object Detection," 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 2023, pp. 663-672, doi: 10.1109/WACV56688.2023.00073.
[16]Yanwei Li, Xiaojuan Qi, Yukang Chen, Liwei Wang, Zeming Li, Jian Sun, Jiaya Jia, "Voxel Field Fusion for 3D Object Detection," 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 2022, pp. 1110-1119, doi: 10.1109/CVPR52688.2022.00119.
[17]Yifan Zhang, Qijian Zhang, Junhui Hou, Yixuan Yuan, Guoliang Xing, "Bidirectional Propagation for Cross-Modal 3D Object Detection," arXiv preprint arXiv:2301.09077 (2023).
[18]H. Thomas, C. R. Qi, J. -E. Deschaud, B. Marcotegui, F. Goulette and L. Guibas, "KPConv: Flexible and Deformable Convolution for Point Clouds," 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea (South), 2019, pp. 6410-6419, doi: 10.1109/ICCV.2019.00651.
[19]Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, "Deep residual learning for image recognition," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
[20]S. Woo, J. Park, J.-Y. Lee, I. S. Kweon, "CBAM: Convolutional Block Attention Module," 2018 European Conference on Computer Vision (ECCV), 2018, pp. 3-19, doi: 10.48550/arXiv.1807.06521 (arXiv version).
[21]Chen Chen, Zhe Chen, Jing Zhang, and Dacheng Tao, "SASA: Semantics-Augmented Set Abstraction for Point-Based 3D Object Detection," Proceedings of the AAAI Conference on Artificial Intelligence, volume 1, pp. 221–229, 2022.
[22]Zetong Yang, Yanan Sun, Shu Liu, and Jiaya Jia, "3DSSD: Point-Based 3D Single Stage Object Detector," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11040–11048, 2020.
[23]A. Geiger, P. Lenz and R. Urtasun, "Are we ready for autonomous driving? The KITTI vision benchmark suite," 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012, pp. 3354-3361, doi: 10.1109/CVPR.2012.6248074.
[24]D. P. Kingma and J. Ba, "Adam: A Method for Stochastic Optimization," Proceedings of International Conference on Learning Representations (ICLR Poster), 2015, doi: 10.48550/arXiv.1412.6980 (arXiv version).
