臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)


Detailed Record

Author: 陳子翾
Author (English): Chen, Tzu-Hsuan
Thesis Title (Chinese): 用於光達點雲之即時尺度感知分割
Thesis Title (English): ScaleSeg: Real Time Scale-Aware Segmentation of 3D LiDAR Point Cloud
Advisors: 張添烜、桑梓賢
Advisors (English): Chang, Tian-Sheuan; Sang, Tzu-Hsien
Committee Members: 郭峻因、高肇宏
Committee Members (English): Guo, Jiun-In; Kao, Jau-Hong
Oral Defense Date: 2019-10-17
Degree: Master
Institution: 國立交通大學 (National Chiao Tung University)
Department: 電子研究所 (Institute of Electronics)
Discipline: Engineering
Academic Field: Electrical and Computer Engineering
Thesis Type: Academic thesis
Publication Year: 2019
Graduation Academic Year: 108 (ROC calendar)
Language: English
Pages: 51
Keywords (Chinese): 語意分割、物體分割、光達點雲
Keywords (English): Semantic Segmentation; Instance Segmentation; LiDAR Point Cloud
Usage statistics:
  • Cited by: 0
  • Views: 289
  • Downloads: 0
  • Bookmarked: 0
Abstract (in Chinese) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . i
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Thesis Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.1 Object Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.1.1 MV3D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.1.2 VoxelNet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1.3 PIXOR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1.4 Frustum PointNets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Semantic Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2.1 SqueezeSeg and SqueezeSegV2 . . . . . . . . . . . . . . . . . . . . . 6
2.2.2 PointSeg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3 Real Time Scale-Aware Semantic Segmentation on Point Cloud . . . . . . . . . . . . 8
3.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.2 Range Image Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.2.1 LiDAR Point Cloud Acquisition . . . . . . . . . . . . . . . . . . . . . 9
3.2.2 Range Image Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.3 Data Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.3.1 Semantic Mask Generation . . . . . . . . . . . . . . . . . . . . . . . . 12
3.3.2 Don’t Care Labels . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.3.3 Normalization and Augmentation . . . . . . . . . . . . . . . . . . . . 12
3.4 Scale Aware Semantic Segmentation Framework . . . . . . . . . . . . . . . . 13
3.4.1 Data Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.4.2 Proposed Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.5 Loss Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.5.1 Cross Entropy Loss . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.5.2 Lovász Loss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.5.3 Scale-aware Loss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.6 Super-Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4 Instance Level Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.2 Density-Based Spatial Clustering of Applications with Noise . . . . . . . . . . . . 24
4.3 Distance Function for LiDAR Point Cloud . . . . . . . . . . . . . . . . . . . . 25
5 Experiment Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
5.1 Details of Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
5.1.1 Backbone Encoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
5.1.2 Light Decoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
5.1.3 Heavy Decoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
5.1.4 Loss Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.1.5 Training Setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.2 KITTI 3D Object Detection Dataset . . . . . . . . . . . . . . . . . . . . . . . 32
5.2.1 Evaluation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.2.2 Results Comparison of the Upper and the Lower Part . . . . . . . . . . 33
5.2.3 Results of FPS, FLOPs and the Number of Parameters . . . . . . . . . 33
5.2.4 Ablation Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
5.2.5 Instance Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.3 KITTI Raw Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.3.1 Evaluation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.4 Performance of Inference on Jetson AGX Xavier . . . . . . . . . . . . . . . . 39
5.5 Synthetic Foggy KITTI Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . 40
5.5.1 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
5.5.2 Directly Inference on Foggy LiDAR Point Cloud . . . . . . . . . . . . 41
5.5.3 Augmentation with Foggy LiDAR Point Cloud . . . . . . . . . . . . . 44
5.5.4 Instance Segmentation on Foggy LiDAR Point Cloud . . . . . . . . . . 45
6 Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
6.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
6.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48