National Digital Library of Theses and Dissertations in Taiwan
Detailed Record

我願授權國圖
: 
twitterline
Author: 張耿豪
Author (English): Keng-Hao Chang
Title: 利用關鍵切割估計在遮蔽環境下的物件姿態
Title (English): 6-DoF Object Pose Estimation Using Keypoint-based Segmentation in Occluded Environment
Advisor: 于天立
Committee members: 吳育任, 楊茆世芳
Oral defense date: 2019-07-24
Degree: Master's
Institution: National Taiwan University (國立臺灣大學)
Department: Graduate Institute of Electrical Engineering (電機工程學研究所)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2019
Graduation academic year: 107 (2018–2019)
Language: English
Number of pages: 53
Keywords (Chinese): 物件姿態辨識 (object pose estimation), 深度圖 (depth map), 隨機抽樣一致演算法 (random sample consensus, RANSAC)
DOI: 10.6342/NTU201903410
Usage statistics:
  • Cited by: 0
  • Views: 148
  • Downloads: 0
  • Bookmarked: 0
Owing to its diverse applications, such as robotic grasping and augmented reality, object pose estimation has long been an important research area. In practical deployments, two common difficulties are the system's speed requirements and the handling of occlusion. Because of the cost of depth cameras and the scarcity of related research, traditional object recognition methods have focused mainly on color images without depth information.
With the development of depth cameras, this thesis proposes an object pose estimation method based on color images and depth information. Inspired by bottom-up human pose estimation methods, the proposed approach treats an object as a combination of several components in order to handle occlusion. Beyond occlusion handling, we adopt an existing real-time 2D object detection method to achieve fast pose estimation. For the pose estimation itself, unlike approaches that vote in Hough space, we use 3D spatial information directly; the advantage of doing so is that it avoids the multi-scale problem. Experiments are presented that measure the method's runtime and performance under occlusion.
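The "direct use of 3D spatial information" mentioned above starts from the standard pinhole back-projection of depth pixels into camera coordinates, the step the table of contents calls Direct Backward Projection. The sketch below is illustrative only, not the thesis's code: the function names, the bounding-box convention, and the assumption of a depth map registered to the color image are ours.

```python
import numpy as np

def backproject(u, v, z, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth z into 3D camera coordinates
    using the pinhole model: X = (u - cx) * z / fx, Y = (v - cy) * z / fy."""
    return np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])

def backproject_region(depth_map, box, K):
    """Lift every valid depth pixel inside a 2D detection box into a 3D
    point cloud. `box` is (u_min, v_min, u_max, v_max); K is the 3x3
    camera intrinsic matrix."""
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    u_min, v_min, u_max, v_max = box
    points = []
    for v in range(v_min, v_max):
        for u in range(u_min, u_max):
            z = depth_map[v, u]
            if z > 0:  # zero depth marks missing sensor readings
                points.append(backproject(u, v, z, fx, fy, cx, cy))
    return np.asarray(points)
```

Because every point is lifted to metric 3D coordinates, an object's apparent size in the image no longer matters, which is the multi-scale advantage the abstract refers to.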
Object pose estimation has been an important research topic due to its various applications, such as robot manipulation and augmented reality. In real-world applications, speed requirements and occlusion handling are two frequently encountered difficulties. Due to the high price of depth cameras and the limited research built on them, traditional pose estimation methods have emphasized color images without depth information.
With the development of depth cameras, this thesis proposes a 6-DoF object pose estimation method using RGB-D images. Inspired by bottom-up human pose estimation approaches, we propose a method that treats objects as collections of components in order to deal with occlusion. In addition to occlusion handling, we apply a real-time 2D object detection method to achieve fast pose estimation. Unlike methods that use voting schemes in Hough space, we estimate the pose using 3D information directly, which avoids the multi-scale problem. Experiments evaluate the performance and runtime of the proposed method in occluded environments.
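The pose solver named in the table of contents (Sections 4.4.1–4.4.3, RANSAC combined with an SVD fit) admits a compact illustration. The following is a minimal sketch assuming matched 3D-3D correspondences between model and scene points are already available; the names `rigid_transform_svd` and `ransac_pose`, the iteration count, and the inlier threshold are our illustrative choices, not the thesis's implementation.

```python
import numpy as np

def rigid_transform_svd(src, dst):
    """Least-squares rigid transform (R, t) mapping src onto dst via SVD
    (the closed-form Kabsch/Umeyama solution). src, dst: (N, 3) arrays
    of matched 3D points."""
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    H = src_c.T @ dst_c                      # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

def ransac_pose(src, dst, iters=200, thresh=0.01, seed=0):
    """Robust pose from noisy 3D-3D correspondences: sample minimal sets
    of 3 pairs, fit with SVD, keep the hypothesis with the most inliers,
    then refit on all inliers."""
    rng = np.random.default_rng(seed)
    best_inliers = None
    for _ in range(iters):
        idx = rng.choice(len(src), size=3, replace=False)
        R, t = rigid_transform_svd(src[idx], dst[idx])
        err = np.linalg.norm((src @ R.T + t) - dst, axis=1)
        inliers = err < thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return rigid_transform_svd(src[best_inliers], dst[best_inliers])
```

The SVD step gives the optimal rigid transform for a fixed set of correspondences; wrapping it in RANSAC rejects mismatched components before the final fit, which matches the robustness-under-occlusion motivation in the abstract.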
Table of Contents

Acknowledgements
Abstract (Chinese)
Abstract
1 Introduction
2 Background
  2.1 6-DoF Object Pose in 3D Space
  2.2 Bottom-up Human Pose Estimation Approaches
  2.3 YOLO Detector
3 Related Work
  3.1 Template-based Method
  3.2 Descriptor Method
  3.3 Hough Forest Voting
  3.4 End-to-End
  3.5 Object Coordinate Method
4 Proposed Method
  4.1 System Overview
  4.2 Keypoint-based Segmentation
    4.2.1 Harris 3D
    4.2.2 Mean Shift Clustering
    4.2.3 Component Localization
  4.3 Backward Projection from 2D into 3D Space
    4.3.1 Direct Backward Projection
    4.3.2 VFH Refinement
  4.4 Pose Estimation
    4.4.1 RANSAC
    4.4.2 Pose Estimation with SVD
    4.4.3 Pose Estimation using RANSAC with SVD
5 Experiment
  5.1 Dataset and Evaluation Metric
  5.2 Dataset
    5.2.1 Average Distance of Model Points (ADD)
    5.2.2 Translational and Rotational Error (TR error)
  5.3 Experimental Results of Pose Estimation
  5.4 Experiment of Component Matching
    5.4.1 Component Matching with Triangle
    5.4.2 Component Matching with Dragon Model
6 Conclusion
Bibliography
E. Brachmann, A. Krull, F. Michel, S. Gumhold, J. Shotton, and C. Rother. Learning 6D object pose estimation using 3D object coordinates. In European Conference on Computer Vision, pages 536–551. Springer, 2014.
E. Brachmann, F. Michel, A. Krull, M. Ying Yang, S. Gumhold, et al. Uncertainty-driven 6D pose estimation of objects and scenes from a single RGB image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3364–3372, 2016.
B. Calli, A. Singh, J. Bruce, A. Walsman, K. Konolige, S. Srinivasa, P. Abbeel, and A. M. Dollar. Yale-CMU-Berkeley dataset for robotic manipulation research. The International Journal of Robotics Research, 36(3):261–268, 2017.
Z. Cao, T. Simon, S.-E. Wei, and Y. Sheikh. Realtime multi-person 2D pose estimation using part affinity fields. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7291–7299, 2017.
Y. Chen and G. Medioni. Object modelling by registration of multiple range images. Image and Vision Computing, 10(3):145–155, 1992.
B. Curless and M. Levoy. A volumetric method for building complex models from range images. In Proceedings of SIGGRAPH, 1996.
A. Doumanoglou, R. Kouskouridas, S. Malassiotis, and T.-K. Kim. Recovering 6D object pose and predicting next-best-view in the crowd. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3583–3592, 2016.
B. Drost, M. Ulrich, N. Navab, and S. Ilic. Model globally, match locally: Efficient and robust 3D object recognition. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 998–1005. IEEE, 2010.
M. A. Fischler and R. C. Bolles. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6):381–395, 1981.
S. Hinterstoisser, V. Lepetit, S. Ilic, S. Holzer, G. Bradski, K. Konolige, and N. Navab. Model based training, detection and pose estimation of texture-less 3D objects in heavily cluttered scenes. In Asian Conference on Computer Vision, pages 548–562. Springer, 2012.
Y. Ke and R. Sukthankar. PCA-SIFT: A more distinctive representation for local image descriptors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, volume 2, pages 506–513, 2004.
D. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110, 2004.
M. Rad and V. Lepetit. BB8: A scalable, accurate, robust to partial occlusion method for predicting the 3D poses of challenging objects without using depth. In Proceedings of the IEEE International Conference on Computer Vision, pages 3828–3836, 2017.
J. Redmon, S. Divvala, R. Girshick, and A. Farhadi. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 779–788, 2016.
R. B. Rusu, G. Bradski, R. Thibaux, and J. Hsu. Fast 3D recognition and pose using the viewpoint feature histogram. In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 2155–2162. IEEE, 2010.
J. Shotton, B. Glocker, C. Zach, S. Izadi, A. Criminisi, and A. Fitzgibbon. Scene coordinate regression forests for camera relocalization in RGB-D images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2930–2937, 2013.
I. Sipiran and B. Bustos. Harris 3D: A robust extension of the Harris operator for interest point detection on 3D meshes. The Visual Computer, 27(11):963, 2011.
A. Tejani, R. Kouskouridas, A. Doumanoglou, D. Tang, and T.-K. Kim. Latent-class Hough forests for 6 DoF object pose estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(1):119–132, 2017.
C. Wang, D. Xu, Y. Zhu, R. Martín-Martín, C. Lu, L. Fei-Fei, and S. Savarese. DenseFusion: 6D object pose estimation by iterative dense fusion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3343–3352, 2019.
Y. Xiang, T. Schmidt, V. Narayanan, and D. Fox. PoseCNN: A convolutional neural network for 6D object pose estimation in cluttered scenes. arXiv preprint arXiv:1711.00199, 2017.