Title (English): 3D Object Model Aided RGBD-CNN Object Orientation Justification and Convolutional Autoencoder Grasping Points Generation Method
Advisor (English): Tzuu-Hseng S. Li
Keywords (English): 3D KD-Tree, Convolutional Autoencoder, RGBD-CNN
A 3D object grasping point learning system is proposed in this thesis, comprising object coordinate construction and grasping point learning. To construct the object coordinate system, the pose of the object must first be estimated. An RGBD Convolutional Neural Network (RGBD-CNN) is proposed to classify the orientation type of the object; an object model and the iterative closest point (ICP) algorithm are then applied to refine the estimated pose, from which the object coordinate system is constructed. To learn the object grasping region, the normal vector images and the depth image of the object are obtained first, and the grasping range of the end effector (palm) is simulated on these images. Finally, a Convolutional Autoencoder (CAE) is applied to encode the physical characteristics of each simulated palm image. By comparing the simulated palm features against those in the database through a 3D KD-tree, the proposed method evaluates candidate grasping points. By integrating the object coordinate system with the learned grasping points, the robot plans a suitable grasping point for the appointed task. It is worth mentioning that most prior research emphasizes either object orientation justification or grasping point generation alone, whereas this research considers object pose estimation and grasping point generation together, so the robot can adapt to different task situations in real time. The first experiment shows that the robot is able to understand the spatial relationship between objects through the object coordinate system. In the second experiment, the robot successfully moves an object from a random pose to an assigned pose.
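The KD-tree comparison step described in the abstract can be sketched as a nearest-neighbour lookup over encoded palm features. The sketch below is a minimal illustration only, assuming the CAE latent features have already been reduced to 3-D vectors; the names `database_features` and `query_feature` are hypothetical stand-ins for the thesis's actual data, and the random database merely takes the place of stored grasp examples.

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical database of CAE-encoded palm features (3-D vectors),
# one row per stored grasp example; random values stand in for real data.
rng = np.random.default_rng(0)
database_features = rng.random((100, 3))

# Build the KD-tree once over the stored features.
tree = cKDTree(database_features)

# A query feature encoded from a simulated palm image on a new object.
query_feature = np.array([0.5, 0.5, 0.5])

# Nearest-neighbour lookup: distance to and index of the closest
# stored grasp example, which would then be evaluated as a candidate.
dist, idx = tree.query(query_feature, k=1)
print(int(idx), float(dist))
```

A KD-tree makes each lookup roughly logarithmic in the database size, which is what allows the grasp evaluation to run at interactive rates as the database grows.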
Abstract (in Chinese)
Abstract
Acknowledgement
Contents
List of Tables
List of Figures
List of Variables
Chapter 1
1.1 Motivation
1.2 Related Work
1.2.1 Object Orientation Justification
1.2.2 Grasping Points Learning
1.3 System Overview
1.4 Thesis Organization
Chapter 2
2.1 Introduction
2.2 Feature Extraction and Pre-processing
2.2.1 Modeling Platform
2.2.2 Feature Extraction
2.3 Point Cloud Concatenation and Model Result
Chapter 3
3.1 Introduction
3.2 Image Pre-Processing
3.3 RGBD-CNN Structure and Training
3.3.1 Training Premise
3.3.2 RGBD-CNN Structure
3.4 ICP Based Compensation
3.4.1 Model Occlusion
3.4.2 Iterative Closest Point (ICP)
3.5 Simulations
Chapter 4
4.1 Introduction
4.2 Image Pre-Processing
4.3 CAE Training Data Generation and Testing Data Detection
4.3.1 Stage I
4.3.2 Stage II
4.4 Autoencoder Structure and Training
4.5 Evaluation and Results
Chapter 5
5.1 Introduction
5.2 Experiment I
5.2.1 Experiment I-I
5.2.2 Experiment I-II
5.3 Experiment II
Chapter 6
6.1 Conclusion
6.2 Future Works & Discussion
References
Appendix
[1] R. Girshick, “Fast R-CNN,” in Proceedings of IEEE International Conference on Computer Vision, pp. 1440-1448, Santiago, 2015.
[2] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, 2017.
[3] K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,” in Proceedings of IEEE International Conference on Computer Vision, pp. 2961-2969, Oct. 2017.
[4] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You Only Look Once: Unified, Real-Time Object Detection,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 779-788, 2016.
[5] J. Redmon and A. Farhadi, “YOLO9000: Better, Faster, Stronger,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 7263-7271, 2017.
[6] Y. Guo, M. Bennamoun, F. Sohel, M. Lu, and J. Wan, “An Integrated Framework for 3-D Modeling, Object Detection, and Pose Estimation From Point-Clouds,” IEEE Transactions on Instrumentation and Measurement, vol. 64, no. 3, pp. 683-693, Mar. 2015.
[7] C.-Y. Tsai and S.-H. Tsai, “Simultaneous 3D Object Recognition and Pose Estimation Based on RGB-D Images,” IEEE Access, vol. 6, pp. 28859-28869, 2018.
[8] Z. Teng and J. Xiao, “Surface-Based Detection and 6-DoF Pose Estimation of 3-D Objects in Cluttered Scenes,” IEEE Transactions on Robotics, vol. 32, no. 6, pp. 1347-1361, 2016.
[9] V. Lepetit, J. Pilet, and P. Fua, “Point Matching as a Classification Problem for Fast and Robust Object Pose Estimation,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Washington, DC, 2004.
[10] A. Tejani, R. Kouskouridas, A. Doumanoglou, D. Tang, and T.-K. Kim, “Latent-Class Hough Forests for 6 DoF Object Pose Estimation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 1, pp. 119-132, 2018.
[11] M. J. Landau and P. A. Beling, “Optimal Model-Based 6-D Object Pose Estimation with Structured-Light Depth Sensors,” IEEE Transactions on Computational Imaging, vol. 3, no. 1, pp. 58-73, 2017.
[12] A. Nigam, A. Penate-Sanchez, and L. Agapito, “Detect Globally, Label Locally: Learning Accurate 6-DOF Object Pose Estimation by Joint Segmentation and Coordinate Regression,” IEEE Robotics and Automation Letters, vol. 3, no. 4, pp. 3960-3967, 2018.
[13] P. Wohlhart and V. Lepetit, “Learning Descriptors for Object Recognition and 3D Pose Estimation,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 3109-3118, 2015.
[14] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, and M. Kudlur, “TensorFlow: A System for Large-Scale Machine Learning,” in Proceedings of Symposium on Operating Systems Design and Implementation, pp. 265-283, 2016.
[15] Keras. [Online]. Available: https://keras.io/
[16] Morvanzhou. [Online]. Available: https://morvanzhou.github.io/
[17] Y. Xiang, T. Schmidt, V. Narayanan, and D. Fox, “PoseCNN: A Convolutional Neural Network for 6D Object Pose Estimation in Cluttered Scenes,” arXiv preprint arXiv:1711.00199v3, 2018.
[18] Y. Liu, L. Zhou, H. Zong, X. Gong, Q. Wu, Q. Liang, and J. Wang, “Regression-Based 3D Pose Estimation for Texture-Less Objects,” IEEE Transactions on Multimedia (Early Access), DOI: 10.1109/TMM.2019.2913321, 2019.
[19] M. Schwarz, H. Schulz, and S. Behnke, “RGB-D Object Recognition and Pose Estimation Based on Pre-Trained Convolutional Neural Network Features,” in Proceedings of IEEE International Conference on Robotics and Automation (ICRA), Seattle, May 2015.
[20] A. Kanezaki, Y. Matsushita, Y. Nishida et al., “RotationNet: Joint Object Categorization and Pose Estimation Using Multiviews from Unsupervised Viewpoints,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 5010-5019, 2018.
[21] C. M. Lin, C. Y. Tsai, Y. C. Lai, S. A. Li, and C. C. Wong, “Visual Object Recognition and Pose Estimation Based on a Deep Semantic Segmentation Network,” IEEE Sensors Journal, vol. 18, no. 22, pp. 9370-9381, 2018.
[22] A. ten Pas, M. Gualtieri, K. Saenko, and R. Platt, “Grasp Pose Detection in Point Clouds,” The International Journal of Robotics Research, vol. 36, no. 13-14, pp. 1455-1473, 2017.
[23] W.-H. Yen, “Enhanced Grey Wolf Optimizer Based Multiple Object Grasping Poses for Home Service Robot,” Dept. of Electrical Eng., National Cheng Kung University, Tainan, Taiwan, R.O.C., 2016.
[24] C. Liu, B. Fang, F. Sun, X. Li, and W. Huang, “Learning to Grasp Familiar Objects Based on Experience and Objects’ Shape Affordance,” IEEE Transactions on Systems, Man, and Cybernetics: Systems (Early Access), DOI: 10.1109/TSMC.2019.2901955, 2019.
[25] L. Li, W. Wang, Y. Su, and Z. Du, “A Data-Driven Grasp Planning Method Based on Gaussian Process Classifier,” in Proceedings of IEEE International Conference on Mechatronics and Automation, pp. 2626-2631, Beijing, China, 2015.
[26] H. Zhang, X. Zhou, X. Lan, J. Li, Z. Tian, and N. Zheng, “A Real-Time Robotic Grasping Approach With Oriented Anchor Box,” arXiv preprint arXiv:1809.03873, 2018.
[27] Z. Xu, Y. Zheng, and S. Rawashdeh, “A Simple Robotic Fingertip Sensor Using Imaging and Shallow Neural Networks,” IEEE Sensors Journal (accepted), 2019.
[28] F. H. Zunjani, S. Sen, H. Shekhar, A. Powale, D. Godnaik, and G. C. Nandi, “Intent-Based Object Grasping by a Robot Using Deep Learning,” in Proceedings of IEEE International Advance Computing Conference, pp. 246-251, 2018.
[29] Y. H. Na, H. Jo, and J. B. Song, “Learning to Grasp Objects Based on Ensemble Learning Combining Simulation Data and Real Data,” in Proceedings of IEEE International Conference on Control, Automation and Systems, pp. 1030-1034, 2017.
[30] F. Sun, C. Liu, W. Huang, and J. Zhang, “Object Classification and Grasp Planning Using Visual and Tactile Sensing,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 46, no. 7, pp. 969-979, 2016.
[31] U. Asif, M. Bennamoun, and F. A. Sohel, “RGB-D Object Recognition and Grasp Detection Using Hierarchical Cascaded Forests,” IEEE Transactions on Robotics, vol. 33, no. 3, pp. 547-564, 2017.
[32] I. Lenz, H. Lee, and A. Saxena, “Deep Learning for Detecting Robotic Grasps,” The International Journal of Robotics Research, vol. 34, no. 4-5, pp. 705-724, 2015.
[33] C. Choi, W. Schwarting, J. DelPreto, and D. Rus, “Learning Object Grasping for Soft Robot Hands,” IEEE Robotics and Automation Letters, vol. 3, no. 3, pp. 2370-2377, 2018.
[34] Z. Han, Z. Liu, J. Han, C. M. Vong, S. Bu, and X. Li, “Unsupervised 3D Local Feature Learning by Circle Convolutional Restricted Boltzmann Machine,” IEEE Transactions on Image Processing, vol. 25, no. 11, pp. 5331-5344, 2016.
[35] Point Cloud Library (PCL). [Online]. Available: http://pointclouds.org/
[36] KD-Tree. [Online]. Available: http://pointclouds.org/documentation/tutorials/kdtree_search.php
[37] Intel D435i. [Online]. Available: https://www.intelrealsense.com/depth-camera-d435i/
[38] MeshLab. [Online]. Available: http://www.meshlab.net/
[39] OpenCV. [Online]. Available: https://opencv.org/
[40] D. P. Kingma and J. Ba, “Adam: A Method for Stochastic Optimization,” arXiv preprint arXiv:1412.6980v9, 2017.
[41] P. J. Besl and N. D. McKay, “A Method for Registration of 3-D Shapes,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 2, pp. 239-256, 1992.
[42] S. Elhabian, A. Farag, and A. Farag, “Iterative Closest Point: A Tutorial on Rigid Registration,” 2009. [Online]. Available: http://www.sci.utah.edu/~shireen/pdfs/tutorials/Elhabian_ICP09.pdf
[43] R. B. Rusu, “Semantic 3D Object Maps for Everyday Manipulation in Human Living Environments,” KI - Künstliche Intelligenz, vol. 24, no. 4, pp. 345-348, 2010.
[44] ROS. [Online]. Available: https://www.ros.org/
[45] ROBOTIS. [Online]. Available: http://www.robotis.us/