[1] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár and C. L. Zitnick, "Microsoft COCO: Common objects in context," in European Conference on Computer Vision, Cham, pp. 740-755, 2014.
[2] A. Bicchi and V. Kumar, "Robotic grasping and contact: A review," in IEEE International Conference on Robotics and Automation, vol. 1, pp. 348-353, 2000.
[3] A. T. Miller, S. Knoop, H. I. Christensen and P. K. Allen, "Automatic grasp planning using shape primitives," in IEEE International Conference on Robotics and Automation, vol. 2, pp. 1824-1829, 2003.
[4] A. T. Miller and P. K. Allen, "GraspIt! A versatile simulator for robotic grasping," IEEE Robotics & Automation Magazine, vol. 11, no. 4, pp. 110-122, 2004.
[5] J. J. Rodrigues, J. S. Kim, M. Furukawa, J. Xavier, P. Aguiar and T. Kanade, "6D pose estimation of textureless shiny objects using random ferns for bin-picking," in IEEE International Conference on Intelligent Robots and Systems, pp. 3334-3341, 2012.
[6] D. Forsyth and J. Ponce, Computer vision: A modern approach. Prentice Hall, 2003.
[7] J. Baumgartl and D. Henrich, "Fast vision-based grasp and delivery planning for unknown objects," in German Conference on Robotics, pp. 1-5, 2012.
[8] P. V. Hough, "Method and means for recognizing complex patterns," U.S. Patent 3,069,654, 1962.
[9] C. Eppner and O. Brock, "Grasping unknown objects by exploiting shape adaptability and environmental constraints," in IEEE International Conference on Intelligent Robots and Systems, pp. 4000-4006, 2013.
[10] D. Holz, A. J. Trevor, M. Dixon, S. Gedikli, R. B. Rusu and S. Behnke, "Fast segmentation of RGB-D images for semantic scene understanding," in IEEE International Conference on Robotics and Automation, 2012.
[11] T. Suzuki and T. Oka, "Grasping of unknown objects on a planar surface using a single depth image," in IEEE International Conference on Advanced Intelligent Mechatronics, pp. 572-577, 2016.
[12] M. A. Fischler and R. C. Bolles, "Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography," Communications of the ACM, vol. 24, no. 6, pp. 381-395, 1981.
[13] R. B. Rusu, Semantic 3D object maps for everyday robot manipulation. Springer, 2013.
[14] D. Katz, A. Venkatraman, M. Kazemi, J. A. Bagnell and A. Stentz, "Perceiving, learning, and exploiting object affordances for autonomous pile manipulation," Autonomous Robots, vol. 37, no. 4, pp. 369-382, 2014.
[15] I. Lenz, H. Lee and A. Saxena, "Deep learning for detecting robotic grasps," The International Journal of Robotics Research, vol. 34, no. 4-5, pp. 705-724, 2015.
[16] J. Redmon and A. Angelova, "Real-time grasp detection using convolutional neural networks," in IEEE International Conference on Robotics and Automation, pp. 1316-1322, 2015.
[17] A. Krizhevsky, I. Sutskever and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Neural Information Processing Systems, vol. 25, 2012.
[18] S. Kumra and C. Kanan, "Robotic grasp detection using deep convolutional neural networks," in IEEE International Conference on Intelligent Robots and Systems, 2017.
[19] K. He, X. Zhang, S. Ren and J. Sun, "Deep residual learning for image recognition," in IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, 2016.
[20] L. Pinto and A. Gupta, "Supersizing self-supervision: Learning to grasp from 50K tries and 700 robot hours," in IEEE International Conference on Robotics and Automation, pp. 3406-3413, 2016.
[21] J. Mahler and K. Goldberg, "Learning deep policies for robot bin picking by simulating robust grasping sequences," in Conference on Robot Learning, vol. 78, pp. 515-524, 2017.
[22] E. Coumans. "Bullet physics library." https://pybullet.org/wordpress/ (accessed May 17, 2022).
[23] J. Mahler, J. Liang, S. Niyaz, M. Laskey, R. Doan, X. Liu, J. A. Ojea and K. Goldberg, "Dex-Net 2.0: Deep learning to plan robust grasps with synthetic point clouds and analytic grasp metrics," Robotics: Science and Systems, 2017.
[24] A. Zeng, S. Song, K.-T. Yu, E. Donlon, F. Hogan, M. Bauza, D. Ma, O. Taylor, M. Liu, E. Romo, N. Fazeli, F. Alet, N. C. Dafle, R. Holladay, I. Morona, P. Nair, D. Green, I. Taylor, W. Liu, T. Funkhouser and A. Rodriguez, "Robotic pick-and-place of novel objects in clutter with multi-affordance grasping and cross-domain image matching," in IEEE International Conference on Robotics and Automation, pp. 1-8, 2018.
[25] M. Danielczuk, M. Matl, S. Gupta, A. Li, A. Lee, J. Mahler and K. Goldberg, "Segmenting unknown 3D objects from real depth images using Mask R-CNN trained on synthetic data," in IEEE International Conference on Robotics and Automation, pp. 7283-7290, 2019.
[26] S. Back, J. Kim, R. Kang, S. Choi and K. Lee, "Segmenting unseen industrial components in a heavy clutter using RGB-D fusion and synthetic data," in IEEE International Conference on Image Processing, pp. 828-832, 2020.
[27] 李佳蓮, "Instance segmentation and grasp point generation convolutional neural networks for the classified grasping of randomly stacked objects," M.S. thesis, Department of Mechanical Engineering, National Taiwan University, Taipei, 2020.
[28] L. Yang. "Bpycv: Computer vision utils for Blender." https://github.com/DIYer22/bpycv (accessed May 17, 2022).
[29] Y. Bengio, J. Louradour, R. Collobert and J. Weston, "Curriculum learning," in International Conference on Machine Learning, pp. 41-48, 2009.
[30] K. He, G. Gkioxari, P. Dollár and R. Girshick, "Mask R-CNN," in IEEE International Conference on Computer Vision, pp. 2961-2969, 2017.
[31] Blender Online Community. "Blender - A 3D modelling and rendering package." https://www.blender.org/about/ (accessed May 17, 2022).
[32] S. Ren, K. He, R. Girshick and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," Neural Information Processing Systems, vol. 28, 2015.
[33] T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan and S. Belongie, "Feature pyramid networks for object detection," in IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117-2125, 2017.
[34] M. Everingham, L. Van Gool, C. K. Williams, J. Winn and A. Zisserman, "The PASCAL visual object classes (VOC) challenge," International Journal of Computer Vision, vol. 88, no. 2, pp. 303-338, 2010.
[35] B. Calli, A. Singh, A. Walsman, S. Srinivasa, P. Abbeel and A. M. Dollar, "The YCB object and model set: Towards common benchmarks for manipulation research," in International Conference on Advanced Robotics, pp. 510-517, 2015.
[36] T. Hodan, P. Haluza, Š. Obdržálek, J. Matas, M. Lourakis and X. Zabulis, "T-LESS: An RGB-D dataset for 6D pose estimation of texture-less objects," in IEEE Winter Conference on Applications of Computer Vision, pp. 880-888, 2017.
[37] B. Drost, M. Ulrich, P. Bergmann, P. Hartinger and C. Steger, "Introducing MVTec ITODD - A dataset for 3D object recognition in industry," in IEEE International Conference on Computer Vision Workshops, pp. 2200-2208, 2017.
[38] S. Koch, A. Matveev, Z. Jiang, F. Williams, A. Artemov, E. Burnaev, M. Alexa, D. Zorin and D. Panozzo, "ABC: A big CAD model dataset for geometric deep learning," in IEEE Conference on Computer Vision and Pattern Recognition, pp. 9601-9611, 2019.
[39] D. Morrison, P. Corke and J. Leitner, "Learning robust, real-time, reactive robotic grasping," The International Journal of Robotics Research, vol. 39, no. 2-3, pp. 183-201, 2020.
[40] D. Morrison, P. Corke and J. Leitner, "Closing the loop for robotic grasping: A real-time, generative grasp synthesis approach," Robotics: Science and Systems, 2018.
[41] Cornell University. "Cornell grasping dataset." https://www.kaggle.com/datasets/oneoneliu/cornell-grasp (accessed May 17, 2022).
[42] A. Depierre, E. Dellandréa and L. Chen, "Jacquard: A large scale dataset for robotic grasp detection," in IEEE International Conference on Intelligent Robots and Systems, pp. 3511-3516, 2018.
[43] M. Savva, A. X. Chang and P. Hanrahan, "Semantically-enriched 3D models for common-sense knowledge," in IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 24-31, 2015.
[44] J. Ku, A. Harakeh and S. L. Waslander, "In defense of classical image processing: Fast depth completion on the CPU," in Conference on Computer and Robot Vision, pp. 16-22, 2018.
[45] J.-Y. Zhu, T. Park, P. Isola and A. A. Efros, "Unpaired image-to-image translation using cycle-consistent adversarial networks," in IEEE International Conference on Computer Vision, pp. 2223-2232, 2017.
[46] H. Cao, G. Chen, Z. Li, J. Lin and A. Knoll, "Lightweight convolutional neural network with gaussian-based grasping representation for robotic grasping detection," arXiv preprint arXiv:2101.10226, 2021.