National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)
Detailed Record

Author: 魏妤安
Author (romanized): Wei, Yu-An
Title (Chinese): 基於深度資訊之西洋棋辨識
Title (English): Recognizing Chess Pieces from Depth Cues
Advisor: 陳煥宗
Advisor (romanized): Chen, Hwann-Tzong
Committee members: 賴尚宏、劉庭祿
Committee members (romanized): Lai, Shang-Hong; Liu, Tyng-Luh
Oral defense date: 2017-06-29
Degree: Master's
Institution: 國立清華大學 (National Tsing Hua University)
Department: 資訊工程學系所 (Computer Science)
Discipline: Engineering
Field: Electrical engineering and computer science
Document type: Academic thesis
Year of publication: 2017
Academic year of graduation: 105 (2016–2017)
Language: English
Number of pages: 29
Keywords (Chinese): 三維物件辨識; 方格表示法; 神經網絡
Keywords (English): 3D object recognition; volumetric representation; convolutional neural networks
Cited by: 0
Views: 388
Downloads: 52
Bookmarked: 0
Abstract (translated from Chinese): This thesis proposes a machine-learning method for recognizing the classes of chess pieces from depth information, and integrates the method into a companion robot built to play chess with humans. The robot has two arms that can grip chess pieces and carries an Ensenso N35 3D depth camera for capturing images for recognition. The goal of this thesis is a machine-vision method that identifies the pieces on the chessboard from the depth maps produced by the 3D camera. We build a convolutional neural network to solve this 3D object recognition problem. Applying convolutional neural networks to 3D object recognition has become increasingly popular, but collecting enough training data is a very time-consuming task. We therefore generate training data from 3D CAD models, which is convenient, saves time, and effectively resolves the shortage of training data. The network is trained on the generated data but tested on real captured data. The training data are generated under parameter settings that model different sources of variation, and the experiments in this thesis verify that training data generated with these variations clearly improves accuracy: when testing on real data captured by the 3D camera, accuracy reaches up to 90%.
Abstract (English): This thesis presents a learning-based method for recognizing chess pieces from depth information. The proposed method is integrated into a recreational robotic system designed to play chess against humans. The robot has two arms for gripping the pieces and an Ensenso N35 stereo 3D camera. Our goal is to provide the robot with the visual intelligence to identify the chess pieces on the chessboard using the depth information captured by the 3D camera.
We build a convolutional neural network to solve this 3D object recognition problem.
While training neural networks for 3D object recognition has become popular, collecting enough training data remains a time-consuming task. We demonstrate that it is much more convenient and effective to generate the required training data from 3D CAD models. The network trained on the rendered data performs well on real inputs at test time. More specifically, the experimental results show that rendering the training data from CAD models under varied conditions significantly enhances recognition accuracy. In further evaluations on real data captured by the 3D camera, our method achieves 90.3% accuracy.
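The volumetric representation named in the keywords implies converting each captured depth segment into an occupancy grid before classification. A minimal sketch of that preprocessing step, assuming points arrive as an (N, 3) array back-projected from the depth map — the function name and grid size here are illustrative, not taken from the thesis:

```python
import numpy as np

def voxelize(points, grid=32):
    """Convert an (N, 3) array of 3D points (e.g. back-projected from a
    depth map) into a binary occupancy grid of shape (grid, grid, grid).
    The cloud is rescaled to fit the grid; occupied cells are set to 1."""
    points = np.asarray(points, dtype=np.float64)
    lo = points.min(axis=0)
    span = points.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero on flat axes
    idx = ((points - lo) / span * (grid - 1)).astype(int)
    vol = np.zeros((grid, grid, grid), dtype=np.uint8)
    vol[idx[:, 0], idx[:, 1], idx[:, 2]] = 1
    return vol

# Toy example: a vertical, pawn-like column of 50 points.
pts = np.array([[0.0, 0.0, z] for z in np.linspace(0.0, 1.0, 50)])
v = voxelize(pts, grid=32)
print(v.shape, int(v.sum()))  # -> (32, 32, 32) 32
```

A grid like this can then be fed to a 3D convolutional network; the same voxelization is applied to rendered CAD models and to real depth captures so that both live in one input space.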
Table of Contents
1 Introduction
2 Related Work
2.1 CNN on 2D Images
2.2 CNNs on Depth and 3D Data
2.3 3D Datasets
3 Methods
3.1 Volumetric Convolutional Neural Network
3.1.1 Network Architecture
3.1.2 Training the Network
3.2 Data Augmentation
3.2.1 Gathering Test Data
3.2.2 Synthesizing Training Data
4 Experiments
4.1 Importance of Height Variations
4.2 Analysis on Shape Variations
4.3 System Integration and Improvement
5 Conclusion
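The outline's data-synthesis step (Section 3.2.2) and the height-variation experiment (Section 4.1) point to rendering CAD models under varied conditions. One such variation can be sketched as a random rescaling of the height axis of a model's point set — the scale range and function name are assumptions for illustration, not values from the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)

def vary_height(points, lo=0.8, hi=1.2, rng=rng):
    """Scale the z (height) axis of an (N, 3) point set by a random
    factor, mimicking chess sets whose pieces differ mainly in height."""
    s = rng.uniform(lo, hi)
    out = np.array(points, dtype=np.float64)
    out[:, 2] *= s
    return out, s

# Toy model: 10 points stacked along z.
pts = np.zeros((10, 3))
pts[:, 2] = np.arange(10.0)
aug, s = vary_height(pts)
print(0.8 <= s <= 1.2, np.allclose(aug[:, 2], pts[:, 2] * s))  # -> True True
```

Each augmented point set would then be voxelized into the same grid format as the real captures, so a single rendered model yields many distinct training examples.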