National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

Detailed Record

Author: 黃筱晴 (Huang, Siao-Cing)
Title (Chinese): 透過課程強化學習探索可移動物件與避障導航於多樣化 3D 環境搜救任務
Title (English): PokingBot: Exploring Movable Obstacles while Avoid Collisions in Diverse 3D Environments via Curriculum Reinforcement Learning for Search and Rescue Missions
Advisor: 王學誠 (Wang, Hsueh-Cheng)
Committee Members: 王學誠 (Wang, Hsueh-Cheng); 楊谷洋 (Young, Kuu-Young); 吳毅成 (Wu, I-Chen); 陳鴻文 (Chen, Hung-Wen)
Oral Defense Date: 2022-07-11
Degree: Master's
Institution: 國立陽明交通大學 (National Yang Ming Chiao Tung University)
Department: Master's Degree Program in Robotics, College of Engineering
Discipline: Engineering
Field: Other Engineering
Thesis Type: Academic thesis
Year of Publication: 2022
Graduation Academic Year: 110 (2021–2022)
Language: English
Number of Pages: 59
Keywords (Chinese): 可移動式障礙物; 搜救機器人; 避障; 課程學習; 深度強化學習
Keywords (English): Movable Obstacles; Search and Rescue Robots; Collision Avoidance; Curriculum Learning; Deep Reinforcement Learning
Usage statistics:
  • Cited by: 0
  • Views: 151
  • Downloads: 0
  • Bookmarked: 0
Abstract (in Chinese) i
Abstract ii
Acknowledgement iii
Table of Contents iv
List of Figures vi
List of Tables ix

1 Introduction ... 1
2 Related Work ... 4
2.1 Deep Reinforcement Learning for Navigation ... 4
2.2 Transfer Learning ... 5
2.3 Learning by Demonstration ... 6
2.4 Hindsight Experience Replay (HER) ... 6
2.5 Curriculum Learning ... 7
3 PokingBot System Descriptions ... 9
3.1 System Overview ... 9
3.2 System Requirements ... 9
3.3 Assumptions ... 10
3.4 Hardware ... 10
3.5 Manipulation ... 10
3.6 Perception ... 11
3.7 Robot Counterpart in Simulations ... 13
4 Curriculum Reinforcement Learning ... 15
4.1 Design Considerations ... 15
4.2 Reinforcement Learning Settings ... 16
4.3 Curriculum Training Processes ... 19
4.3.1 Literature Review ... 19
4.3.2 Designs ... 20
4.3.3 Curriculum Scheduler ... 21
4.3.4 Analysis of Stage 1 and Stage 2 ... 21
5 Diverse Environments and Difficulty Levels ... 26
5.1 Environment Description ... 26
5.2 Metrics for Quantitative Analysis of Difficulty ... 28
5.3 Curriculum Training in Various Environments ... 29
6 Navigation in Diverse Simulated Environment ... 41
6.1 Metrics ... 41
6.2 Baseline methods ... 41
6.3 Movable Obstacle Conditions ... 42
6.4 Navigation in DARPA Subterranean Challenge Urban Circuit Alpha Run ... 42
6.5 Environment Generalization ... 43
7 Real-world Experiments ... 49
7.1 Navigation in Matterport3D and Actual Counterpart Environments ... 49
8 Conclusions ... 53
References ... 54
[1] M. Stilman, J.-U. Schamburek et al., "Manipulation planning among movable obstacles," in Proceedings of the 2007 IEEE International Conference on Robotics and Automation. IEEE, 2007, pp. 3327–3332.
[2] M. Stilman and J. Kuffner, "Planning among movable obstacles with artificial constraints," The International Journal of Robotics Research, vol. 27, no. 11–12, pp. 1295–1307, 2008.
[3] J. van den Berg, M. Stilman et al., "Path planning among movable obstacles: A probabilistically complete approach," in Algorithmic Foundation of Robotics VIII. Springer, 2009, pp. 599–614.
[4] M. Levihn, J. Scholz, and M. Stilman, "Hierarchical decision-theoretic planning for navigation among movable obstacles," in Algorithmic Foundations of Robotics X. Springer, 2013, pp. 19–35.
[5] C. Devin, A. Gupta et al., "Learning modular neural network policies for multi-task and multi-robot transfer," arXiv preprint arXiv:1609.07088, 2016.
[6] K. Pertsch, Y. Lee, and J. J. Lim, "Accelerating reinforcement learning with learned skill priors," 2020. [Online]. Available: https://arxiv.org/abs/2010.11944
[7] M. Pfeiffer, S. Shukla et al., "Reinforced imitation: Sample efficient deep reinforcement learning for mapless navigation by leveraging prior demonstrations," CoRR, vol. abs/1805.07095, 2018. [Online]. Available: http://arxiv.org/abs/1805.07095
[8] J. Ibarz, J. Tan et al., "How to train your robot with deep reinforcement learning: Lessons we've learned," CoRR, vol. abs/2102.02915, 2021. [Online]. Available: https://arxiv.org/abs/2102.02915
[9] F. Xia, C. Li et al., "ReLMoGen: Integrating motion generation in reinforcement learning for mobile manipulation," in 2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021, pp. 4583–4590.
[10] C. Li, F. Xia et al., "HRL4IN: Hierarchical reinforcement learning for interactive navigation with mobile manipulators," in Conference on Robot Learning. PMLR, 2020, pp. 603–616.
[11] J. L. Elman, "Learning and development in neural networks: The importance of starting small," Cognition, vol. 48, no. 1, pp. 71–99, 1993.
[12] Y. Bengio, J. Louradour et al., "Curriculum learning," in Proceedings of the 26th Annual International Conference on Machine Learning, 2009, pp. 41–48.
[13] P. Mirowski, M. K. Grimes et al., "Learning to navigate in cities without a map," CoRR, vol. abs/1804.00168, 2018. [Online]. Available: http://arxiv.org/abs/1804.00168
[14] T. Matiisen, A. Oliver, T. Cohen, and J. Schulman, "Teacher–student curriculum learning," arXiv preprint arXiv:1707.00183, 2017.
[15] C. Cao, H. Zhu et al., "TARE: A hierarchical framework for efficiently exploring complex 3D environments," in Robotics: Science and Systems Conference (RSS), Virtual, 2021.
[16] M. Pfeiffer, M. Schaeuble et al., "From perception to decision: A data-driven approach to end-to-end motion planning for autonomous ground robots," in 2017 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2017, pp. 1527–1533.
[17] M. Pfeiffer, S. Shukla et al., "Reinforced imitation: Sample efficient deep reinforcement learning for mapless navigation by leveraging prior demonstrations," IEEE Robotics and Automation Letters, vol. 3, no. 4, pp. 4423–4430, 2018.
[18] L. Tai, G. Paolo, and M. Liu, "Virtual-to-real deep reinforcement learning: Continuous control of mobile robots for mapless navigation," in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2017, pp. 31–36.
[19] T. Shan and B. Englot, "LeGO-LOAM: Lightweight and ground-optimized lidar odometry and mapping on variable terrain," in 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2018, pp. 4758–4765.
[20] G. Kahn, P. Abbeel, and S. Levine, "BADGR: An autonomous self-supervised learning-based navigation system," 2020. [Online]. Available: https://arxiv.org/abs/2002.05700
[21] F. Niroui, K. Zhang et al., "Deep reinforcement learning robot for search and rescue applications: Exploration in unknown cluttered environments," IEEE Robotics and Automation Letters, vol. 4, no. 2, pp. 610–617, 2019.
[22] T. P. Lillicrap, J. J. Hunt et al., "Continuous control with deep reinforcement learning," in ICLR, Y. Bengio and Y. LeCun, Eds., 2016.
[23] S. Gu, E. Holly et al., "Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates," in 2017 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2017, pp. 3389–3396.
[24] J.-T. Huang, C.-L. Lu et al., "Cross-modal contrastive learning of representations for navigation using lightweight, low-cost millimeter-wave radar for adverse environmental conditions," IEEE Robotics and Automation Letters, vol. 6, no. 2, pp. 3333–3340, 2021.
[25] G. Kahn, P. Abbeel, and S. Levine, "BADGR: An autonomous self-supervised learning-based navigation system," 2020.
[26] M. Everett, Y. F. Chen, and J. P. How, "Collision avoidance in pedestrian-rich environments with deep reinforcement learning," IEEE Access, vol. 9, pp. 10357–10377, 2021.
[27] ——, "Motion planning among dynamic, decision-making agents with deep reinforcement learning," in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, Sep. 2018. [Online]. Available: https://arxiv.org/pdf/1805.01956.pdf
[28] K. Pertsch, Y. Lee, and J. J. Lim, "Accelerating reinforcement learning with learned skill priors," CoRR, vol. abs/2010.11944, 2020. [Online]. Available: https://arxiv.org/abs/2010.11944
[29] R. Julian, B. Swanson et al., "Never stop learning: The effectiveness of fine-tuning in robotic reinforcement learning," arXiv preprint arXiv:2004.10190, 2020.
[30] A. A. Rusu, N. C. Rabinowitz et al., "Progressive neural networks," arXiv preprint arXiv:1606.04671, 2016.
[31] D. Borsa, A. Barreto et al., "Universal successor features approximators," 2018. [Online]. Available: https://arxiv.org/abs/1812.07626
[32] M. Andrychowicz, F. Wolski et al., "Hindsight experience replay," 2017. [Online]. Available: https://arxiv.org/abs/1707.01495
[33] C. Florensa, D. Held et al., "Reverse curriculum generation for reinforcement learning," 2017. [Online]. Available: https://arxiv.org/abs/1707.05300
[34] P. Soviany, R. T. Ionescu et al., "Curriculum learning: A survey," 2021. [Online]. Available: https://arxiv.org/abs/2101.10382
[35] M. Fang, T. Zhou et al., "Curriculum-guided hindsight experience replay," in Advances in Neural Information Processing Systems, H. Wallach, H. Larochelle et al., Eds., vol. 32. Curran Associates, Inc., 2019. [Online]. Available: https://proceedings.neurips.cc/paper/2019/file/83715fd4755b33f9c3958e1a9ee221e1-Paper.pdf
[36] B. Manela and A. Biess, "Curriculum learning with hindsight experience replay for sequential object manipulation tasks," 2020. [Online]. Available: https://arxiv.org/abs/2008.09377
[37] M. Eppe, S. Magg, and S. Wermter, "Curriculum goal masking for continuous deep reinforcement learning," 2018. [Online]. Available: https://arxiv.org/abs/1809.06146
[38] T. P. Lillicrap, J. J. Hunt et al., "Continuous control with deep reinforcement learning," 2015. [Online]. Available: https://arxiv.org/abs/1509.02971
[39] Y. Song, H. Lin et al., "Autonomous overtaking in Gran Turismo Sport using curriculum reinforcement learning," arXiv preprint arXiv:2103.14666, 2021.
[40] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2015.
[41] T. Foote, "tf: The transform library," in 2013 IEEE International Conference on Technologies for Practical Robot Applications (TePRA), ser. Open-Source Software Workshop, April 2013, pp. 1–6.
[42] A. Segal, D. Haehnel, and S. Thrun, "Generalized-ICP," in Robotics: Science and Systems, vol. 2, no. 4, Seattle, WA, 2009, p. 435.
[43] R. B. Rusu and S. Cousins, "3D is here: Point Cloud Library (PCL)," in IEEE International Conference on Robotics and Automation (ICRA), Shanghai, China, May 9–13, 2011.
[44] G. Grisetti, C. Stachniss, and W. Burgard, "Improved techniques for grid mapping with Rao-Blackwellized particle filters," IEEE Transactions on Robotics, vol. 23, no. 1, pp. 34–46, 2007.
[45] G. Barth-Maron, M. W. Hoffman et al., "Distributed distributional deterministic policy gradients," in International Conference on Learning Representations, 2018. [Online]. Available: https://openreview.net/forum?id=SyZipzbCb
[46] N. Koenig and A. Howard, "Design and use paradigms for Gazebo, an open-source multi-robot simulator," in 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No. 04CH37566), vol. 3, 2004, pp. 2149–2154.
[47] G. Brockman, V. Cheung et al., "OpenAI Gym," 2016. [Online]. Available: https://arxiv.org/abs/1606.01540
[48] B. Tidd, N. Hudson, and A. Cosgun, "Guided curriculum learning for walking over complex terrain," 2020. [Online]. Available: https://arxiv.org/abs/2010.03848
[49] B. Manela and A. Biess, "Curriculum learning with hindsight experience replay for sequential object manipulation tasks," 2020. [Online]. Available: https://arxiv.org/abs/2008.09377
[50] D. Perille, A. Truong, X. Xiao, and P. Stone, "Benchmarking metric ground navigation," arXiv preprint arXiv:2008.13315, 2020.
[51] P. Cignoni, M. Callieri et al., "MeshLab: An open-source mesh processing tool," in Eurographics Italian Chapter Conference, V. Scarano, R. D. Chiara, and U. Erra, Eds. The Eurographics Association, 2008.
[52] C. Cao, H. Zhu et al., "Autonomous exploration development environment and the planning algorithms," 2021. [Online]. Available: https://arxiv.org/abs/2110.14573
[53] C.-L. Lu, Z.-Y. Liu et al., "Assistive navigation using deep reinforcement learning guiding robot with UWB/voice beacons and semantic feedbacks for blind and visually impaired people," Frontiers in Robotics and AI, vol. 8, 2021. [Online]. Available: https://www.frontiersin.org/article/10.3389/frobt.2021.654132
Electronic Full Text: publicly available online from 2027-08-07.
Link to the thesis page at the author's graduating school (provided by the school; electronic full text may not be available there).
Related journal articles: none.