
National Digital Library of Theses and Dissertations in Taiwan

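Graduate student: 蔣亦修
Graduate student (English): Yi-Shiu Chiang
Thesis title: 從人機互動中觀察人員的注意力反應以達成機器人提供服務行為之調適
Thesis title (English): Adapting Robot Behaviors for Providing Services through Observing Human's Attention Responses from Human-Robot Interactions
Advisor: 傅立成
Advisor (English): Li-Chen Fu
Oral defense committee: 黃漢邦、簡忠漢、連豊力、王傑智
Oral defense committee (English): Han-Pang Huang, Chung-Han Chien, Feng-Li Lian, Chieh-Chih Wang
Oral defense date: 2014-07-22
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Publication year: 2015
Academic year of graduation: 103 (ROC calendar)
Language: English
Pages: 79
Keywords (Chinese): 機器人行為調適、互動學習、動態貝氏網路、人機互動、日常生活輔助
Keywords (English): Robot Behavior Adaptation, Interactively Learning, Dynamic Bayesian Network, Human-robot Interaction, Robots in Daily Life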
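When robots serve in private environments such as homes and nursing centers, getting them to take personal preferences into account so as to maintain social norms between people and robots becomes an important issue. In human-human interaction, people unconsciously learn each person's implicit social rules; robots, by contrast, struggle to read the situation during an interaction, so they disturb what people are doing and create an unpleasant interaction experience. To give robots this social ability, this master's thesis proposes a human-aware interactive learning framework with which the robot learns how much its own behavior disturbs the user and optimizes its service-providing behavior while inferring the user's social attention. To this end, we propose a Human-Aware Markov Decision Process to describe this class of problems, which require planning the robot's behavior and inferring the user's social attention at the same time. For the social attention model, we adopt a Dynamic Bayesian Network to infer the probability that the user is aware of the robot's presence. In addition, the relationships among the user's social attention, the robot's own estimate of the user's awareness, and the robot's actions are explored through reinforcement learning. Meanwhile, to achieve more natural robot interaction, the user's current mood feedback is roughly estimated by body-posture-based affect recognition. Finally, the thesis carries out several experiments based on real social scenarios to validate the proposed human-aware interactive learning framework.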

Robots that serve humans in private places, such as homes or senior centers, must consider humans' preferences in order to behave in a socially acceptable manner. Human beings subconsciously adapt how they start a conversation according to their past interaction experiences, but robots often fail to do this and end up disrupting their users. To endow service robots with this social ability, this thesis proposes an online human-aware interactive learning framework, under which the robot optimizes its service-providing behavior while inferring the user's awareness of the robot itself. To this end, a human-aware Markov decision process (HAMDP) is proposed to model this kind of problem, which requires planning robot actions and inferring the user's social attention concurrently. The social attention inference model is based on a Dynamic Bayesian Network (DBN), which is also employed to infer the probability of the user's awareness of the robot, i.e., the robot's theory of awareness. The correlations among the robot's theory of awareness, the user's social attention, and the robot's behavior are explored through reinforcement learning. In addition, to let the robot behave more naturally, the user's mood is estimated by recognizing his/her gross affective state from body gestures. To verify the effectiveness of the proposed framework, experiments with real social scenarios have been conducted.
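
As a rough illustration only, and not the thesis's implementation, the two ingredients the abstract names can be sketched together: forward filtering over a hidden "user is aware of the robot" variable, in the spirit of a two-state DBN, and a tabular Q-learning update over robot actions. Every probability, action name, and the belief discretization below is a hypothetical placeholder chosen for the example; the record itself does not give the model's parameters, states, or rewards.

```python
import random
from collections import defaultdict

# Hypothetical two-state model of a hidden "aware" variable (all values assumed).
P_STAY_AWARE = 0.8       # P(aware at t | aware at t-1)
P_BECOME_AWARE = 0.3     # P(aware at t | not aware at t-1)
P_GAZE_IF_AWARE = 0.7    # P(user looks at robot | aware)
P_GAZE_IF_UNAWARE = 0.1  # P(user looks at robot | not aware)

ACTIONS = ["wait", "approach", "speak"]  # hypothetical primitive robot actions


def filter_awareness(belief, gaze_observed):
    """One forward-filtering step: predict the belief, then apply Bayes' rule."""
    predicted = belief * P_STAY_AWARE + (1.0 - belief) * P_BECOME_AWARE
    like_aware = P_GAZE_IF_AWARE if gaze_observed else 1.0 - P_GAZE_IF_AWARE
    like_unaware = P_GAZE_IF_UNAWARE if gaze_observed else 1.0 - P_GAZE_IF_UNAWARE
    numer = like_aware * predicted
    return numer / (numer + like_unaware * (1.0 - predicted))


def discretize(belief, bins=3):
    """Map a belief in [0, 1] to a small integer state for the Q-table."""
    return min(int(belief * bins), bins - 1)


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def choose_action(q, state, epsilon=0.1, rng=random):
    """Epsilon-greedy selection over the hypothetical action set."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


# Usage: update the awareness belief from one observation, then learn from one step.
q_table = defaultdict(float)
belief = filter_awareness(0.5, gaze_observed=True)
state = discretize(belief)
action = choose_action(q_table, state)
q_update(q_table, state, action, reward=1.0, next_state=state)
```

In this sketch the filtered belief plays the role of the robot's estimate of the user's awareness, and feeding a discretized version of it into the Q-table is one simple way to let action selection depend on that estimate; the reward here is a stand-in for whatever feedback signal is actually used.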

Oral Examination Committee Certification
Acknowledgements
Abstract (Chinese)
Abstract
Contents
List of Figures
List of Tables
1 Introduction
1.1 Background and Related Work
1.1.1 Initiating Human-Robot Engagements
1.1.2 Adapting Robot Behaviors to User Preferences
1.1.3 Robots for the Elderly
1.2 Objective and Contributions
1.3 Thesis Organization
2 Preliminaries
2.1 Machine Perception of Humans
2.1.1 Human Position Detection and Tracking
2.1.2 Human-Context Estimation
2.1.3 Multi-Human Spatial Social Patterns
2.2 Markov Models
2.2.1 Dynamic Bayesian Networks
2.2.2 Markov Decision Processes
2.3 Reinforcement Learning
2.3.1 Standard Modeling
2.3.2 Value Functions and Action-Value Functions
2.3.3 Q-Learning
3 Interactive Learning Framework for Service Providing
3.1 Service Robots in Private Spaces
3.2 Human-Aware Markov Decision Processes
3.2.1 Human’s Attention to the Robot
3.2.2 Human Task Context of the Human Partner
3.2.3 Robot’s Theory of Awareness
3.3 System Architecture
3.4 Social Attention Inference Model
3.4.1 Model Parameters
3.4.2 Learning and Inference
3.5 Pleasantness-Unpleasantness Estimation
3.6 Human-Aware Interactive Learner
3.6.1 Primitive Robot Actions
3.6.2 Deliberate Policy of Action Selection
3.7 Overall Algorithm
4 Evaluation
4.1 Environment and Robotic Platform Settings
4.2 Inference from Human Nonverbal Responses
4.2.1 Experimental Settings and Data Collection
4.2.2 Result of Social Attention Model and Theory of Awareness
4.2.3 Result of Pleasantness-Unpleasantness Recognition
4.3 Performance of Interactive Learning Framework
4.4 Personalization for Service Providing Strategy
4.4.1 Experimental Settings
4.4.2 Result and Discussion
5 Conclusion
References

