Author: 蔡冠廷
Author (English): Kuan-Ting Tsai
Title: 四何學習用於演變環境中的行為預測:以智慧管家為例
Title (English): Quadro-W Learning for Behavior Prediction in Evolved Environment: Case Study of Intelligent Butler
Advisor: 鄭憲宗
Advisor (English): Sheng-Tzong Cheng
Degree: Master's
Institution: National Cheng Kung University (國立成功大學)
Department: Computer Science and Information Engineering (資訊工程學系)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2020
Graduation academic year: 108 (ROC calendar)
Language: English
Pages: 46
Keywords (Chinese): 影音識別; 行為預測; Q學習; 智慧家庭
Keywords (English): Audiovisual Recognition; Behavior Prediction; Q-Learning; Intelligent House
Abstract (Chinese, translated): In recent years, with advances in embedded hardware (sensors, microprocessors, etc.), the maturation of software technology, the spread of the internet, and falling prices, embedded systems have been widely used in many settings, including crop growth monitoring, product defect detection, traffic system management, and patient vital-sign monitoring. However, to obtain enough data for good analysis results, the application environment is usually filled with all kinds of sensors, which raises many problems: the extra hardware alters the original structure of the environment, system setup is too time-consuming, and the large amount of hardware makes the system expensive and difficult to maintain. The required hardware should therefore be reduced without affecting the analysis results. In addition, besides the hardware used to collect environmental data, each application scenario has corresponding software models for analyzing that data. These models are usually configured for fixed conditions in the environment and cannot evolve along with it; because the models cannot learn by themselves, the flexibility and life cycle of the system are reduced.
In this study, we propose a Quadro-W Learning method to predict human behavior, where the four W's are the person (Who), the object (What), the place (Where), and the time (When). We obtain the Quadro-W information from the data captured by a camera alone, without additional sensing hardware, and build a behavior prediction model on top of it. The model not only makes predictions based on the initial environment but can also be updated as people's living habits change, increasing its usability and flexibility.
In recent years, with the progress of embedded hardware devices (sensors, microprocessors, etc.), the maturity of software technology, the popularity of the internet, and the decline in prices, embedded systems have been widely used in various scenarios, including crop growth monitoring, product defect detection, transportation system management, and vital-sign monitoring. However, in order to obtain enough information for good analysis, the application environment is usually filled with various sensors, which creates many problems: the hardware devices alter the original environment, the initial system setup is too time-consuming, and the large number of devices makes the system expensive and difficult to maintain. The required hardware should therefore be reduced without affecting the results. In addition to the hardware used to collect environmental information, each application scenario has corresponding software models for analyzing that information. These models are usually configured for particular conditions in the environment and cannot evolve as the environment evolves; because the models cannot learn by themselves, the flexibility and life cycle of the system are reduced.
In this study, we propose a Quadro-W Learning (QW-Learning) method to predict human behavior. Quadro-W refers to the person (Who), the object (What), the place (Where), and the time (When). We obtain the Quadro-W information solely from data collected by a camera, without extra sensors, and build a behavior prediction model from it. This model not only makes predictions based on the initial environment but can also evolve with the environment, increasing its flexibility and life cycle.
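To make the idea concrete, the abstract's evolving predictor can be sketched as tabular Q-learning over (Who, What, Where, When) states. This is a minimal illustrative sketch, not the thesis's actual model: the state encoding, action set, reward scheme, and hyperparameters are all assumptions.

```python
# Hypothetical sketch: tabular Q-learning over Quadro-W states.
# The behavior set, reward scheme, and encoding are illustrative assumptions.
from collections import defaultdict
import random

class QWPredictor:
    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)   # Q[(state, action)] -> estimated value
        self.actions = actions        # candidate behaviors to predict
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def state(self, who, what, where, when):
        # Encode the four W's (e.g. when = hour of day) as a hashable state.
        return (who, what, where, when)

    def predict(self, s):
        # Epsilon-greedy: usually the best-known behavior, sometimes explore.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(s, a)])

    def update(self, s, a, reward, s_next):
        # Standard Q-learning update; reward could be 1 when the predicted
        # behavior matches the observed one, so the table tracks changing habits.
        best_next = max(self.q[(s_next, a2)] for a2 in self.actions)
        self.q[(s, a)] += self.alpha * (reward + self.gamma * best_next
                                        - self.q[(s, a)])
```

Because the Q-table is updated continuously from observations, the predictor adapts as habits change, which is the "evolved environment" property the abstract describes.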
Abstract (Chinese) I
Abstract II
ACKNOWLEDGEMENT IV
LIST OF CONTENTS V
LIST OF FIGURES VI
LIST OF TABLES VII
Chapter 1. Introduction 1
1.1 Introduction & Motivation 1
1.2 Thesis Overview 3
Chapter 2. Background & Related Work 4
2.1 Background 4
2.1.1 Action Recognition 4
2.1.2 Temporal Action Detection 8
2.2 Related Work 9
2.2.1 Residual Block 9
2.2.2 Q-Learning 12
Chapter 3. Method 14
3.1 Problem Description 14
3.2 System Architecture 15
3.3 Data Pre-processing 17
3.4 Quadro-W Model 19
3.4.1 Human Detection & Recognition 19
3.4.2 Object Detection & Recognition 21
3.4.3 Place Recognition 22
3.4.4 Sound Split & Recognition 23
3.5 Quadro-W Information Merge 25
3.6 Evolved Behavior Prediction 30
Chapter 4. Experiment 34
4.1 Experiment Environment Setup 34
4.2 Implementation 34
4.3 Experiment Result 35
Chapter 5. Conclusion & Future Work 43
Reference 45