Author (English): WU, KUAN-HUI
Thesis title (English): A Study of a Deep Hybrid Model with Loss Function Adjustment in Imbalanced Class Activity Recognition
Advisor (English): Alan Liu
Oral examination committee (English): Lee, Yun-Zhong; Chiu, Chih-Yi
Keywords (English): human activity recognition; deep learning; hybrid models; class imbalance; loss function
In recent years, as deep learning has matured, sensor-based human activity recognition in smart homes has received increasing attention. The field still faces many challenges, and class imbalance is one of them. This thesis addresses human activity recognition under class imbalance with a deep learning model. The model builds on a hybrid of convolutional neural networks and recurrent neural networks, in which the convolutional neural network extracts spatial features and the recurrent neural network extracts temporal features, and the study improves the feature extraction ability of both parts. For spatial feature extraction, an inception module adds multi-scale features and dilated convolution enlarges the convolutional field of view. In addition, an attention mechanism is added so that the model focuses on important information, and a bidirectional LSTM (Long Short-Term Memory) adds reverse-direction sequence features to improve temporal feature extraction. Finally, on top of the improved model, focal loss is used to adjust the loss function so that the model focuses on minority classes, which mitigates the class imbalance problem. The experiments verify the classification ability gained from each improvement and confirm that the model itself classifies better under imbalanced classes. The F1-score is used as the evaluation metric to verify the improvement in classification ability of the proposed architecture.
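The loss function adjustment and the evaluation metric named in the abstract can be illustrated with a minimal sketch. It assumes the standard multi-class focal loss form, -alpha_t * (1 - p_t)^gamma * log(p_t), and a macro-averaged F1-score. The function names, the default gamma = 2.0, the optional per-class alpha weights, and the use of NumPy are assumptions made for illustration; the abstract does not state the thesis's actual settings, averaging scheme, or framework.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax; the row maximum is subtracted for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def focal_loss(logits, labels, gamma=2.0, alpha=None):
    """Mean multi-class focal loss (illustrative; gamma and alpha are assumed values).

    logits: array of shape (n_samples, n_classes) with raw scores
    labels: integer class indices of shape (n_samples,)
    alpha:  optional per-class weights of shape (n_classes,)
    """
    probs = softmax(np.asarray(logits, dtype=float))
    labels = np.asarray(labels)
    # Probability the model assigns to the true class of each sample.
    p_t = np.clip(probs[np.arange(len(labels)), labels], 1e-12, 1.0)
    # Well-classified samples (p_t near 1) are down-weighted by (1 - p_t)^gamma.
    weights = (1.0 - p_t) ** gamma
    if alpha is not None:
        weights = weights * np.asarray(alpha, dtype=float)[labels]
    return float(np.mean(-weights * np.log(p_t)))


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1, so minority classes count equally."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    scores = []
    for c in range(num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        # A class that is absent from both labels and predictions scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))
```

With gamma = 0 and no alpha, this focal loss reduces to ordinary cross-entropy. Larger gamma values shrink the contribution of samples the model already classifies confidently, which is how the loss shifts attention toward harder, often minority-class, samples.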
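Acknowledgments
Abstract (Chinese)
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Thesis Organization
Chapter 2 Literature Review
2.1 Introduction to Activity Recognition and Its Challenges
2.2 Introduction to the Activity Recognition Chain
2.2.1 Sensor Data Acquisition and Preprocessing
2.2.2 Data Segmentation
2.2.3 Feature Extraction and Selection
2.2.4 Classification
2.3 Survey of Deep Learning Methods
2.3.1 Convolutional neural networks
2.3.2 Recurrent neural networks
2.3.3 Hybrid models
2.3.4 Comparison of Models
2.4 Methods for Addressing Class Imbalance
2.4.1 Re-sampling
2.4.2 Re-weighting
2.4.3 Comparison of Methods
2.5 Summary
Chapter 3 Research Method
3.1 Research Architecture
3.2 Spatial Feature Extraction
3.3 Temporal Feature Extraction
3.4 Loss Function Adjustment
3.5 Discussion
Chapter 4 Experimental Results
4.1 Training Data and Environment
4.2 Evaluation Method
4.3 Experimental Design and Results
4.3.1 Experiment 1: Improved Temporal Feature Extraction
4.3.2 Experiment 2: Improved Handling of Class Imbalance
4.3.3 Experiment 3: Improved Spatial Feature Extraction
4.3.4 Experiment 4: The Proposed Architecture
4.3.5 Discussion
Chapter 5 Conclusion and Future Work
References
Appendix 1 Confusion Matrices
Appendix 2 Hyperparameter Settings of the Proposed Architecture
Appendix 3 Explanation of the Attention Mechanism Code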
Electronic full text (Internet release date: 2022-08-31)