National Digital Library of Theses and Dissertations in Taiwan


Detailed Record

Author: 楊世遠 (YANG, SHI-YUAN)
Title: 植基於空洞沙漏網路之人體姿態估測
Title (English): Human Pose Estimation Based on Dilated Hourglass
Advisor: 賴智錦 (LAI, CHIH-CHIN)
Committee: 葉瑞峰 (YEH, JUI-FENG), 潘欣泰 (PAN, SHING-TAI), 賴智錦 (LAI, CHIH-CHIN)
Defense Date: 2022-07-27
Degree: Master's
Institution: 國立高雄大學 (National University of Kaohsiung)
Department: 電機工程學系碩博士班 (Department of Electrical Engineering, Master's and Doctoral Program)
Discipline: Engineering
Field: Electrical and Computer Engineering
Document Type: Academic thesis
Publication Year: 2022
Graduation Academic Year: 110
Language: Chinese
Pages: 46
Keywords (Chinese): 人體姿態估測、卷積神經網路、空洞沙漏網路
Keywords (English): human pose estimation, convolutional neural networks, dilated hourglass networks
Human pose estimation refers to estimating the positions of human body joints from images, together with the relationships between joints, from which the human skeleton is constructed. Because it can be applied in many different fields, human pose estimation has long been a popular research topic in computer vision. In this thesis, we propose a method for human pose estimation based on dilated hourglass networks. This architecture combines multi-resolution image information with multi-scale image features, which yields strong performance for human pose estimation. Experimental results on the Leeds Sports Pose and MPII datasets illustrate the feasibility of the proposed approach.
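The dilated convolution that gives the network its name can be illustrated in isolation: spacing the kernel taps `dilation` samples apart enlarges the receptive field without adding parameters, which is how the architecture gathers multi-scale context. Below is a minimal pure-Python sketch of the idea in 1-D (illustrative only; the function name and setting are our own, not the thesis implementation):

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D convolution whose taps are spaced `dilation` apart."""
    k = len(kernel)
    span = (k - 1) * dilation + 1          # effective receptive field
    out = []
    for start in range(len(signal) - span + 1):
        acc = 0
        for i in range(k):
            acc += kernel[i] * signal[start + i * dilation]
        out.append(acc)
    return out

# A 3-tap kernel with dilation 2 covers 5 input samples per output:
x = [1, 2, 3, 4, 5, 6]
print(dilated_conv1d(x, [1, 1, 1], dilation=1))  # [6, 9, 12, 15]
print(dilated_conv1d(x, [1, 1, 1], dilation=2))  # [9, 12]
```

With a 3-tap kernel, dilation 2 covers five input samples per output instead of three, so stacking such layers grows the receptive field rapidly while the parameter count stays fixed.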
Abstract (Chinese) i
ABSTRACT ii
Acknowledgments iii
Table of Contents iv
List of Figures v
List of Tables vi
Chapter 1 Introduction 1
1.1 Motivation and Objectives 1
1.2 Methods and Thesis Organization 1
Chapter 2 Literature Review 3
2.1 Human Pose Estimation Based on Pictorial Structures 3
2.2 Human Pose Estimation Based on Deep Learning 4
2.3 Human Pose Estimation Based on Hourglass Networks 7
Chapter 3 Methodology 9
3.1 Human Pose Estimation System 9
3.1.1 Ground-Truth Annotation of Human Joints 10
3.1.2 Feature Maps and Heatmaps 11
3.2 Hourglass Networks 12
3.3 Stacked Hourglass Networks with Dilated Convolutions 14
3.3.1 Dilated Convolutions 16
3.3.2 Stacked Hourglass Networks 17
3.4 Loss Function 18
Chapter 4 Experimental Results 20
4.1 Experimental Environment 20
4.2 Human Pose Image Datasets 21
4.3 Results and Analysis 23
4.3.1 Experiment 1 24
4.3.2 Experiment 2 26
4.3.3 Experiment 3 29
Chapter 5 Conclusions and Future Work 32
References 34
