National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

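Graduate student: 楊佩瑄 (YANG, PEI-HSUAN)
Thesis title: 基於單視角彩色影像序列輸入的三維人體骨架估測技術
Thesis title (English): 3D Human Skeleton Estimation Based on Monocular RGB Image Sequence
Advisor: 賴文能 (LIE, WEN-NUNG)
Oral defense committee: 江瑞秋 (CHIANG, JUI-CHIU), 陳自強 (CHEN, TZU-CHIANG), 黃敬群 (HUANG, CHING-CHUN)
Oral defense date: 2021-08-03
Degree: Master's
Institution: National Chung Cheng University (國立中正大學)
Department: Graduate Institute of Electrical Engineering (電機工程研究所)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Publication year: 2021
Graduation academic year: 109 (ROC calendar)
Language: Chinese
Pages: 62
Keywords (Chinese): 深度學習, 三維人體骨架估測, 圖卷積神經網路
Keywords (English): Deep Learning, 3D Human Pose Estimation, Graph Convolutional Network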
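With the rise of artificial intelligence, people rely more and more on machines and interact with them ever more closely. Applications such as surveillance systems, motion-sensing games, and the currently popular technology-assisted fitness coaching all depend on a machine's understanding of human pose. The 3D human skeleton is a widely used representation of human pose, so extracting 3D human skeletons from images or videos is an important task.
This thesis proposes a 3D human skeleton estimation technique that takes a monocular RGB image sequence as input. The network has two stages. The first stage uses a network model to estimate, for each image individually, the 2D skeleton and the relative depth of its joints; these per-frame estimates are then fed as a sequence into the second-stage network. The second stage is built mainly on a Graph Convolutional Network (GCN): it represents the input skeleton sequence as a Spatial Temporal Graph (STG), adds high-order temporal and spatial features, and performs graph convolution in a multi-stream manner with our self-defined adjacency matrix relations, fusing joint coordinates with the connection information among joints to regress the final 3D human skeleton. We design two preprocessing schemes for the second-stage model, aiming to mitigate the incorrect fitting of the second stage that is caused by errors in the first-stage estimates. The first adds Gaussian noise to the relative depth of the input skeleton during training (but not during testing). The second adds an adjustment stage that adjusts the high-order spatio-temporal features of the input (2D + depth) skeleton before the 3D skeleton regression. The method is trained and tested on the Human3.6M dataset. Experimental results show that sequence input gives the model more temporal information and strengthens temporal constraints, and that the noise-resistant design effectively reduces the skeleton estimation error. With a 9-frame input image sequence, our system achieves a Mean Per Joint Position Error (MPJPE) of 48.22 mm on the 3D skeleton joints.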
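Two ideas from the abstract can be illustrated in a few lines: the training-only Gaussian noise on relative depth, and the MPJPE metric. The following is a minimal NumPy sketch and not the thesis implementation. The function names `add_depth_noise` and `mpjpe`, the `(frames, joints, 3)` array layout with channels `(x, y, relative depth)`, the 17-joint skeleton, and the noise level `sigma` are all assumptions made for illustration; the record does not state them.

```python
import numpy as np


def add_depth_noise(skeleton_seq, sigma=0.05, training=True, rng=None):
    """Perturb only the relative-depth channel of a skeleton sequence.

    skeleton_seq: array of shape (T, J, 3); the last axis is assumed to be
    (x, y, relative depth). sigma is a hypothetical noise level.
    """
    seq = np.array(skeleton_seq, dtype=float)  # copy; the input is left untouched
    if not training:
        return seq  # at test time the input passes through unchanged
    rng = np.random.default_rng() if rng is None else rng
    seq[..., 2] += rng.normal(0.0, sigma, size=seq[..., 2].shape)
    return seq


def mpjpe(pred, target):
    """Mean Per Joint Position Error: the Euclidean distance between each
    predicted and ground-truth joint, averaged over all joints and frames.
    The result is in the same unit as the inputs (mm in the abstract)."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.linalg.norm(pred - target, axis=-1).mean())


# Toy usage: a 9-frame sequence with an assumed 17 joints.
clean = np.zeros((9, 17, 3))
noisy = add_depth_noise(clean, rng=np.random.default_rng(0))
error = mpjpe(clean + np.array([3.0, 4.0, 0.0]), clean)  # every joint is off by 5 units
```

Leaving the 2D coordinates untouched and perturbing only the depth channel mirrors the abstract's wording, which mentions noise on the relative depth information only.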

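Abstract
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Related Work
1.2.1 2D Human Skeleton Estimation
1.2.2 3D Human Skeleton Estimation from Single-Frame Input
1.2.3 3D Human Skeleton Estimation from Sequence Input
1.2.4 Human Skeleton Estimation Based on Graph Convolutional Networks (GCN)
1.3 Thesis Organization
Chapter 2 Estimation Model for 2D Skeleton and Relative Depth
2.1 Dataset Introduction
2.2 Pre-trained Model
Chapter 3 Deep Learning Network Architecture Based on Skeleton Sequence Input
3.1 Spatial Temporal Graph (STG) Representation of the Human Skeleton
3.2 Human-Skeleton-Based Spatial Temporal Graph Convolutional Network (HSTGCN)
3.3 Multi-Stream Architecture
3.4 Noise-Resistant Design
3.4.1 Variational Encoder-Decoder (VED)
3.4.2 Denoising Encoder-Decoder (DED)
3.5 Loss Function Design
Chapter 4 Experimental Results and Discussion
4.1 Experimental Environment Setup
4.2 Training Dataset and Hyperparameter Settings
4.3 Evaluation Criteria
4.4 Experimental Results
4.4.1 Variational Encoder-Decoder (VED) Design
4.4.2 Denoising Encoder-Decoder (DED) Design
4.4.3 Ablation Study
4.5 Comparison with Methods in the Literature
4.6 Dataset Test Results
Chapter 5 Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
References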


