Taiwan National Digital Library of Theses and Dissertations

Detailed Record

Author: 葉旭 (村永旭)
Author (English): Xu Ye (Akira Muranaga)
Title (Chinese): 雙向Transformers於骨架動作預測之應用
Title (English): On Human Motion Prediction Using Bidirectional Encoder Representations from Transformers
Advisor: 方文賢 (Wen-Hsien Fang)
Committee members: Yie-Tarng Chen (陳郁堂), Kuen-Tsair Lay (賴坤財), Chien-ching Chiu (丘建青), Sheng-Luen Chung (鍾聖倫)
Oral defense date: 2019-07-31
Degree: Master's
Institution: National Taiwan University of Science and Technology (國立臺灣科技大學)
Department: Department of Electronic Engineering (電子工程系)
Discipline: Engineering
Field: Electrical and Computer Engineering
Document type: Academic thesis
Year of publication: 2019
Graduation academic year: 107 (2018–2019)
Language: English
Pages: 74
Keywords (Chinese): 注意力機制 (attention mechanism); 骨架動作預測 (skeleton-based motion prediction)
Keywords (English): transformer; human motion prediction
Usage statistics:
  • Cited: 0
  • Views: 229
  • Downloads: 0
  • Bookmarks: 0
Pose prediction finds applications in a variety of areas.
However, current methods based on recurrent neural networks suffer from error accumulation during training. Furthermore, encoder-decoder architectures generally fail to predict continuous poses across the transition between the end of the encoder input and the beginning of the decoder output.
Benefiting from the recent successes of the attention mechanism, this thesis proposes a novel method that combines the Transformer encoder architecture with the Universal Transformer.
The new architecture is free of error accumulation because it processes all time steps in parallel and updates every position with equal weight. Moreover, the proposed attention map helps the attention mechanism keep the predicted poses free of discontinuities.
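The core of the Transformer encoder described above is scaled dot-product attention, which computes all positions in parallel. As a minimal sketch, the snippet below implements the standard formulation from Vaswani et al. with an optional additive bias standing in for an attention map; the thesis's specific attention map design is not given here, so the `mask` argument is purely illustrative.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V, mask=None):
    """Standard scaled dot-product attention (Vaswani et al., 2017).

    Q, K, V: (seq_len, d_k) arrays. mask: optional (seq_len, seq_len)
    additive bias, e.g. 0 for allowed positions and -1e9 for
    suppressed ones (illustrative stand-in for an attention map).
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # (seq_len, seq_len) similarities
    if mask is not None:
        scores = scores + mask               # bias the attention distribution
    # numerically stable row-wise softmax
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V                       # weighted sum of value vectors

# toy self-attention example: 4 time steps, d_k = 8
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(Q, Q, Q)
print(out.shape)  # (4, 8)
```

Because every output position is computed from the same softmax-weighted sum in one pass, no recurrent state is threaded through time, which is why errors do not accumulate across predicted frames.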
We also apply the adaptive computation time (ACT) algorithm to optimize the number of iterations of the attention mechanism.
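The adaptive computation time mechanism (Graves, 2016), as used in Universal Transformers, can be sketched as a halting loop: each iteration refines the state and emits a halting probability, and iteration stops once the accumulated probability reaches a threshold. The `step_fn` and `halt_fn` below are toy stand-ins, not the thesis's actual attention and halting networks.

```python
import numpy as np

def act_iterations(state, step_fn, halt_fn, max_steps=10, eps=0.01):
    """Adaptive computation time for a single position (sketch).

    step_fn: one refinement step (stands in for an attention iteration).
    halt_fn: maps a state to a halting probability in (0, 1).
    Returns the halting-weighted state and the number of steps used.
    """
    cum_p, remainder = 0.0, 1.0
    weighted = np.zeros_like(state)
    for n in range(1, max_steps + 1):
        state = step_fn(state)
        p = halt_fn(state)
        if cum_p + p >= 1.0 - eps or n == max_steps:
            weighted += remainder * state    # spend the leftover probability
            return weighted, n               # halted after n iterations
        cum_p += p                           # accumulate halting probability
        remainder -= p
        weighted += p * state                # accumulate weighted states

# toy example: decaying state with a sigmoid halting unit
out, steps = act_iterations(
    np.ones(3),
    step_fn=lambda s: 0.9 * s,                       # toy "refinement"
    halt_fn=lambda s: 1.0 / (1.0 + np.exp(-s.mean())),
)
print(steps)
```

The payoff is that easy inputs halt after few iterations while harder ones receive more computation, instead of every input paying for a fixed iteration count.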
The mean absolute loss is adopted to train the model for human motion prediction on the Human3.6M dataset.
Simulations show that the proposed method outperforms the main state-of-the-art approaches.
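The mean absolute loss mentioned above is simply the L1 error averaged over all predicted frames and pose dimensions; the 54-dimensional pose vector in this sketch is an illustrative assumption, not necessarily the thesis's exact parameterization.

```python
import numpy as np

def mean_absolute_loss(pred, target):
    """Mean absolute (L1) error over all frames and pose dimensions."""
    return np.mean(np.abs(pred - target))

# toy example: 10 predicted frames of a 54-dimensional pose vector
pred = np.zeros((10, 54))
target = np.full((10, 54), 0.5)
print(mean_absolute_loss(pred, target))  # 0.5
```

Compared with a squared-error loss, the L1 loss penalizes large joint-angle errors less aggressively, which tends to make training less sensitive to outlier frames.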
Table of contents
Abstract
Acknowledgment
Table of contents
List of Figures
List of Tables
1 Introduction
  1.1 Human Motion Prediction
  1.2 Motivations
  1.3 Contributions
  1.4 Thesis Outline
2 Related Work
  2.1 Modeling of Human Motion Prediction
  2.2 Loss Function
  2.3 Generative Adversarial Nets
  2.4 Transformers
3 Proposed Method
  3.1 Overall Methodology
  3.2 Data Pre-processing and Position Encoding
  3.3 Transformer Encoder Stack
    3.3.1 Scaled Dot-Product Attention
    3.3.2 Multi-Head Attention
    3.3.3 Position-wise Feed-Forward Networks
  3.4 Universal Transformers
  3.5 Loss Function
4 Experimental Result
  4.1 Evaluation Protocol and Experimental Setup
  4.2 Ablation Studies
    4.2.1 Data Pre-processing
    4.2.2 Transformer Configuration
    4.2.3 Loss Function
  4.3 Comparison With State-of-the-Art Methods
5 Conclusion and Future Works
  5.1 Conclusion
  5.2 Future Works
Appendix A: Class-wise ablation studies
Appendix B: Performance comparison of state-of-the-art method
Appendix C: Visualization of attention distributions
References