[1] “A 2019 Guide to Human Pose Estimation with Deep Learning,” The Nanonets Blog, 12-Apr-2019. [Online]. Available: https://nanonets.com/blog/human-pose-estimation-2d-guide/. [Accessed: 26-Jul-2019].
[2] H. Xu, A. Das, and K. Saenko, “R-C3D: Region Convolutional 3D Network for Temporal Activity Detection,” in 2017 IEEE International Conference on Computer Vision (ICCV), 2017, pp. 5794–5803.
[3] K. Simonyan and A. Zisserman, “Two-Stream Convolutional Networks for Action Recognition in Videos,” in Advances in Neural Information Processing Systems 27, Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, Eds. Curran Associates, Inc., 2014, pp. 568–576.
[4] J. Yue-Hei Ng, M. Hausknecht, S. Vijayanarasimhan, O. Vinyals, R. Monga, and G. Toderici, “Beyond Short Snippets: Deep Networks for Video Classification,” in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 4694–4702.
[5] A. Diba, V. Sharma, and L. Van Gool, “Deep Temporal Linear Encoding Networks,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 1541–1550.
[6] L. Wang et al., “Temporal Segment Networks: Towards Good Practices for Deep Action Recognition,” in Computer Vision – ECCV 2016, vol. 9912, B. Leibe, J. Matas, N. Sebe, and M. Welling, Eds. Cham: Springer International Publishing, 2016, pp. 20–36.
[7] H. Wang and L. Wang, “Beyond Joints: Learning Representations from Primitive Geometries for Skeleton-Based Action Recognition and Detection,” IEEE Transactions on Image Processing, vol. 27, no. 9, pp. 4382–4394, Sep. 2018.
[8] C. Li, Q. Zhong, D. Xie, and S. Pu, “Co-occurrence Feature Learning from Skeleton Data for Action Recognition and Detection with Hierarchical Aggregation,” in Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI), Stockholm, Sweden, 2018, pp. 786–792.
[9] S. Yan, Y. Xiong, and D. Lin, “Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition,” arXiv:1801.07455 [cs], Jan. 2018.
[10] J. Martinez, M. J. Black, and J. Romero, “On Human Motion Prediction Using Recurrent Neural Networks,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 2017, pp. 4674–4683.
[11] Y. Tang, L. Ma, W. Liu, and W.-S. Zheng, “Long-Term Human Motion Prediction by Modeling Motion Context and Enhancing Motion Dynamics,” in Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI), Stockholm, Sweden, 2018, pp. 935–941.
[12] K. Fragkiadaki, S. Levine, P. Felsen, and J. Malik, “Recurrent Network Models for Human Dynamics,” in 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 2015, pp. 4346–4354.
[13] A. Jain, A. R. Zamir, S. Savarese, and A. Saxena, “Structural-RNN: Deep Learning on Spatio-Temporal Graphs,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, pp. 5308–5317.
[14] M. T. Hagan, H. B. Demuth, M. H. Beale, and O. De Jesús, Neural Network Design, 2nd ed. (“NNDesign.pdf”).
[15] “File:Artificial neural network.svg,” Wikipedia, The Free Encyclopedia.
[16] Cecbur, “Three filters (kernels, neurons) in the first layer of a convolutional artificial neural network interpreting an image,” Wikimedia Commons, 2019.
[17] M. Schuster and K. K. Paliwal, “Bidirectional Recurrent Neural Networks,” IEEE Transactions on Signal Processing, vol. 45, no. 11, pp. 2673–2681, Nov. 1997.
[18] “Understanding LSTM Networks -- colah’s blog.” [Online]. Available: http://colah.github.io/posts/2015-08-Understanding-LSTMs/. [Accessed: 26-Jul-2019].
[19] “2604.pdf.”
[20] K. Cho et al., “Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 2014, pp. 1724–1734.
[21] M. Mohammadi, R. Mundra, and R. Socher, “CS 224D: Deep Learning for NLP,” Lecture Notes, Stanford University.
[22] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, “Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling,” arXiv:1412.3555 [cs], Dec. 2014.
[23] B. Raj, “An Overview of Human Pose Estimation with Deep Learning,” Medium, 01-May-2019. [Online]. Available: https://medium.com/beyondminds/an-overview-of-human-pose-estimation-with-deep-learning-d49eb656739b. [Accessed: 26-Jul-2019].
[24] S.-E. Wei, V. Ramakrishna, T. Kanade, and Y. Sheikh, “Convolutional Pose Machines,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 4724–4732.
[25] A. Newell, K. Yang, and J. Deng, “Stacked Hourglass Networks for Human Pose Estimation,” arXiv:1603.06937 [cs], Mar. 2016.
[26] G. Papandreou et al., “Towards Accurate Multi-person Pose Estimation in the Wild,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 3711–3719.
[27] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137–1149, Jun. 2017.
[28] K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,” in 2017 IEEE International Conference on Computer Vision (ICCV), 2017, pp. 2980–2988.
[29] H.-S. Fang, S. Xie, Y.-W. Tai, and C. Lu, “RMPE: Regional Multi-person Pose Estimation,” in 2017 IEEE International Conference on Computer Vision (ICCV), 2017, pp. 2353–2362.
[30] Z. Cao, T. Simon, S.-E. Wei, and Y. Sheikh, “Realtime Multi-person 2D Pose Estimation Using Part Affinity Fields,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 1302–1310.
[31] C. Li, Z. Zhang, W. S. Lee, and G. H. Lee, “Convolutional Sequence to Sequence Model for Human Dynamics,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, 2018, pp. 5226–5234.
[32] S. Toyer, A. Cherian, T. Han, and S. Gould, “Human Pose Forecasting via Deep Markov Models,” in 2017 International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2017, pp. 1–8.
[33] J. Redmon and A. Farhadi, “YOLOv3: An Incremental Improvement,” arXiv:1804.02767 [cs], Apr. 2018.
[34] W. Yang, S. Li, W. Ouyang, H. Li, and X. Wang, “Learning Feature Pyramids for Human Pose Estimation,” arXiv:1708.01101 [cs], Aug. 2017.
[35] M. Jaderberg, K. Simonyan, A. Zisserman, and K. Kavukcuoglu, “Spatial Transformer Networks,” arXiv:1506.02025 [cs], Jun. 2015.
[36] A. Karpathy, “The Unreasonable Effectiveness of Recurrent Neural Networks.” [Online]. Available: http://karpathy.github.io/2015/05/21/rnn-effectiveness/. [Accessed: 26-Jul-2019].
[37] B. Fortuner, “Intro to Threads and Processes in Python,” Medium, 07-Sep-2017. [Online]. Available: https://medium.com/@bfortuner/python-multithreading-vs-multiprocessing-73072ce5600b. [Accessed: 26-Jul-2019].
[38] “Multi-Process Service: Introduction,” NVIDIA Documentation. [Online]. Available: http://docs.nvidia.com/deploy/mps/index.html. [Accessed: 26-Jul-2019].
[39] “COCO – Common Objects in Context: Keypoint Detection Task.” [Online]. Available: http://cocodataset.org/#keypoints-2018. [Accessed: 26-Jul-2019].
[40] CMU-Perceptual-Computing-Lab, “OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation,” GitHub repository, 2019. [Online]. Available: https://github.com/CMU-Perceptual-Computing-Lab/openpose.
[41] A. Graves, “Generating Sequences With Recurrent Neural Networks,” arXiv:1308.0850 [cs], Aug. 2013.
[42] A. Shahroudy, J. Liu, T.-T. Ng, and G. Wang, “NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, pp. 1010–1019.
[43] “Rapid-Rich Object Search (ROSE) Lab.” [Online]. Available: http://rose1.ntu.edu.sg/datasets/actionrecognition.asp. [Accessed: 26-Jul-2019].
[44] “MPII Human Pose Database.” [Online]. Available: http://human-pose.mpi-inf.mpg.de/. [Accessed: 26-Jul-2019].
[45] “FLIC Dataset.” [Online]. Available: https://bensapp.github.io/flic-dataset.html. [Accessed: 26-Jul-2019].