[1] World Health Organization, "Deafness and hearing loss," 2024. [Online]. Available: https://www.who.int/news-room/fact-sheets/detail/deafness-and-hearing-loss (visited on 05/18/2024).
[2] D. Li, C. Rodriguez, X. Yu, and H. Li, "Word-level deep sign language recognition from video: A new large-scale dataset and methods comparison," in Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), 2020, pp. 1459–1469.
[3] 教育部國民及學前教育署 [K-12 Education Administration, Ministry of Education], "學齡前2至6歲教保服務人員手語手冊" [Sign language handbook for educators and caregivers of preschool children aged 2 to 6]. [Online]. Available: https://www.ece.moe.edu.tw/ch/special_education/skill/skill_0002/ (visited on 06/11/2024).
[4] 李信賢, "國際手語(IS)是否為一種語言?" [Is International Sign (IS) a language?], 2019. [Online]. Available: https://taslifamily.org/?p=4826 (visited on 05/18/2024).
[5] E. Drasgow, "American Sign Language," Encyclopædia Britannica, 2024. [Online]. Available: https://www.britannica.com/topic/American-Sign-Language (visited on 05/20/2024).
[6] D. W. Vicars, "Gloss," Lifeprint. [Online]. Available: https://www.lifeprint.com/asl101/topics/gloss.htm (visited on 05/20/2024).
[7] 中華民國啟聰協會 [Deaf Association of the Republic of China], "台灣手語介紹及手語QA" [Introduction to Taiwan Sign Language and sign language Q&A]. [Online]. Available: https://www.deaf.org.tw/OnePage.aspx?mid=51&id=46 (visited on 05/20/2024).
[8] SignTube, "台灣手語南北差異1 TSL dialects (1)" [North–south differences in Taiwan Sign Language 1], YouTube, 2023. Accessed: 2024-06-02.
[9] T. N. Kipf and M. Welling, "Semi-supervised classification with graph convolutional networks," arXiv preprint arXiv:1609.02907, 2016.
[10] C. Lugaresi, J. Tang, H. Nash, et al., "MediaPipe: A framework for building perception pipelines," arXiv preprint arXiv:1906.08172, 2019.
[11] google-ai-edge, "MediaPipe Holistic," 2022. [Online]. Available: https://github.com/google-ai-edge/mediapipe/blob/master/docs/solutions/holistic.md (accessed: 2024-06-02).
[12] Z. Cao, G. Hidalgo, T. Simon, S. Wei, and Y. Sheikh, "OpenPose: Realtime multi-person 2D pose estimation using part affinity fields," CoRR, vol. abs/1812.08008, 2018. arXiv: 1812.08008. [Online]. Available: http://arxiv.org/abs/1812.08008.
[13] T. Jiang, P. Lu, L. Zhang, et al., "RTMPose: Real-time multi-person pose estimation based on MMPose," arXiv preprint arXiv:2303.07399, 2023.
[14] A. Sengupta, F. Jin, R. Zhang, and S. Cao, "mm-Pose: Real-time human skeletal posture estimation using mmWave radars and CNNs," IEEE Sensors Journal, vol. 20, no. 17, pp. 10032–10044, 2020.
[15] C. Li, P. Wang, S. Wang, Y. Hou, and W. Li, "Skeleton-based action recognition using LSTM and CNN," CoRR, vol. abs/1707.02356, 2017. arXiv: 1707.02356. [Online]. Available: http://arxiv.org/abs/1707.02356.
[16] S. Yan, Y. Xiong, and D. Lin, "Spatial temporal graph convolutional networks for skeleton-based action recognition," CoRR, vol. abs/1801.07455, 2018. arXiv: 1801.07455. [Online]. Available: http://arxiv.org/abs/1801.07455.
[17] L. Shi, Y. Zhang, J. Cheng, and H. Lu, "Adaptive spectral graph convolutional networks for skeleton-based action recognition," CoRR, vol. abs/1805.07694, 2018. arXiv: 1805.07694. [Online]. Available: http://arxiv.org/abs/1805.07694.
[18] Y. Chen, Z. Zhang, C. Yuan, B. Li, Y. Deng, and W. Hu, "Channel-wise topology refinement graph convolution for skeleton-based action recognition," CoRR, vol. abs/2107.12213, 2021. arXiv: 2107.12213. [Online]. Available: https://arxiv.org/abs/2107.12213.
[19] J. Carreira and A. Zisserman, "Quo vadis, action recognition? A new model and the Kinetics dataset," CoRR, vol. abs/1705.07750, 2017. arXiv: 1705.07750. [Online]. Available: http://arxiv.org/abs/1705.07750.
[20] S. Xie, C. Sun, J. Huang, Z. Tu, and K. Murphy, "Rethinking spatiotemporal feature learning for video understanding," CoRR, vol. abs/1712.04851, 2017. arXiv: 1712.04851. [Online]. Available: http://arxiv.org/abs/1712.04851.
[21] A. Tunga, S. V. Nuthalapati, and J. P. Wachs, "Pose-based sign language recognition using GCN and BERT," CoRR, vol. abs/2012.00781, 2020. arXiv: 2012.00781. [Online]. Available: https://arxiv.org/abs/2012.00781.
[22] J. Devlin, M. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," CoRR, vol. abs/1810.04805, 2018. arXiv: 1810.04805. [Online]. Available: http://arxiv.org/abs/1810.04805.
[23] M. Boháček and M. Hrúz, "Sign pose-based transformer for word-level sign language recognition," in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops, Jan. 2022, pp. 182–191.
[24] A. Vaswani, N. Shazeer, N. Parmar, et al., "Attention is all you need," CoRR, vol. abs/1706.03762, 2017. arXiv: 1706.03762. [Online]. Available: http://arxiv.org/abs/1706.03762.
[25] H. Hu, W. Zhao, W. Zhou, and H. Li, "SignBERT+: Hand-model-aware self-supervised pre-training for sign language understanding," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 9, pp. 11221–11239, Sep. 2023, ISSN: 1939-3539. DOI: 10.1109/TPAMI.2023.3269220. [Online]. Available: http://dx.doi.org/10.1109/TPAMI.2023.3269220.
[26] D. Laines, G. Bejarano, M. Gonzalez-Mendoza, and G. Ochoa-Ruiz, "Isolated sign language recognition based on tree structure skeleton images," 2023. arXiv: 2304.05403 [cs.CV].
[27] MMPose Contributors, "OpenMMLab pose estimation toolbox and benchmark," 2020. [Online]. Available: https://github.com/open-mmlab/mmpose (accessed: 2024-06-02).
[28] jin-s13, "COCO-WholeBody," 2020. [Online]. Available: https://github.com/jin-s13/COCO-WholeBody/ (visited on 06/02/2024).
[29] Z. Liu, H. Zhang, Z. Chen, Z. Wang, and W. Ouyang, "Disentangling and unifying graph convolutions for skeleton-based action recognition," CoRR, vol. abs/2003.14111, 2020. arXiv: 2003.14111. [Online]. Available: https://arxiv.org/abs/2003.14111.
[30] A. G. Howard, M. Zhu, B. Chen, et al., "MobileNets: Efficient convolutional neural networks for mobile vision applications," CoRR, vol. abs/1704.04861, 2017. arXiv: 1704.04861. [Online]. Available: http://arxiv.org/abs/1704.04861.
[31] S. Jiang, B. Sun, L. Wang, Y. Bai, K. Li, and Y. Fu, "Sign language recognition via skeleton-aware multi-model ensemble," CoRR, vol. abs/2110.06161, 2021. arXiv: 2110.06161. [Online]. Available: https://arxiv.org/abs/2110.06161.
[32] R. Zuo, F. Wei, and B. Mak, "Natural language-assisted sign language recognition," 2023. arXiv: 2303.12080 [cs.CV]. [Online]. Available: https://arxiv.org/abs/2303.12080.
[33] D. Li, X. Yu, C. Xu, L. Petersson, and H. Li, "Transferring cross-domain knowledge for video sign language recognition," CoRR, vol. abs/2003.03703, 2020. arXiv: 2003.03703. [Online]. Available: https://arxiv.org/abs/2003.03703.