[1] J. Redmon and A. Farhadi. YOLOv3: An incremental improvement. arXiv preprint arXiv:1804.02767, 2018.
[2] Z. Cao, T. Simon, S.-E. Wei, and Y. Sheikh. Realtime multi-person 2D pose estimation using part affinity fields. In CVPR, 2017.
[3] S. Zagoruyko and N. Komodakis. Wide residual networks. arXiv preprint arXiv:1605.07146, 2016.
[4] S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
[5] V. Renò, N. Mosca, M. Nitti, T. D'Orazio, C. Guaragnella, D. Campagnoli, A. Prati, and E. Stella. A technology platform for automatic high-level tennis game analysis. CVIU, 2017.
[6] F. Yoshikawa, T. Kobayashi, K. Watanabe, and N. Otsu. Automated service scene detection for badminton game analysis using CHLAC and MRA.
[7] G. Bertasius, H. S. Park, S. X. Yu, and J. Shi. Am I a baller? Basketball performance assessment from first-person videos. In ICCV, 2017.
[8] M. Sukhwani and C. Jawahar. TennisVid2Text: Fine-grained descriptions for domain specific videos. In BMVC, 2015.
[9] M. Sukhwani and C. Jawahar. Frame level annotations for tennis videos. In ICPR, 2016.
[10] D. Farin, S. Krabbe, P. H. N. de With, and W. Effelsberg. Robust camera calibration for sport videos using court models. In Storage and Retrieval Methods and Applications for Multimedia (SPIE), vol. 5307, pp. 80–91, 2004.
[11] J. Han, D. Farin, and P. H. N. de With. A real-time augmented-reality system for sports broadcast video enhancement. In Proc. ACM Multimedia, pp. 337–340, Augsburg, Germany, 2007.
[12] J. Puwein, R. Ziegler, L. Ballan, and M. Pollefeys. PTZ camera network calibration from moving people in sports broadcasts. In WACV, 2012.
[13] M. Stein, H. Janetzko, A. Lamprecht, T. Breitkreutz, P. Zimmermann, B. Goldlücke, T. Schreck, G. L. Andrienko, M. Grossniklaus, and D. A. Keim. Bring it to the pitch: Combining video and movement data to enhance team sport analysis. IEEE Transactions on Visualization and Computer Graphics, 24(1):13–22, 2018.
[14] P.-C. Wen, W.-C. Cheng, Y.-S. Wang, H.-K. Chu, N. C. Tang, and H.-Y. M. Liao. Court reconstruction for camera calibration in broadcast basketball videos. IEEE TVCG, 22(5):1517–1526, 2016.
[15] H. Ben Shitrit, J. Berclaz, F. Fleuret, and P. Fua. Tracking multiple people under global appearance constraints. In ICCV, pp. 137–144, 2011. doi:10.1109/ICCV.2011.6126235.
[16] R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In CVPR, 2014.
[17] R. Girshick. Fast R-CNN. arXiv preprint arXiv:1504.08083, 2015.
[18] J. Uijlings, K. van de Sande, T. Gevers, and A. Smeulders. Selective search for object recognition. IJCV, 2013.
[19] S. Ren, K. He, R. Girshick, and J. Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In NIPS, 2015.
[20] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi. You only look once: Unified, real-time object detection. arXiv preprint arXiv:1506.02640, 2015.
[21] J. Redmon and A. Farhadi. YOLO9000: Better, faster, stronger. In CVPR, pp. 6517–6525, 2017.
[22] M. S. Ibrahim, S. Muralidharan, Z. Deng, A. Vahdat, and G. Mori. A hierarchical deep temporal model for group activity recognition. In CVPR, 2016.
[23] W.-T. Chu and S. Situmeang. Badminton video analysis based on spatiotemporal and stroke features. In ICMR, 2017.
[24] A. Ghosh, S. Singh, and C. Jawahar. Towards structured analysis of broadcast badminton videos. In WACV, pp. 296–304, 2018.
[25] T. Hsu et al. CoachAI: A project for microscopic badminton match data collection and tactical analysis. In 2019 20th Asia-Pacific Network Operations and Management Symposium (APNOMS), Matsue, Japan, pp. 1–4, 2019. doi:10.23919/APNOMS.2019.8893039.
[26] Y.-C. Huang. TrackNet: Tennis ball tracking from broadcast video by deep learning networks. Master's thesis, National Chiao Tung University, Hsinchu City, Taiwan, April 2018. Advised by Chih-Wei Yi.
[27] I. Z. Yalniz, H. Jégou, K. Chen, M. Paluri, and D. Mahajan. Billion-scale semi-supervised learning for image classification. arXiv preprint arXiv:1905.00546, 2019.
[28] Q. Xie, E. Hovy, M.-T. Luong, and Q. V. Le. Self-training with noisy student improves ImageNet classification. arXiv preprint arXiv:1911.04252, 2019.
[29] D. Berthelot, N. Carlini, I. Goodfellow, N. Papernot, A. Oliver, and C. Raffel. MixMatch: A holistic approach to semi-supervised learning. arXiv preprint arXiv:1905.02249, 2019.
[30] H. Zhang, M. Cisse, Y. N. Dauphin, and D. Lopez-Paz. mixup: Beyond empirical risk minimization. arXiv preprint arXiv:1710.09412, 2017.
[31] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385, 2015.
[32] T. de Campos, B. Babu, and M. Varma. Character recognition in natural images. In VISAPP, Feb. 2009.
[33] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. Microsoft COCO: Common objects in context. In ECCV, 2014.
[34] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
[35] J. Gehring, M. Auli, D. Grangier, D. Yarats, and Y. N. Dauphin. Convolutional sequence to sequence learning. arXiv preprint arXiv:1705.03122, 2017.
[36] Q. Xie, Z. Dai, E. Hovy, M.-T. Luong, and Q. V. Le. Unsupervised data augmentation for consistency training. arXiv preprint arXiv:1904.12848, 2019.