[1] R. Mithe, S. Indalkar, and N. Divekar, "Optical character recognition," Int. J. Recent Technol. Eng., vol. 2, no. 1, pp. 72–75, 2013.
[2] K. Jung, K. I. Kim, and A. K. Jain, "Text information extraction in images and video: A survey," Pattern Recognit., vol. 37, no. 5, pp. 977–997, May 2004, doi: 10.1016/j.patcog.2003.10.012.
[3] T. Young, D. Hazarika, S. Poria, and E. Cambria, "Recent trends in deep learning based natural language processing," IEEE Comput. Intell. Mag., vol. 13, no. 3, pp. 55–75, 2018.
[4] Y. Kumar and N. Singh, "A Comprehensive View of Automatic Speech Recognition System — A Systematic Literature Review," in 2019 International Conference on Automation, Computational and Technology Management (ICACTM), 2019, pp. 168–173.
[5] Z.-Q. Zhao, P. Zheng, S. Xu, and X. Wu, "Object detection with deep learning: A review," IEEE Trans. Neural Networks Learn. Syst., vol. 30, no. 11, pp. 3212–3232, 2019.
[6] Z. Raisi, M. A. Naiel, P. Fieguth, S. Wardell, and J. Zelek, "Text Detection and Recognition in the Wild: A Review," arXiv Prepr. arXiv:2006.04305, 2020.
[7] S. R. Narang, M. K. Jindal, and M. Kumar, "Ancient text recognition: A review," Artif. Intell. Rev., pp. 1–42, 2020.
[8] A. F. de Sousa Neto, B. L. D. Bezerra, and A. H. Toselli, "HTR-Flor++: A Handwritten Text Recognition System Based on a Pipeline of Optical and Language Models," 2020.
[9] U.-V. Marti and H. Bunke, "The IAM-database: An English sentence database for offline handwriting recognition," Int. J. Doc. Anal. Recognit., vol. 5, no. 1, pp. 39–46, 2002, doi: 10.1007/s100320200071.
[10] A. Vaswani et al., "Attention is all you need," Adv. Neural Inf. Process. Syst. 30, pp. 5998–6008, 2017.
[11] T. Bluche and R. Messina, "Gated Convolutional Recurrent Neural Networks for Multilingual Handwriting Recognition," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), Jul. 2017, vol. 1, pp. 646–651, doi: 10.1109/ICDAR.2017.111.
[12] J. Puigcerver, "Are Multidimensional Recurrent Layers Really Necessary for Handwritten Text Recognition?," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), Jul. 2017, vol. 1, pp. 67–72, doi: 10.1109/ICDAR.2017.20.
[13] Y. Chen et al., "Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution," 2019. [Online]. Available: http://arxiv.org/abs/1904.05049.
[14] J. Schmidhuber, "Deep learning in neural networks: An overview," Neural Networks, vol. 61, pp. 85–117, 2015, doi: 10.1016/j.neunet.2014.09.003.
[15] A. Mishra, K. Alahari, and C. V. Jawahar, "Scene text recognition using higher order language priors," 2012.
[16] T. Novikova, O. Barinova, P. Kohli, and V. Lempitsky, "Large-lexicon attribute-consistent text recognition in natural images," in European Conference on Computer Vision, 2012, pp. 752–765.
[17] S. Tian et al., "Multilingual scene character recognition with co-occurrence of histogram of oriented gradients," Pattern Recognit., vol. 51, pp. 125–134, 2016, doi: 10.1016/j.patcog.2015.07.009.
[18] A. Graves, M. Liwicki, S. Fernández, R. Bertolami, H. Bunke, and J. Schmidhuber, "A novel connectionist system for unconstrained handwriting recognition," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 5, pp. 855–868, 2009, doi: 10.1109/TPAMI.2008.137.
[19] V. S. Dhaka, M. Kumar, and P. Chaudhary, "Offline Handwritten English Script Recognition: A Survey," Computing.
[20] P. Natarajan, S. Saleem, R. Prasad, E. MacRostie, and K. Subramanian, "Multi-lingual Offline Handwriting Recognition Using Hidden Markov Models: A Script-Independent Approach," Arabic and Chinese Handwriting Recognition, pp. 231–250, 2008, doi: 10.1007/978-3-540-78199-8_14.
[21] W. S. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," Bull. Math. Biophys., vol. 5, no. 4, pp. 115–133, 1943.
[22] F. Rosenblatt, "The perceptron: A probabilistic model for information storage and organization in the brain," Psychol. Rev., vol. 65, no. 6, p. 386, 1958.
[23] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, no. 6088, pp. 533–536, 1986.
[24] Y. LeCun et al., "Backpropagation Applied to Handwritten Zip Code Recognition," Neural Comput., vol. 1, no. 4, pp. 541–551, Dec. 1989, doi: 10.1162/neco.1989.1.4.541.
[25] M. W. Gardner and S. R. Dorling, "Artificial neural networks (the multilayer perceptron) — a review of applications in the atmospheric sciences," Atmos. Environ., vol. 32, no. 14–15, pp. 2627–2636, 1998.
[26] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, vol. 86, no. 11, pp. 2278–2324, 1998, doi: 10.1109/5.726791.
[27] C. Szegedy et al., "Going deeper with convolutions," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2015, pp. 1–9, doi: 10.1109/CVPR.2015.7298594.
[28] O. Abdel-Hamid, A. Mohamed, H. Jiang, and G. Penn, "Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition," in 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012, pp. 4277–4280.
[29] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," Adv. Neural Inf. Process. Syst. 25, pp. 1097–1105, 2012.
[30] K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," arXiv Prepr. arXiv:1409.1556, Sep. 2014.
[31] K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," in CVPR, pp. 770–778, 2016, doi: 10.1109/CVPR.2016.90.
[32] X.-Y. Zhang, Y. Bengio, and C.-L. Liu, "Online and offline handwritten Chinese character recognition: A comprehensive study and new benchmark," Pattern Recognit., vol. 61, pp. 348–360, 2017, doi: 10.1016/j.patcog.2016.08.005.
[33] S. Kaur, S. Bawa, and R. Kumar, "A survey of mono- and multi-lingual character recognition using deep and shallow architectures: Indic and non-Indic scripts," Artif. Intell. Rev., 2019, doi: 10.1007/s10462-019-09720-9.
[34] Z. Tian, W. Huang, T. He, and Y. Qiao, "Detecting Text in Natural Image with Connectionist Text Proposal Network." Accessed: Dec. 14, 2019. [Online]. Available: http://textdet.com/.
[35] H. Li, P. Wang, M. You, and C. Shen, "Reading car license plates using deep neural networks," Image Vis. Comput., vol. 72, pp. 14–23, Apr. 2018, doi: 10.1016/j.imavis.2018.02.002.
[36] C.-Y. Lee and S. Osindero, "Recursive Recurrent Nets with Attention Modeling for OCR in the Wild," Mar. 2016, Accessed: Mar. 21, 2019. [Online]. Available: http://arxiv.org/abs/1603.03101.
[37] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning, 1st ed. MIT Press, 2016.
[38] S. Hochreiter and J. Schmidhuber, "Long Short-Term Memory," Neural Comput., vol. 9, no. 8, pp. 1735–1780, Nov. 1997, doi: 10.1162/neco.1997.9.8.1735.
[39] F. A. Gers, J. Schmidhuber, and F. Cummins, "Learning to forget: Continual prediction with LSTM," 1999.
[40] K. Cho et al., "Learning phrase representations using RNN encoder-decoder for statistical machine translation," arXiv Prepr. arXiv:1406.1078, 2014.
[41] A. Graves and J. Schmidhuber, "Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks," Adv. Neural Inf. Process. Syst. 21, pp. 545–552, 2008.
[42] M. Z. Alom et al., "The history began from AlexNet: A comprehensive survey on deep learning approaches," arXiv Prepr. arXiv:1803.01164, 2018.
[43] M. Chen, U. Challita, W. Saad, C. Yin, and M. Debbah, "Machine learning for wireless networks with artificial intelligence: A tutorial on neural networks," arXiv Prepr. arXiv:1710.02913, 2017.
[44] H. Robbins and S. Monro, "A stochastic approximation method," Ann. Math. Stat., pp. 400–407, 1951.
[45] J. C. Spall, "Adaptive stochastic approximation by the simultaneous perturbation method," IEEE Trans. Automat. Contr., vol. 45, no. 10, pp. 1839–1853, 2000.
[46] M. D. Zeiler, "ADADELTA: An Adaptive Learning Rate Method." Accessed: Dec. 27, 2018. [Online]. Available: https://arxiv.org/pdf/1212.5701.pdf.
[47] D. P. Kingma and J. Ba, "Adam: A Method for Stochastic Optimization," Dec. 2014, Accessed: Dec. 13, 2018. [Online]. Available: https://arxiv.org/abs/1412.6980.
[48] J. Lotman, The Structure of the Artistic Text, no. 7. University of Michigan/Michigan Slavic, 1977.
[49] W. W. Bledsoe and I. Browning, "Pattern recognition and reading by machine," in Papers Presented at the December 1–3, 1959, Eastern Joint IRE-AIEE-ACM Computer Conference, 1959, pp. 225–232.
[50] A. L. Koerich, R. Sabourin, and C. Y. Suen, "Large vocabulary off-line handwriting recognition: A survey," Pattern Anal. Appl., vol. 6, no. 2, pp. 97–121, 2003.
[51] Y. Assabie and J. Bigun, "Offline handwritten Amharic word recognition," Pattern Recognit. Lett., vol. 32, no. 8, pp. 1089–1099, 2011, doi: 10.1016/j.patrec.2011.02.007.
[52] J. Cowell and F. Hussain, "Amharic character recognition using a fast signature based algorithm," in Proceedings of the International Conference on Information Visualisation, 2003, pp. 384–389, doi: 10.1109/IV.2003.1218014.
[53] A. T. Birhanu and R. Sethuraman, "Artificial neural network approach to the development of OCR for real life Amharic documents," Int. J. Sci. Eng. Technol. Res., vol. 4, no. 1, pp. 141–147, 2015.
[54] M. Meshesha and C. V. Jawahar, "Optical character recognition of Amharic documents," African J. Inf. Commun. Technol., vol. 3, no. 2, 2007.
[55] Y. Assabie and J. Bigun, "Writer-independent offline recognition of handwritten Ethiopic characters," Proc. 11th ICFHR, pp. 652–657, 2008. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.161.3715&rep=rep1&type=pdf.
[56] B. Gatos, I. Pratikakis, K. Kepene, and S. J. Perantonis, "Text detection in indoor/outdoor scene images," in Proc. 1st Int. Workshop on Camera-Based Document Analysis and Recognition (CBDAR), 2005, pp. 127–132.
[57] X. Wang, Y. Song, Y. Zhang, and J. Xin, "Natural scene text detection with multi-layer segmentation and higher order conditional random field based analysis," Pattern Recognit. Lett., vol. 60, pp. 41–47, 2015.
[58] J. J. Lee, P. H. Lee, S. W. Lee, A. Yuille, and C. Koch, "AdaBoost for text detection in natural scene," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), Sep. 2011, pp. 429–434, doi: 10.1109/ICDAR.2011.93.
[59] Q. Ye and D. Doermann, "Text detection and recognition in imagery: A survey," IEEE Trans. Pattern Anal. Mach. Intell., vol. 37, no. 7, pp. 1480–1500, 2014.
[60] J. Liang, D. Doermann, and H. Li, "Camera-based analysis of text and documents: A survey," Int. J. Doc. Anal. Recognit., vol. 7, no. 2–3, pp. 84–104, Jul. 2005, doi: 10.1007/s10032-004-0138-z.
[61] C. Yi, Y. Tian, Y. Zhu, C. Yao, and X. Bai, "Text string detection from natural scenes by structure-based partition and grouping," IEEE Trans. Image Process., vol. 10, no. 1, pp. 2594–2605, 2016.
[62] B. Epshtein, E. Ofek, and Y. Wexler, "Detecting text in natural scenes with stroke width transform," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Jun. 2010, pp. 2963–2970, doi: 10.1109/CVPR.2010.5540041.
[63] J. Matas, O. Chum, M. Urban, and T. Pajdla, "Robust wide-baseline stereo from maximally stable extremal regions," Image Vis. Comput., vol. 22, no. 10, pp. 761–767, Sep. 2004, doi: 10.1016/j.imavis.2004.02.006.
[64] A. Mosleh, N. Bouguila, and A. Ben Hamza, "Image Text Detection Using a Bandlet-Based Edge Detector and Stroke Width Transform," in BMVC, 2012, pp. 1–12.
[65] C. Yao, X. Bai, W. Liu, Y. Ma, and Z. Tu, "Detecting texts of arbitrary orientations in natural images," in 2012 IEEE Conference on Computer Vision and Pattern Recognition, Jun. 2012, pp. 1083–1090, doi: 10.1109/CVPR.2012.6247787.
[66] H. I. Koo and D. H. Kim, "Scene text detection via connected component clustering and nontext filtering," IEEE Trans. Image Process., vol. 22, no. 6, pp. 2296–2305, 2013.
[67] Q. Ye and D. Doermann, "Scene text detection via integrated discrimination of component appearance and consensus," in International Workshop on Camera-Based Document Analysis and Recognition, 2013, pp. 47–59.
[68] W. Huang, Y. Qiao, and X. Tang, "Robust scene text detection with convolutional neural network induced MSER trees," in European Conference on Computer Vision, 2014, pp. 497–511.
[69] B. Shi, X. Bai, and S. Belongie, "Detecting Oriented Text in Natural Images by Linking Segments," Mar. 2017, Accessed: Feb. 10, 2019. [Online]. Available: http://arxiv.org/abs/1703.06520.
[70] Y. Liu and L. Jin, "Deep matching prior network: Toward tighter multi-oriented text detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1962–1969.
[71] D. He et al., "Multi-scale FCN with Cascaded Instance Aware Segmentation for Arbitrary Oriented Word Spotting in the Wild," in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jul. 2017, pp. 474–483, doi: 10.1109/CVPR.2017.58.
[72] Z. Zhang, C. Zhang, W. Shen, C. Yao, W. Liu, and X. Bai, "Multi-Oriented Text Detection with Fully Convolutional Networks," Apr. 2016, Accessed: Mar. 31, 2019. [Online]. Available: http://arxiv.org/abs/1604.04018.
[73] S. Long, J. Ruan, W. Zhang, X. He, W. Wu, and C. Yao, "TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes," Jul. 2018, Accessed: Feb. 10, 2019. [Online]. Available: http://arxiv.org/abs/1807.01544.
[74] X. Zhou et al., "EAST: An Efficient and Accurate Scene Text Detector," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 2642–2651.
[75] W. He, X.-Y. Zhang, F. Yin, and C.-L. Liu, "Deep Direct Regression for Multi-Oriented Scene Text Detection," pp. 745–753, 2017, Accessed: Mar. 31, 2019. [Online]. Available: http://openaccess.thecvf.com/content_iccv_2017/html/He_Deep_Direct_Regression_ICCV_2017_paper.html.
[76] P. Lyu, C. Yao, W. Wu, S. Yan, and X. Bai, "Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation," Feb. 2018, Accessed: Feb. 10, 2019. [Online]. Available: http://arxiv.org/abs/1802.08948.
[77] S. Zhu and R. Zanibbi, "A text detection system for natural scenes with convolutional feature learning and cascaded classification," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016. Accessed: Apr. 11, 2019. [Online]. Available: https://www.cv-foundation.org/openaccess/content_cvpr_2016/html/Zhu_A_Text_Detection_CVPR_2016_paper.html.
[78] X. Liu, G. Meng, and C. Pan, "Scene text detection and recognition with advances in deep learning: A survey," Int. J. Doc. Anal. Recognit., vol. 22, no. 2, pp. 143–162, 2019.
[79] X. Bai, C. Yao, and W. Liu, "Strokelets: A learned multi-scale mid-level representation for scene text recognition," IEEE Trans. Image Process., vol. 25, no. 6, pp. 2789–2802, 2016.
[80] C.-Y. Lee, A. Bhardwaj, W. Di, V. Jagadeesh, and R. Piramuthu, "Region-based discriminative feature pooling for scene text recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 4050–4057.
[81] J. L. Feild and E. G. Learned-Miller, "Improving open-vocabulary scene text recognition," in 2013 12th International Conference on Document Analysis and Recognition, 2013, pp. 604–608.
[82] V. Goel, A. Mishra, K. Alahari, and C. V. Jawahar, "Whole is greater than sum of parts: Recognizing scene text words," in 2013 12th International Conference on Document Analysis and Recognition, 2013, pp. 398–402.
[83] K. Wang, B. Babenko, and S. Belongie, "End-to-end scene text recognition," in Proceedings of the IEEE International Conference on Computer Vision, Nov. 2011, pp. 1457–1464, doi: 10.1109/ICCV.2011.6126402.
[84] C. Shi, C. Wang, B. Xiao, Y. Zhang, and S. Gao, "Scene text detection using graph model built upon maximally stable extremal regions," Pattern Recognit. Lett., vol. 34, no. 2, pp. 107–116, 2013.
[85] B. Shi, X. Bai, and C. Yao, "An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition," IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 11, pp. 2298–2304, 2017, doi: 10.1109/TPAMI.2016.2646371.
[86] B. Shi, X. Wang, P. Lyu, C. Yao, and X. Bai, "Robust Scene Text Recognition with Automatic Rectification," Mar. 2016, Accessed: Mar. 21, 2019. [Online]. Available: http://arxiv.org/abs/1603.03915.
[87] W. Liu, C. Chen, K.-Y. Wong, Z. Su, and J. Han, "STAR-Net: A SpaTial Attention Residue Network for Scene Text Recognition," in Proceedings of the British Machine Vision Conference 2016, 2016, pp. 43.1–43.13, doi: 10.5244/C.30.43.
[88] M. Jaderberg, K. Simonyan, A. Vedaldi, and A. Zisserman, "Reading Text in the Wild with Convolutional Neural Networks," Int. J. Comput. Vis., vol. 116, no. 1, pp. 1–20, 2016, doi: 10.1007/s11263-015-0823-z.
[89] Z. Cheng, F. Bai, Y. Xu, G. Zheng, S. Pu, and S. Zhou, "Focusing attention: Towards accurate text recognition in natural images," in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 5076–5084.
[90] M. Liao, B. Shi, and X. Bai, "TextBoxes++: A Single-Shot Oriented Scene Text Detector," Jan. 2018, doi: 10.1109/TIP.2018.2825107.
[91] C. L. Zitnick and P. Dollár, "Edge boxes: Locating object proposals from edges," Lect. Notes Comput. Sci., vol. 8693, pp. 391–405, 2014, doi: 10.1007/978-3-319-10602-1_26.
[92] P. Lyu, M. Liao, C. Yao, W. Wu, and X. Bai, "Mask TextSpotter: An end-to-end trainable neural network for spotting text with arbitrary shapes," Lect. Notes Comput. Sci., vol. 11218, pp. 71–88, 2018, doi: 10.1007/978-3-030-01264-9_5.
[93] T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, "Feature pyramid networks for object detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017. Accessed: Apr. 11, 2019. [Online]. Available: http://openaccess.thecvf.com/content_cvpr_2017/html/Lin_Feature_Pyramid_Networks_CVPR_2017_paper.html.
[94] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," Jun. 2015, Accessed: Apr. 02, 2019. [Online]. Available: http://arxiv.org/abs/1506.01497.
[95] M. Liao, B. Shi, X. Bai, X. Wang, and W. Liu, "TextBoxes: A Fast Text Detector with a Single Deep Neural Network," Nov. 2016, Accessed: Apr. 02, 2019. [Online]. Available: http://arxiv.org/abs/1611.06779.
[96] T. He, Z. Tian, W. Huang, C. Shen, Y. Qiao, and C. Sun, "An End-to-End TextSpotter with Explicit Alignment and Attention," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2018, pp. 5020–5029, doi: 10.1109/CVPR.2018.00527.
[97] X. Liu, D. Liang, S. Yan, D. Chen, Y. Qiao, and J. Yan, "FOTS: Fast oriented text spotting with a unified network," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018. Accessed: Apr. 11, 2019. [Online]. Available: http://openaccess.thecvf.com/content_cvpr_2018/html/Liu_FOTS_Fast_Oriented_CVPR_2018_paper.html.
[98] D. Povey, H. Hadian, P. Ghahremani, K. Li, and S. Khudanpur, "A time-restricted self-attention layer for ASR," in ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing, 2018, pp. 5874–5878, doi: 10.1109/ICASSP.2018.8462497.
[99] B. S. Mendisu and C. B. Efforts, "The Ethiopic Script: Linguistic Features and Socio-cultural Connotations," Oslo Stud. Lang., vol. 8, no. 1, pp. 137–172, 2017.
[100] C.-L. Liu, F. Yin, D.-H. Wang, and Q.-F. Wang, "Online and offline handwritten Chinese character recognition: Benchmarking on new databases," Pattern Recognit., vol. 46, no. 1, pp. 155–162, Jan. 2013, doi: 10.1016/j.patcog.2012.06.021.
[101] S. España-Boquera, M. J. Castro-Bleda, J. Gorbe-Moya, and F. Zamora-Martinez, "Improving Offline Handwritten Text Recognition with Hybrid HMM/ANN Models," IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 4, pp. 767–779, Apr. 2011, doi: 10.1109/TPAMI.2010.141.
[102] Y. Aiquan, B. Gang, J. Lijing, and L. Yajie, "Offline handwritten English character recognition based on convolutional neural network," in 2012 10th IAPR International Workshop on Document Analysis Systems, 2012, pp. 125–129, doi: 10.1109/DAS.2012.61.
[103] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proc. 25th International Conference on Machine Learning (ICML '08), 2008, pp. 1096–1103, doi: 10.1145/1390156.1390294.
[104] G. E. Hinton, S. Osindero, and Y.-W. Teh, "A Fast Learning Algorithm for Deep Belief Nets," Neural Comput., vol. 18, no. 7, pp. 1527–1554, 2006, doi: 10.1162/neco.2006.18.7.1527.
[105] A. El-Sawy, M. Loey, and H. El-Bakry, "Arabic Handwritten Characters Recognition using Convolutional Neural Network," vol. 5, no. 1, pp. 11–19, 2017.
[106] K. Dutta, P. Krishnan, M. Mathew, and C. V. Jawahar, "Offline handwriting recognition on Devanagari using a new benchmark dataset," in Proceedings of the 13th IAPR International Workshop on Document Analysis Systems (DAS 2018), Jun. 2018, pp. 25–30, doi: 10.1109/DAS.2018.69.
[107] Y. Zhang, "Deep Convolutional Network for Handwritten Chinese Character Recognition," Comput. Sci. Dep., Stanford Univ., 2015. [Online]. Available: https://pdfs.semanticscholar.org/4941/aed85462968e9918110b4ba740c56030fd23.pdf.
[108] C. Viard-Gaudin, P. M. Lallican, S. Knerr, and P. Binter, "The IRESTE On/Off (IRONOFF) dual handwriting database," in Proceedings of the Fifth International Conference on Document Analysis and Recognition (ICDAR '99), 1999, pp. 455–458, doi: 10.1109/ICDAR.1999.791823.
[109] S. Sun, Z. Cao, H. Zhu, and J. Zhao, "A survey of optimization methods from a machine learning perspective," IEEE Trans. Cybern., 2019.
[110] J. Duchi, E. Hazan, and Y. Singer, "Adaptive subgradient methods for online learning and stochastic optimization," J. Mach. Learn. Res., vol. 12, no. 7, 2011.
[111] M. Matsugu, K. Mori, Y. Mitari, and Y. Kaneda, "Subject independent facial expression recognition with robust face detection using a convolutional neural network," Neural Networks, vol. 16, no. 5–6, pp. 555–559, Jun. 2003, doi: 10.1016/S0893-6080(03)00115-1.
[112] F. S. Panchal and M. Panchal, "Review on Methods of Selecting Number of Hidden Nodes in Artificial Neural Network," Int. J. Comput. Sci. Mob. Comput., vol. 3, no. 11, pp. 455–464, 2014. [Online]. Available: http://www.ijcsmc.com/docs/papers/November2014/V3I11201499a19.pdf.
[113] X.-Y. Zhang, Y. Bengio, and C.-L. Liu, "Online and Offline Handwritten Chinese Character Recognition: A Comprehensive Study and New Benchmark," pp. 1–21, 2016.
[114] U.-V. Marti and H. Bunke, "The IAM-database: An English sentence database for offline handwriting recognition," Int. J. Doc. Anal. Recognit., vol. 5, no. 1, pp. 39–46, Nov. 2002, doi: 10.1007/s100320200071.
[115] T. Tieleman and G. Hinton, "Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude," COURSERA: Neural Networks for Machine Learning, 2012.
[116] K. He, X. Zhang, S. Ren, and J. Sun, "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification." Accessed: Nov. 01, 2020. [Online]. Available: https://www.cv-foundation.org/openaccess/content_iccv_2015/html/He_Delving_Deep_into_ICCV_2015_paper.html.
[117] S. Ioffe, "Batch Renormalization: Towards reducing minibatch dependence in batch-normalized models," Adv. Neural Inf. Process. Syst., pp. 1946–1954, 2017.
[118] A. Dutta and A. Zisserman, "The VIA Annotation Software for Images, Audio and Video," in Proceedings of the 27th ACM International Conference on Multimedia, 2019, doi: 10.1145/3343031.3350535.
[119] T. M. Breuel, "The OCRopus open source OCR system," in Document Recognition and Retrieval XV, Jan. 2008, vol. 6815, p. 68150F, doi: 10.1117/12.783598.
[120] L. Neumann and J. Matas, "Scene text localization and recognition with oriented stroke detection," in Proceedings of the IEEE International Conference on Computer Vision, 2013. Accessed: Apr. 11, 2019. [Online]. Available: http://openaccess.thecvf.com/content_iccv_2013/html/Neumann_Scene_Text_Localization_2013_ICCV_paper.html.
[121] L. Neumann and J. Matas, "A method for text localization and recognition in real-world images," in Lect. Notes Comput. Sci., vol. 6494, Springer, Berlin, Heidelberg, 2011, pp. 770–783.
[122] J. Ma et al., "Arbitrary-Oriented Scene Text Detection via Rotation Proposals," Mar. 2017, doi: 10.1109/TMM.2018.2818020.
[123] B. Su and S. Lu, "Accurate Scene Text Recognition Based on Recurrent Neural Network," 2015, pp. 35–48.
[124] M. Jaderberg, K. Simonyan, A. Vedaldi, and A. Zisserman, "Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition," Jun. 2014, Accessed: Mar. 02, 2019. [Online]. Available: http://arxiv.org/abs/1406.2227.
[125] A. Graves, S. Fernández, F. Gomez, and J. Schmidhuber, "Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks," in Proceedings of the 23rd International Conference on Machine Learning (ICML '06), 2006, pp. 369–376, doi: 10.1145/1143844.1143891.
[126] M. Buta, L. Neumann, and J. Matas, "FASText: Efficient unconstrained scene text detector," in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1206–1214, doi: 10.1109/ICCV.2015.143.
[127] C. Yao, X. Bai, N. Sang, X. Zhou, S. Zhou, and Z. Cao, "Scene Text Detection via Holistic, Multi-Channel Prediction," Jun. 2016, Accessed: Mar. 31, 2019. [Online]. Available: http://arxiv.org/abs/1606.09002.
[128] D. Deng, H. Liu, X. Li, and D. Cai, "PixelLink: Detecting Scene Text via Instance Segmentation," Jan. 2018, Accessed: Feb. 10, 2019. [Online]. Available: http://arxiv.org/abs/1801.01315.
[129] P. He, W. Huang, Y. Qiao, C. C. Loy, and X. Tang, "Reading Scene Text in Deep Convolutional Sequences," Jun. 2015, Accessed: Apr. 02, 2019. [Online]. Available: https://arxiv.org/abs/1506.04395.
[130] H. Li, P. Wang, and C. Shen, "Towards end-to-end text spotting with convolutional recurrent neural networks," in Proceedings of the IEEE International Conference on Computer Vision, 2017. Accessed: Apr. 11, 2019. [Online]. Available: http://openaccess.thecvf.com/content_iccv_2017/html/Li_Towards_End-To-End_Text_ICCV_2017_paper.html.
[131] M. Busta, L. Neumann, and J. Matas, "Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework," in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 2223–2231, doi: 10.1109/ICCV.2017.242.
[132] A. Gupta, A. Vedaldi, and A. Zisserman, "Synthetic Data for Text Localisation in Natural Images," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2016, pp. 2315–2324, doi: 10.1109/CVPR.2016.254.
[133] S. Zagoruyko and N. Komodakis, "Wide Residual Networks," 2016, doi: 10.5244/C.30.87.
[134] S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He, "Aggregated residual transformations for deep neural networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017. Accessed: Dec. 14, 2019. [Online]. Available: http://openaccess.thecvf.com/content_cvpr_2017/html/Xie_Aggregated_Residual_Transformations_CVPR_2017_paper.html.
[135] X. Zhang, X. Zhou, M. Lin, and J. Sun, "ShuffleNet: An extremely efficient convolutional neural network for mobile devices," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018. Accessed: Dec. 14, 2019. [Online]. Available: http://openaccess.thecvf.com/content_cvpr_2018/html/Zhang_ShuffleNet_An_Extremely_CVPR_2018_paper.html.
[136] F. Chollet, "Xception: Deep learning with depthwise separable convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017. Accessed: Dec. 14, 2019. [Online]. Available: http://openaccess.thecvf.com/content_cvpr_2017/html/Chollet_Xception_Deep_Learning_CVPR_2017_paper.html.
[137] A. G. Howard et al., "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications," arXiv Prepr. arXiv:1704.04861, 2017.
[138] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," in Proceedings of the IEEE International Conference on Computer Vision, Dec. 2017, pp. 2980–2988, doi: 10.1109/ICCV.2017.322.
[139] N. Bodla, B. Singh, R. Chellappa, and L. S. Davis, "Soft-NMS — Improving Object Detection with One Line of Code," in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 5562–5570, doi: 10.1109/ICCV.2017.593.
[140] L. Wu, T. Li, L. Wang, and Y. Yan, "Improving hybrid CTC/Attention architecture with time-restricted self-attention CTC for end-to-end speech recognition," Appl. Sci., vol. 9, no. 21, pp. 1–14, 2019, doi: 10.3390/app9214639.
[141] D. Karatzas et al., "ICDAR 2013 robust reading competition," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), Aug. 2013, pp. 1484–1493, doi: 10.1109/ICDAR.2013.221.
[142] D. Karatzas et al., "ICDAR 2015 competition on Robust Reading," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), Aug. 2015, pp. 1156–1160, doi: 10.1109/ICDAR.2015.7333942.
[143] C. K. Ch'ng and C. S. Chan, "Total-Text: A Comprehensive Dataset for Scene Text Detection and Recognition," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), Oct. 2017, vol. 1, pp. 935–942, doi: 10.1109/ICDAR.2017.157.
[144] L. Gómez and D. Karatzas, "TextProposals: A text-specific selective search algorithm for word spotting in the wild," Pattern Recognit., vol. 70, pp. 60–74, 2017, doi: 10.1016/j.patcog.2017.04.027.
[145] M. Busta, L. Neumann, and J. Matas, "Deep TextSpotter: An end-to-end trainable scene text localization and recognition framework," in Proceedings of the IEEE International Conference on Computer Vision, 2017. Accessed: Dec. 16, 2019. [Online]. Available: http://openaccess.thecvf.com/content_iccv_2017/html/Busta_Deep_TextSpotter_An_ICCV_2017_paper.html.
[146] M. Bušta, Y. Patel, and J. Matas, "E2E-MLT — an Unconstrained End-to-End Method for Multi-Language Scene Text," Jan. 2018, Accessed: Apr. 11, 2019. [Online]. Available: http://arxiv.org/abs/1801.09919.
[147] W. Wang et al., "Shape Robust Text Detection with Progressive Scale Expansion Network," Mar. 2019, Accessed: Jan. 06, 2020. [Online]. Available: