[1] J. L. Elman, "Finding structure in time," Cognitive Science, vol. 14, no. 2, pp. 179-211, 1990.
[2] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[3] K. Cho, B. van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, "Learning phrase representations using RNN encoder-decoder for statistical machine translation," arXiv preprint arXiv:1406.1078, 2014.
[4] Y. LeCun, Y. Bengio, et al., "Convolutional networks for images, speech, and time series," The Handbook of Brain Theory and Neural Networks, vol. 3361, no. 10, p. 1995, 1995.
[5] A. van den Oord, S. Dieleman, H. Zen, K. Simonyan, O. Vinyals, A. Graves, N. Kalchbrenner, A. Senior, and K. Kavukcuoglu, "WaveNet: A generative model for raw audio," arXiv preprint arXiv:1609.03499, 2016.
[6] S. Bai, J. Z. Kolter, and V. Koltun, "An empirical evaluation of generic convolutional and recurrent networks for sequence modeling," arXiv preprint arXiv:1803.01271, 2018.
[7] J. Chung, K. Kastner, L. Dinh, K. Goel, A. C. Courville, and Y. Bengio, "A recurrent latent variable model for sequential data," in Advances in Neural Information Processing Systems, pp. 2980-2988, 2015.
[8] E. Aksan and O. Hilliges, "STCN: Stochastic temporal convolutional networks," arXiv preprint arXiv:1902.06568, 2019.
[9] X. Glorot, A. Bordes, and Y. Bengio, "Deep sparse rectifier neural networks," in Proc. of the Fourteenth International Conference on Artificial Intelligence and Statistics, pp. 315-323, 2011.
[10] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning internal representations by error propagation," Technical Report, DTIC Document, 1985.
[11] H. Robbins and S. Monro, "A stochastic approximation method," The Annals of Mathematical Statistics, pp. 400-407, 1951.
[12] J. Kiefer, J. Wolfowitz, et al., "Stochastic estimation of the maximum of a regression function," The Annals of Mathematical Statistics, vol. 23, no. 3, pp. 462-466, 1952.
[13] N. Qian, "On the momentum term in gradient descent learning algorithms," Neural Networks, vol. 12, no. 1, pp. 145-151, 1999.
[14] R. J. Williams and D. Zipser, "A learning algorithm for continually running fully recurrent neural networks," Neural Computation, vol. 1, no. 2, pp. 270-280, 1989.
[15] T. Catfolis, "A method for improving the real-time recurrent learning algorithm," Neural Networks, vol. 6, no. 6, pp. 807-821, 1993.
[16] K. Funahashi and Y. Nakamura, "Approximation of dynamical systems by continuous time recurrent neural networks," Neural Networks, vol. 6, no. 6, pp. 801-806, 1993.
[17] R. J. Williams and D. Zipser, "Gradient-based learning algorithms for recurrent networks and their computational complexity," Back-Propagation: Theory, Architectures and Applications, pp. 433-486, 1995.
[18] M. Boden, "A guide to recurrent neural networks and backpropagation," 2001.
[19] Y. Bengio, P. Simard, and P. Frasconi, "Learning long-term dependencies with gradient descent is difficult," IEEE Transactions on Neural Networks, vol. 5, no. 2, pp. 157-166, 1994.
[20] F. A. Gers, N. N. Schraudolph, and J. Schmidhuber, "Learning precise timing with LSTM recurrent networks," The Journal of Machine Learning Research, vol. 3, pp. 115-143, 2003.
[21] A. Graves, A. Mohamed, and G. Hinton, "Speech recognition with deep recurrent neural networks," in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6645-6649, 2013.
[22] F. Yu and V. Koltun, "Multi-scale context aggregation by dilated convolutions," in Proc. of International Conference on Learning Representations, 2016.
[23] C.-Y. Liou, W.-C. Cheng, J.-W. Liou, and D.-R. Liou, "Autoencoder for words," Neurocomputing, vol. 139, pp. 84-96, 2014.
[24] J. Ngiam, A. Khosla, M. Kim, J. Nam, H. Lee, and A. Y. Ng, "Multimodal deep learning," in Proceedings of the 28th International Conference on Machine Learning (ICML-11), pp. 689-696, 2011.
[25] M. J. Beal, Variational Algorithms for Approximate Bayesian Inference. PhD thesis, University of London, 2003.
[26] M. J. Wainwright and M. I. Jordan, "Graphical models, exponential families, and variational inference," Foundations and Trends in Machine Learning, vol. 1, no. 1-2, pp. 1-305, 2007.
[27] D. P. Kingma and M. Welling, "Auto-encoding variational Bayes," in Proc. of International Conference on Learning Representations, 2014.
[28] D. J. Rezende, S. Mohamed, and D. Wierstra, "Stochastic backpropagation and approximate inference in deep generative models," in Proc. of The 31st International Conference on Machine Learning, pp. 1278-1286, 2014.
[29] C. K. Sønderby, T. Raiko, L. Maaløe, S. K. Sønderby, and O. Winther, "Ladder variational autoencoders," in Proc. of Advances in Neural Information Processing Systems, pp. 3738-3746, 2016.
[30] S. Chang, Y. Zhang, W. Han, M. Yu, X. Guo, W. Tan, X. Cui, M. Witbrock, M. A. Hasegawa-Johnson, and T. S. Huang, "Dilated recurrent neural networks," in Advances in Neural Information Processing Systems, pp. 77-87, 2017.
[31] T. N. Sainath, O. Vinyals, A. Senior, and H. Sak, "Convolutional, long short-term memory, fully connected deep neural networks," in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4580-4584, IEEE, 2015.
[32] G. Lai, B. Li, G. Zheng, and Y. Yang, "Stochastic WaveNet: A generative latent variable model for sequential data," arXiv preprint arXiv:1806.06116, 2018.
[33] M. Liang and X. Hu, "Recurrent convolutional neural network for object recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3367-3375, 2015.
[34] R. Cui, H. Liu, and C. Zhang, "Recurrent convolutional neural networks for continuous sign language recognition by staged optimization," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7361-7369, 2017.
[35] N. McLaughlin, J. Martinez del Rincon, and P. Miller, "Recurrent convolutional network for video-based person re-identification," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1325-1334, 2016.
[36] X. Wang, W. Jiang, and Z. Luo, "Combination of convolutional and recurrent neural network for sentiment analysis of short texts," in Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pp. 2428-2437, 2016.
[37] X. Zhang, F. Chen, and R. Huang, "A combination of RNN and CNN for attention-based relation classification," Procedia Computer Science, vol. 131, pp. 911-917, 2018.
[38] J. Koutnik, K. Greff, F. Gomez, and J. Schmidhuber, "A clockwork RNN," arXiv preprint arXiv:1402.3511, 2014.
[39] D. Neil, M. Pfeiffer, and S.-C. Liu, "Phased LSTM: Accelerating recurrent network training for long or event-based sequences," in Advances in Neural Information Processing Systems, pp. 3882-3890, 2016.
[40] J. Chung, S. Ahn, and Y. Bengio, "Hierarchical multiscale recurrent neural networks," arXiv preprint arXiv:1609.01704, 2016.
[41] F. Yu and V. Koltun, "Multi-scale context aggregation by dilated convolutions," arXiv preprint arXiv:1511.07122, 2015.
[42] G. Lai, W.-C. Chang, Y. Yang, and H. Liu, "Modeling long- and short-term temporal patterns with deep neural networks," in The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pp. 95-104, ACM, 2018.
[43] V. Campos, B. Jou, X. Giro-i Nieto, J. Torres, and S.-F. Chang, "Skip RNN: Learning to skip state updates in recurrent neural networks," arXiv preprint arXiv:1708.06834, 2017.
[44] T. Mikolov and G. Zweig, "Context dependent recurrent neural network language model," in IEEE Workshop on Spoken Language Technology, pp. 234-239, 2012.
[45] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi, "Inception-v4, Inception-ResNet and the impact of residual connections on learning," in Thirty-First AAAI Conference on Artificial Intelligence, 2017.
[46] K. Soomro, A. R. Zamir, and M. Shah, "UCF101: A dataset of 101 human actions classes from videos in the wild," arXiv preprint arXiv:1212.0402, 2012.
[47] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.