[1] L. Turicchia and R. Sarpeshkar, "A bio-inspired companding strategy for spectral enhancement," IEEE Transactions on Speech and Audio Processing, vol. 13, no. 2, pp. 243–253, 2005.
[2] X. Huang, A. Acero, and H. Hon, Spoken Language Processing: A Guide to Theory, Algorithm, and System Development. Prentice Hall PTR, Upper Saddle River, NJ, USA, 2001.
[3] B. Gold and N. Morgan, Speech and Audio Signal Processing: Processing and Perception of Speech and Music. John Wiley & Sons, New York, NY, USA, 1999.
[4] S. Davis and P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 28, no. 4, pp. 357–366, 1980.
[5] H. Hermansky, "Perceptual linear predictive (PLP) analysis of speech," Journal of the Acoustical Society of America, vol. 87, no. 4, pp. 1738–1752, 1990.
[6] S. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 27, no. 2, pp. 113–120, 1979.
[7] B. S. Atal, "Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification," Journal of the Acoustical Society of America, pp. 1304–1312, 1974.
[8] C. Chen and J. Bilmes, "MVA processing of speech features," IEEE Transactions on Audio, Speech and Language Processing, vol. 15, no. 1, pp. 257–270, 2007.
[9] X. Xiao, E. Chng, and H. Li, "Normalization of the speech modulation spectra for robust speech recognition," IEEE Transactions on Audio, Speech, and Language Processing, vol. 16, no. 8, pp. 1662–1674, 2008.
[10] J. Droppo, L. Deng, and A. Acero, "Evaluation of the SPLICE algorithm on the Aurora2 database," in Seventh European Conference on Speech Communication and Technology, ISCA, 2001.
[11] M. Gales and S. Young, "Robust speech recognition in additive and convolutional noise using parallel model combination," Computer Speech & Language, vol. 9, no. 4, pp. 289–307, 1995.
[12] P. Moreno, B. Raj, and R. Stern, "A vector Taylor series approach for environment-independent speech recognition," in IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 2, 1996.
[13] C. Yang, F. Soong, and T. Lee, "Static and dynamic spectral features: Their noise robustness and optimal weights for ASR," in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), vol. 1, 2005.
[14] J. Hung and L. Lee, "Optimization of temporal filters for constructing robust features in speech recognition," IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, pp. 808–832, 2006.
[15] H. Hermansky and N. Morgan, "RASTA processing of speech," IEEE Transactions on Speech and Audio Processing, pp. 578–589, 1994.
[16] H. Hermansky and S. Sharma, "TRAPS - classifiers of temporal patterns," in Fifth International Conference on Spoken Language Processing, ISCA, 1998.
[17] C. Avendano, S. Vuuren, and H. Hermansky, "Data based filter design for RASTA-like channel normalization in ASR," in Fourth International Conference on Spoken Language Processing, ISCA, 1996.
[18] C. Avendano and H. Hermansky, "On the properties of temporal processing for speech in adverse environments," in IEEE ASSP Workshop on Applications of Signal Processing to Audio and Acoustics, Oct. 1997.
[19] D. Lee and R. Kil, "Auditory processing of speech signals for robust speech recognition in real-world noisy environments," IEEE Transactions on Speech and Audio Processing, vol. 7, no. 1, pp. 55–69, 1999.
[20] J. Kates, "A time-domain digital cochlear model," IEEE Transactions on Signal Processing, vol. 39, no. 12, pp. 2573–2592, 1991.
[21] S. Haque, R. Togneri, and A. Zaknich, "Perceptual features for automatic speech recognition in noisy environments," Speech Communication, vol. 51, no. 1, pp. 58–75, 2009.
[22] B. Strope and A. Alwan, "A model of dynamic auditory perception and its application to robust word recognition," IEEE Transactions on Speech and Audio Processing, vol. 5, no. 5, pp. 451–464, 1997.
[23] B. Moore, An Introduction to the Psychology of Hearing. Academic Press, 2003.
[24] A. Oxenham, "Forward masking: Adaptation or integration?," Journal of the Acoustical Society of America, vol. 109, p. 732, 2001.
[25] M. Holmberg, D. Gelbart, and W. Hemmert, "Automatic speech recognition with an adaptation model motivated by auditory processing," IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, no. 1, pp. 43–49, 2006.
[26] K. Park and S. Lee, "An engineering model of the masking for the noise-robust speech recognition," Neurocomputing, vol. 52–54, pp. 615–620, 2003.
[27] S. Vuuren and H. Hermansky, "Data-driven design of RASTA-like filters," in Fifth European Conference on Speech Communication and Technology, ISCA, 1997.
[28] S. Haykin, Adaptive Filter Theory. Prentice-Hall Information and System Sciences Series, Prentice-Hall, 1986.
[29] S. Haykin, "Adaptive filters," IEEE Signal Processing Magazine, 1999.
[30] J. Droppo, M. Mahajan, A. Gunawardana, and A. Acero, "How to train a discriminative front end with stochastic gradient descent and maximum mutual information," in Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2005.
[31] H. Hirsch and D. Pearce, "The AURORA experimental framework for the performance evaluation of speech recognition systems under noisy conditions," in ASR2000 - Automatic Speech Recognition: Challenges for the New Millennium, ISCA Tutorial and Research Workshop (ITRW), ISCA, 2000.
[32] A. Moreno, B. Lindberg, C. Draxler, G. Richard, K. Choukry, S. Euler, and J. Allen, "SPEECHDAT-CAR: A large speech database for automotive environments."
[33] N. Parihar and J. Picone, "DSR front end LVCSR evaluation AU/384/02," Aurora Working Group, European Telecommunications Standards Institute, 2002.