|
[1] J. Benesty, M. M. Sondhi, and Y. Huang, "Springer Handbook of Speech Processing," Springer Science & Business Media, 2007.
[2] P. C. Loizou, "Speech Enhancement: Theory and Practice," Boca Raton, FL, USA: CRC Press (Taylor & Francis Group), 2013.
[3] M. Weiss, E. Aschkenasy, and T. Parsons, "Study and development of the INTEL technique for improving speech intelligibility," Technical Report NSC-FR/4023, Nicolet Scientific Corporation, Northvale, NJ, 1974.
[4] S. F. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. on Acoustics, Speech and Signal Processing, pp. 113–120, 1979.
[5] N. Wiener, "Extrapolation, Interpolation and Smoothing of Stationary Time Series with Engineering Applications," Cambridge, MA: MIT Press, 1949.
[6] Y. Ephraim and D. Malah, "Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator," IEEE Trans. on Acoustics, Speech and Signal Processing, 32(6), pp. 1109–1121, 1984.
[7] Y. Ephraim and D. Malah, "Speech enhancement using a minimum mean-square error log-spectral amplitude estimator," IEEE Trans. on Acoustics, Speech and Signal Processing, 1985.
[8] R. J. McAulay and M. L. Malpass, "Speech enhancement using a soft-decision noise suppression filter," IEEE Trans. on Acoustics, Speech and Signal Processing, 1980.
[9] J. Lim and A. V. Oppenheim, "Enhancement and bandwidth compression of noisy speech," in Proc. IEEE, 1979.
[10] J. Lim and A. V. Oppenheim, "All-pole modeling of degraded speech," IEEE Trans. on Acoustics, Speech and Signal Processing, 1978.
[11] M. Dendrinos, S. Bakamides, and G. Carayannis, "Speech enhancement from noise: A regenerative approach," Speech Communication, 10, pp. 45–57, 1991.
[12] Y. Ephraim and H. L. Van Trees, "A signal subspace approach for speech enhancement," in Proc. ICASSP, 1993.
[13] G. Brown and M. Cooke, "Computational auditory scene analysis," Computer Speech and Language, 8, pp. 297–336, 1994.
[14] M. Weintraub, "A theory and computational model of auditory monaural sound separation," PhD thesis, Stanford University, Stanford, CA, 1985.
[15] M. Cooke, P. Green, L. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and uncertain acoustic data," Speech Communication, 34, pp. 267–285, 2001.
[16] N. Roman, D. Wang, and G. Brown, "Speech segregation based on sound localization," J. Acoust. Soc. Am., 114, pp. 2236–2252, 2003.
[17] G. Kim, Y. Lu, Y. Hu, and P. Loizou, "An algorithm that improves speech intelligibility in noise for normal-hearing listeners," J. Acoust. Soc. Am., 126(3), pp. 1486–1494, 2009.
[18] S. K. Mitra, "Digital Signal Processing: A Computer Based Approach," 4th ed., McGraw-Hill, 2011.
[19] I. Goodfellow, Y. Bengio, and A. Courville, "Deep Learning," MIT Press, 2016.
[20] P. Vincent, H. Larochelle, Y. Bengio, and P. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proc. ICML, 2008.
[21] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P. Manzagol, "Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion," J. Machine Learning Res., 11, 2010.
[22] X. Lu, Y. Tsao, S. Matsuda, and C. Hori, "Speech enhancement based on deep denoising autoencoder," in Proc. Interspeech, 2013.
[23] Y. Xu, J. Du, L-R. Dai, and C-H. Lee, "An experimental study on speech enhancement based on deep neural networks," IEEE Signal Processing Letters, pp. 65–68, 2014.
[24] Y. Xu, J. Du, L-R. Dai, and C-H. Lee, "A regression approach to speech enhancement based on deep neural networks," IEEE/ACM Trans. on Audio, Speech, and Language Processing, 23, pp. 7–19, 2015.
[25] C. K. Chui, "Biorthogonal wavelets," Wavelets: A Tutorial in Theory and Applications, pp. 123–152, 1992.
[26] C. Vonesch, T. Blu, and M. Unser, "Generalized Daubechies wavelet families," IEEE Trans. on Signal Processing, vol. 55, no. 9, pp. 4415–4429, 2007.
[27] N. Kanedera, T. Arai, H. Hermansky, and M. Pavel, "On the importance of various modulation frequencies for speech recognition," in Proc. Eurospeech, 1997.
[28] C. Chen and J. Bilmes, "MVA processing of speech features," IEEE Trans. on Audio, Speech, and Language Processing, pp. 257–270, 2006.
[29] X. Xiao, E. S. Chng, and H. Li, "Normalization of the speech modulation spectra for robust speech recognition," IEEE Trans. on Audio, Speech, and Language Processing, vol. 16, no. 8, pp. 1662–1674, 2008.
[30] S-K. Lee and J-W. Hung, "An evaluation study of using various SNR-level training data in the denoising auto encoder (DAE) technique for speech enhancement," International Journal of Electrical, Electronics and Data Communication (IJEEDC), 2017.
[31] L. L. Wong, S. D. Soli, S. Liu, N. Han, and M-W. Huang, "Development of the Mandarin hearing in noise test (MHINT)," Ear and Hearing, 28(2), pp. 70–74, 2007.
[32] A. W. Rix, J. G. Beerends, M. P. Hollier, and A. P. Hekstra, "Perceptual evaluation of speech quality (PESQ) – a new method for speech quality assessment of telephone networks and codecs," in Proc. ICASSP, pp. 749–752, 2001.
[33] E. Habets, "Room impulse response generator," https://github.com/ehabets/RIR-Generator.
[34] J. B. Allen and D. A. Berkley, "Image method for efficiently simulating small-room acoustics," J. Acoust. Soc. Am., 65(4), p. 943, 1979.
|