|
[1] L. Rabiner and B. H. Juang, Fundamentals of Speech Recognition. Englewood Cliffs, NJ: Prentice-Hall, 1993.
[2] L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proc. IEEE, vol. 77, pp. 257-286, Feb. 1989.
[3] T. K. Vintsyuk, “Element-wise recognition of continuous speech consisting of words from a specified vocabulary,” Kibernetika (Cybernetics), vol. 7, no. 2, pp. 133-143, March-April 1971.
[4] J. S. Bridle, M. D. Brown, and R. M. Chamberlain, “An algorithm for connected word recognition,” in Proc. Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), Paris, pp. 899-902, May 1982.
[5] J. S. Bridle, M. D. Brown, and R. M. Chamberlain, “Continuous connected word recognition using whole word templates,” The Radio and Electronic Engineer, vol. 53, no. 4, pp. 167-175, April 1983.
[6] H. Ney, “The use of a one-stage dynamic programming algorithm for connected word recognition,” IEEE Trans. Acoustics, Speech, Signal Processing, vol. ASSP-32, no. 2, pp. 263-271, April 1984.
[7] C. H. Lee and L. R. Rabiner, “A frame-synchronous network search algorithm for connected word recognition,” IEEE Trans. Acoustics, Speech, Signal Processing, vol. 37, no. 11, pp. 1649-1658, November 1989.
[8] D. Burshtein, “Robust parametric modeling of durations in hidden Markov models,” IEEE Trans. Speech and Audio Processing, vol. 4, no. 3, pp. 240-242, May 1996.
[9] S. Ramachandrula and S. Thippur, “Connected phoneme HMMs with implicit duration modelling for better speech recognition,” in Proceedings of 1997 International Conference on Information, Communications and Signal Processing (ICICS), vol. 2, pp. 1024-1028, 1997.
[10] P. Ramesh and J. G. Wilpon, “Modeling state durations in hidden Markov models for automatic speech recognition,” in Proc. Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), vol. 1, pp. 381-384, 1992.
[11] B. Logan and P. Moreno, “Factorial HMMs for acoustic modeling,” in Proc. Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), vol. 2, pp. 813-816, 1998.
[12] Z. Ghahramani and M. Jordan, “Factorial hidden Markov models,” Computational Cognitive Science Technical Report 9502, July 1996.
[13] M. Brand, “Coupled hidden Markov models for modeling interacting processes,” MIT Media Lab Perceptual Computing/Learning and Common Sense Technical Report 405, June 1997.
[14] T. Hazen, “The use of speaker correlation information for automatic speech recognition,” Ph.D. Diss., Mass. Inst. Technol., Cambridge, Jan. 1998.
[15] C. H. Lee, C. H. Lin, and B. H. Juang, “A study on speaker adaptation of the parameters of continuous density hidden Markov models,” IEEE Trans. Signal Processing, vol. 39, pp. 806-814, 1991.
[16] Y. Zhao, “An acoustic-phonetic-based speaker adaptation technique for improving speaker-independent continuous speech recognition,” IEEE Trans. Speech and Audio Processing, vol. 2, no. 3, pp. 380-394, July 1994.
[17] A. Sankar and C. H. Lee, “A maximum-likelihood approach to stochastic matching for robust speech recognition,” IEEE Trans. Speech and Audio Processing, vol. 4, no. 3, pp. 190-202, May 1996.
[18] M. Bilginer Gülmezoğlu, Vakif Dzhafarov, Mustafa Keskin, and Ataiay Barkana, “A novel approach to isolated word recognition,” IEEE Trans. Speech and Audio Processing, vol. 7, no. 6, pp. 620-628, Nov. 1999.
[19] M. Bilginer Gülmezoğlu, Vakif Dzhafarov, and Ataiay Barkana, “The common vector approach and its relation to principal component analysis,” IEEE Trans. Speech and Audio Processing, vol. 9, no. 6, pp. 655-662, Nov. 2001.
[20] H. Y. Gu, C. Y. Tseng, and L. S. Lee, “Isolated-utterance speech recognition using hidden Markov models with bounded state durations,” IEEE Trans. Signal Processing, vol. 39, no. 8, pp. 1743-1752, Aug. 1991.
[21] C. H. Edwards and D. E. Penney, Elementary Linear Algebra. Englewood Cliffs, NJ: Prentice-Hall, 1988.
[22] L. Knockaert, “An order-recursive algorithm for estimating pole-zero models,” IEEE Trans. Acoustics, Speech, Signal Processing, vol. ASSP-35, pp. 154-157, Feb. 1987.
[23] S. Haykin, Neural Networks: A Comprehensive Foundation. Macmillan College Publishing Company, Inc., 1994, pp. 363-370.
[24] D. F. Morrison, Multivariate Statistical Methods. NY: McGraw-Hill, 1967, pp. 156-195.
[25] A. Dempster, N. Laird, and D. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” J. Royal Statist. Soc., vol. 39, pp. 1-38, 1977.
[26] B. H. Juang, “Maximum-likelihood estimation for mixture multivariate stochastic observations of Markov chains,” AT&T Tech. J., vol. 64, no. 6, pp. 1235-1249, 1985.
[27] L. Baum, T. Petrie, G. Soules, and N. Weiss, “A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains,” Ann. Math. Statist., vol. 41, no. 1, pp. 164-171, 1970.
[28] L. R. Liporace, “Maximum likelihood estimation for multivariate observations of Markov sources,” IEEE Trans. Inform. Theory, vol. IT-28, pp. 729-734, September 1982.
[29] L. Deng, M. Lennig, F. Seitz, and P. Mermelstein, “Large vocabulary word recognition using context-dependent allophonic hidden Markov models,” Computer Speech and Language, vol. 4, no. 4, pp. 345-357, December 1990.
[30] S. J. Young, “Large vocabulary continuous speech recognition: a review,” in Proc. IEEE Workshop on Automatic Speech Recognition, Snowbird, Utah, pp. 3-28, 1995.
[31] C. Dugast, R. Kneser, X. Aubert, S. Ortmanns, K. Beulen, and H. Ney, “Continuous Speech Recognition Tests and Results for the NAB'94 Corpus,” Proc. ARPA Spoken Language Technology Workshop, Austin, TX, pp. 156-161, January 1995.
[32] S. J. Young and P. C. Woodland, “The Use of State Tying in Continuous Speech Recognition,” Proc. Europ. Conf. on Speech Communication and Technology, Berlin, pp. 2203-2206, September 1993.
[33] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees, The Wadsworth Statistics/Probability Series, Belmont, CA, 1984.
[34] S. J. Young, J. J. Odell, and P. C. Woodland, “Tree-Based State Tying for High Accuracy Acoustic Modeling,” Proc. ARPA Human Language Technology Workshop, Plainsboro, NJ, pp. 405-410, Morgan Kaufmann, March 1994.
[35] L. R. Bahl, P. V. de Souza, P. S. Gopalakrishnan, D. Nahamoo, and M. A. Picheny, “Decision trees for phonological rules in continuous speech,” in Proc. Int. Conf. Acoustics, Speech, Signal Processing ’91, Toronto, ON, Canada, May 1991, pp. 185-188.
[36] M.-Y. Hwang, X. Huang, and F. Alleva, “Predicting unseen triphones with senones,” in Proc. Int. Conf. Acoustics, Speech, Signal Processing ’93, Minneapolis, MN, 1993, pp. 311-314.
[37] W. Reichl and W. Chou, “Decision tree state tying based on segmental clustering for acoustic modeling,” in Proc. Int. Conf. Acoustics, Speech, Signal Processing ’98, Seattle, WA, May 1998, pp. 801-804.
[38] S. J. Young, “The general use of tying in phoneme-based HMM speech recognizers,” in Proc. Int. Conf. Acoustics, Speech, Signal Processing ’92, San Francisco, CA, 1992, pp. 569-572.
[39] S. J. Young, J. J. Odell, and P. C. Woodland, “Tree based state tying for high accuracy modeling,” in ARPA Workshop Human Language Technology, Princeton, NJ, Mar. 1994, pp. 286-291.
[40] X. Aubert and P. Beyerlein, “A Bottom-Up Approach for Handling Unseen Triphones in Large Vocabulary Continuous Speech Recognition,” Proceedings of the Fourth International Conference on Spoken Language Processing, pp. 14-17, Philadelphia, Pennsylvania, USA, October 1996.
[41] J. J. Odell, “The Use of Context in Large Vocabulary Speech Recognition,” Ph.D. Thesis, Cambridge University, 1995.
[42] L. Deng and J. Wu, “Hierarchical Partition of the Articulatory State Space for Overlapping-feature Based Speech Recognition,” Proceedings of the Fourth International Conference on Spoken Language Processing, pp. 2266-2269, Philadelphia, Pennsylvania, USA, October 1996.
[43] L. Ariane, N. Yves, and K. Roland, “Improving Decision Trees for Acoustic Modeling,” Proceedings of the Fourth International Conference on Spoken Language Processing, pp. 1053-1056, Philadelphia, Pennsylvania, USA, October 1996.
[44] K. Beulen and H. Ney, “Automatic Question Generation for Decision Tree Based State Tying,” Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 805-808, Seattle, Washington, USA, May 1998.
[45] H.-W. Hon, Vocabulary-Independent Speech Recognition: The VOCIND System, Ph.D. Thesis, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, 1992.
|