[1] M. Afify, O. Siohan and R. Sarikaya, "Gaussian mixture language models for speech recognition," in Proc. International Conference on Acoustics, Speech and Signal Processing, vol. 4, pp. 29-32, 2007.
[2] J. Bellegarda, "Exploiting latent semantic information in statistical language modeling," Proceedings of the IEEE, vol. 88, no. 8, pp. 1279-1296, 2000.
[3] J. Bellegarda, "Statistical language model adaptation: review and perspectives," Speech Communication, vol. 42, pp. 93-108, 2004.
[4] Y. Bengio, R. Ducharme, P. Vincent and C. Jauvin, "A neural probabilistic language model," Journal of Machine Learning Research, vol. 3, pp. 1137-1155, 2003.
[5] M. Berry, S. Dumais and G. O'Brien, "Using linear algebra for intelligent information retrieval," SIAM Review, vol. 37, pp. 573-595, 1995.
[6] D. Blei, A. Ng and M. Jordan, "Latent Dirichlet allocation," Journal of Machine Learning Research, vol. 3, pp. 993-1022, 2003.
[7] P. Brown, J. Cocke, S. Della Pietra, V. Della Pietra, F. Jelinek, J. Lafferty, R. Mercer and P. Roossin, "A statistical approach to machine translation," Computational Linguistics, vol. 16, pp. 79-85, 1990.
[8] P. Brown, V. Della Pietra, P. De Souza, J. Lai and R. Mercer, "Class-based n-gram models of natural language," Computational Linguistics, vol. 18, no. 4, pp. 467-479, 1992.
[9] C. Chelba and F. Jelinek, "Structured language modeling," Computer Speech and Language, vol. 14, no. 4, pp. 283-332, 2000.
[10] S. F. Chen and J. Goodman, "An empirical study of smoothing techniques for language modeling," Computer Speech and Language, vol. 13, pp. 359-394, 1999.
[11] J.-T. Chien, "Association pattern language modeling," IEEE Trans. Audio, Speech and Language Processing, vol. 14, no. 5, pp. 1719-1728, 2006.
[12] J.-T. Chien, M.-S. Wu and H.-J. Peng, "On latent semantic language modeling and smoothing," in Proc. International Conference on Acoustics, Speech and Signal Processing, vol. 2, pp. 1373-1376, 2004.
[13] C.-H. Chueh, J.-T. Chien and H. Wang, "A maximum entropy approach for integrating semantic information in statistical language models," in Proc. International Symposium on Chinese Spoken Language Processing, pp. 309-312, 2004.
[14] S. Deerwester, S. Dumais, G. Furnas, T. Landauer and R. Harshman, "Indexing by latent semantic analysis," Journal of the American Society of Information Science, vol. 41, pp. 391-407, 1990.
[15] S. Della Pietra, V. Della Pietra, R. Mercer and S. Roukos, "Adaptive language modeling using minimum discriminant estimation," in Proc. International Conference on Acoustics, Speech and Signal Processing, vol. 1, pp. 633-636, 1992.
[16] A. Dempster, N. Laird and D. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," Journal of the Royal Statistical Society, vol. 39, pp. 1-38, 1977.
[17] A. Emami, P. Xu and F. Jelinek, "Using a connectionist model in a syntactical based language model," in Proc. International Conference on Acoustics, Speech and Signal Processing, pp. 372-375, 2003.
[18] M. Federico, "Bayesian estimation methods for n-gram language model adaptation," in Proc. International Conference on Acoustics, Speech and Signal Processing, vol. 1, pp. 240-243, 1996.
[19] M. Federico, "Efficient language model adaptation through MDI estimation," in Proc. Eurospeech, pp. 1583-1586, 1999.
[20] R. Florian and D. Yarowsky, "Dynamic nonlocal language model adaptation via hierarchical topic-based adaptation," in Proc. ACL, pp. 167-174, 1999.
[21] D. Gildea and T. Hofmann, "Topic-based language models using EM," in Proc. Eurospeech, pp. 2167-2170, 1999.
[22] I. J. Good, "The population frequencies of species and the estimation of population parameters," Biometrika, vol. 40, pp. 237-264, 1953.
[23] Hidden Markov Model Toolkit (HTK), http://htk.eng.cam.ac.uk/.
[24] T. Hofmann, "Probabilistic latent semantic indexing," in Proc. ACM SIGIR, pp. 50-57, 1999.
[25] X. Huang, A. Acero and H. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall PTR, 2001.
[26] F. Jelinek and R. Mercer, "Interpolated estimation of Markov source parameters from sparse data," in Proc. Workshop on Pattern Recognition in Practice, pp. 381-402, 1980.
[27] S. M. Katz, "Estimation of probabilities from sparse data for the language model component of a speech recognizer," IEEE Trans. Acoustics, Speech and Signal Processing, vol. 35, pp. 400-401, 1987.
[28] S. Khudanpur and J. Wu, "Maximum entropy techniques for exploiting syntactic, semantic and collocational dependencies in language modeling," Computer Speech and Language, vol. 14, pp. 355-372, 2000.
[29] R. Kneser and H. Ney, "Improved backing-off for m-gram language modeling," in Proc. International Conference on Acoustics, Speech and Signal Processing, pp. 181-184, 1995.
[30] L. Lamel, R. Kassel and S. Seneff, "Speech database development: design and analysis of the acoustic-phonetic corpus," in Proc. DARPA Speech Recognition Workshop, pp. 100-109, 1986.
[31] C. Leggetter and P. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Computer Speech and Language, vol. 9, pp. 171-185, 1995.
[32] G. Lidstone, "Note on the general case of the Bayes-Laplace formula for inductive or a posteriori probabilities," Transactions of the Faculty of Actuaries, vol. 8, pp. 182-192, 1920.
[33] H. Ney, U. Essen and R. Kneser, "On structuring probabilistic dependencies in stochastic language modeling," Computer Speech and Language, vol. 8, pp. 1-38, 1994.
[34] D. Paul and J. Baker, "The design for the Wall Street Journal-based CSR corpus," in Proc. International Conference on Spoken Language Processing, pp. 899-902, 1992.
[35] J. Ponte and W. Croft, "A language modeling approach to information retrieval," in Proc. ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 275-281, 1998.
[36] R. Rosenfeld, "A maximum entropy approach to adaptive statistical language modeling," Computer Speech and Language, vol. 10, pp. 187-228, 1996.
[37] H. Schwenk, "Continuous space language models," Computer Speech and Language, vol. 21, pp. 492-518, 2007.
[38] H. Schwenk and J. Gauvain, "Connectionist language modeling for large vocabulary speech recognition," in Proc. International Conference on Acoustics, Speech and Signal Processing, pp. 765-768, 2002.
[39] Y. Tam and T. Schultz, "Correlated latent semantic model for unsupervised LM adaptation," in Proc. International Conference on Acoustics, Speech and Signal Processing, vol. 4, pp. 41-44, 2007.
[40] K. Vertanen, "Baseline WSJ acoustic models for HTK and Sphinx: training recipes and recognition experiments," Technical Report, Cavendish Laboratory, 2006.
[41] H. Wang and T. Kawahara, "PLSA-based topic detection in meetings for adaptation of lexicon and language model," in Proc. Interspeech, pp. 602-608, 2007.
[42] P. Woodland, J. Odell, V. Valtchev and S. Young, "Large vocabulary speech recognition using HTK," in Proc. International Conference on Acoustics, Speech and Signal Processing, vol. 2, pp. 125-128, 1994.
[43] J. Wu and S. Khudanpur, "Building a topic-dependent maximum entropy model for very large corpora," in Proc. International Conference on Acoustics, Speech and Signal Processing, vol. 1, pp. 777-780, 2002.
[44] G. Zhou and K. Liu, "Interpolation of n-gram and mutual-information based trigger pair language models for Mandarin speech recognition," Computer Speech and Language, vol. 13, pp. 125-141, 1999.
[45] I. Zitouni, "Backoff hierarchical class n-gram language models: effectiveness to model unseen events in speech recognition," Computer Speech and Language, vol. 21, no. 1, pp. 88-104, 2007.