|
[Andrej, 1986] Andrej, L. and Frank, F., “Synthesis of Natural Sounding Pitch Contours in Isolated Utterances Using Hidden Markov Models”, IEEE Trans. on Acoustics, Speech and Signal Processing, vol. ASSP-34, no. 5, pp. 1074-1080, October 1986
[Benijamin, 1994] Benijamin, A., Chilin, S. and Richard, S., “A Corpus-Based Mandarin Text-to-Speech Synthesizer”, in Proc. of ICSLP, S29, 8.1-8.4, pp. 1771-1774, 1994
[Breiman, 1984] Breiman, L., Friedman, J. H., Olshen, R. A. and Stone, C. J., “Classification and Regression Trees”, Chapman & Hall, New York, 1984
[Chan, 1994] Chan, M. V., Feng, X., Heinen, J. A. and Niederjohn, R. J., “Classification of Speech Accents with Neural Networks”, Neural Networks, IEEE World Congress on Computational Intelligence, IEEE International Conference on, vol. 7, pp. 4483-4486, 1994
[Chen, 1990] Chen, S. H. and Wang, Y. R., “Vector Quantization of Pitch Information in Mandarin Speech”, IEEE Trans. on Communications, vol. 38, no. 9, pp. 1317-1320, 1990
[Chen, 1995] Chen, S. H. and Wang, Y. R., “Tone Recognition of Continuous Mandarin Speech Based on Neural Networks”, IEEE Trans. on Speech and Audio Processing, vol. 3, no. 2, pp. 146-150, March 1995
[Chen, 1998] Chen, S. H., Hwang, S. H. and Wang, Y. R., “An RNN-based Prosodic Information Synthesizer for Mandarin Text-to-Speech”, IEEE Trans. on Speech and Audio Processing, vol. 6, no. 3, pp. 226-269, 1998
[Chen, 2005] Chen, S. H., Lai, W. H. and Wang, Y. R., “A Statistics-based Pitch Contour Model for Mandarin Speech”, The Journal of the Acoustical Society of America, 117(2), pp. 908-925, 2005
[Chu, 2001] Chu, M. and Qian, Y., “Locating Boundaries for Prosodic Constituents in Unrestricted Mandarin Texts”, Computational Linguistics and Chinese Language Processing, 6(1), pp. 61-82, 2001
[Dong, 2002] Dong, M. and Lua, K. T., “Pitch Contour Model for Chinese Text-to-Speech Using CART and Statistical Model”, in Proc. of ICSLP, pp. 2405-2408, 2002
[Fujisaki, 1984] Fujisaki, H. and Hirose, K., “Analysis of Voice Fundamental Frequency Contours for Declarative Sentences of Japanese”, Journal of the Acoustical Society of Japan, 1984
[Fukada, 1992] Fukada, T., Tokuda, K., Kobayashi, T. and Imai, S., “An Adaptive Algorithm for Mel-Cepstral Analysis of Speech”, in Proc. of ICASSP, vol. 1, pp. 137-140, 1992
[Greg, 2000] Greg, P. K. and Shih, C., “Stem-ML: Language-Independent Prosody Description”, in Proc. of ICSLP, pp. 239-242, 2000
[Huang, 2004] Huang, C., Shi, Y., Zhou, J. L., Chu, M., Wang, T. and Chang, E., “Segmental Tonal Modeling for Phone Set Design in Mandarin LVCSR”, in Proc. of ICASSP, pp. 901-904, 2004
[Kawahara, 1997] Kawahara, H., “Speech Representation and Transformation Using Adaptive Interpolation of Weighted Spectrum: Vocoder Revisited”, in Proc. of ICASSP, vol. 2, pp. 1303-1306, Munich, Germany, April 1997
[Kim, 1997] Kim, S. H. and Kim, J. Y., “Efficient Model of Establishing Words Tone Dictionary for Korean TTS System”, in Proc. of Eurospeech, pp. 243-246, 1997
[Ladd, 1996] Ladd, D. R., “Intonational Phonology”, Cambridge Studies in Linguistics 79, Cambridge University Press, Cambridge, 334 pages, 1996
[Lee, 1989] Lee, L. S., Tseng, C. Y. and Ouh-young, M., “The Synthesis Rules in a Chinese Text-to-Speech System”, IEEE Trans. on Acoustics, Speech and Signal Processing, vol. 37, no. 9, pp. 1309-1319, September 1989
[Lee, 1993] Lee, L. S., Tseng, C. Y. and Hsieh, C. J., “Improved Tone Concatenation Rules in a Formant-Based Chinese Text-to-Speech System”, IEEE Trans. on Speech and Audio Processing, vol. 1, no. 3, pp. 287-294, July 1993
[Lin, 1992] Lin, T. and Wang, L. J., “Phonetic Tutorials”, Beijing University Press, pp. 103-121, 1992
[Lin, 1999] Lin, X., Chen, Y., Lim, S. and Lim, C., “Recognition of Emotional State From Spoken Sentences”, IEEE 3rd Workshop on Multimedia Signal Processing, pp. 469-473, 1999
[Masuko, 1996] Masuko, T., Tokuda, K., Kobayashi, T. and Imai, S., “Speech Synthesis Using HMMs with Dynamic Features”, in Proc. of ICASSP, pp. 389-392, 1996
[Monaghan, 1991] Monaghan, A. I. C. and Ladd, D. R., “Manipulating Synthetic Intonation for Speaker Characterisation”, in Proc. of ICASSP, S7.11, pp. 453-456, 1991
[Pan, 2000] Pan, N. H., Jen, W. T., Yu, S. S., Yu, S. S., Huang, S. Y. and Wu, M. J., “Prosody Model in a Mandarin Text-to-Speech System Based on a Hierarchical Approach”, IEEE International Conference on Multimedia and Expo, vol. 1, pp. 448-451, 2000
[Rissanen, 1984] Rissanen, J., “Universal Coding, Information, Prediction, and Estimation”, IEEE Trans. on Information Theory, vol. 30, no. 4, pp. 629-636, 1984
[Shinoda, 1997] Shinoda, K. and Watanabe, T., “Acoustic Modeling Based on the MDL Criterion for Speech Recognition”, in Proc. of Eurospeech, vol. 1, pp. 99-102, 1997
[Sun, 2002] Sun, X., “The Determination, Analysis and Synthesis of Fundamental Frequency”, Ph.D. Thesis, Northwestern University, 2002
[Tao, 2004] Tao, J., “F0 Prediction Model of Speech Synthesis Based on Template and Statistical Method”, Lecture Notes in Artificial Intelligence, Springer, 2004
[Tokuda, 1995] Tokuda, K., Kobayashi, T. and Imai, S., “Speech Parameter Generation from HMM Using Dynamic Features”, in Proc. of ICASSP, pp. 660-663, 1995
[Tokuda, 2000] Tokuda, K., Yoshimura, T., Masuko, T., Kobayashi, T. and Kitamura, T., “Speech Parameter Generation Algorithms for HMM-based Speech Synthesis”, in Proc. of ICASSP, pp. 1315-1318, 2000
[Tseng, 2004] Tseng, C. Y. and Lee, Y. L., “Speech Rate and Prosody Units: Evidence of Interaction from Mandarin Chinese”, in Proc. of the International Conference on Speech Prosody, pp. 251-254, 2004
[Tseng, 2005] Tseng, C. Y., Pin, S. H., Lee, Y. L., Wang, H. M. and Chen, Y. C., “Fluent Speech Prosody: Framework and Modeling”, Speech Communication, Special Issue on Quantitative Prosody Modeling for Natural Speech Description and Generation, vol. 46, no. 3-4, pp. 284-309, 2005
[Wightman, 1994] Wightman, C. W. and Ostendorf, M., “Automatic Labeling of Prosodic Patterns”, IEEE Trans. on Speech and Audio Processing, vol. 2, no. 4, pp. 469-481, October 1994
[Yi, 2001] Yi, X. and Wang, Q. E., “Pitch Targets and Their Realization: Evidence from Mandarin Chinese”, Speech Communication, pp. 319-337, 2001
[Young, 2006] Young, S., Evermann, G., Gales, M., Hain, T., Kershaw, D., Liu, X. Y., Moore, G., Odell, J., Ollason, D., Povey, D., Valtchev, V. and Woodland, P., The Hidden Markov Model Toolkit (HTK) Version 3.4, 2006. http://htk.eng.cam.ac.uk/
[Zen, 2007] Zen, H., Nose, T., Yamagishi, J., Sako, S. and Tokuda, K., The HMM-based Speech Synthesis System (HTS) Version 2.0, 2007. http://hts.sp.nitech.ac.jp/
[謝, 民63年] Hsieh, Yun-fei (謝雲飛), “An Outline of Phonetics” (語音學大綱), first edition, 1974 (ROC year 63)
|