【1】 V. Kraft, “Does the Resulting Speech Quality Improvement Make a Sophisticated Concatenation of Time-Domain Synthesis Units Worthwhile?” Proceedings of the Second ESCA/IEEE Workshop on Speech Synthesis, New Paltz, NY, pp. 65-68.
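【2】 Prof. 王小川, “語音信號處理” [Speech Signal Processing].
【3】 陳鳳儀, 蔡碧芳, 陳克健, 黃居仁, “中文句結構樹資料庫(Sinica Treebank)的構建” [Construction of the Chinese sentence-structure tree database (Sinica Treebank)], 中央研究院資訊所、中央研究院研究所 [Institute of Information Science, Academia Sinica; Academia Sinica].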
【4】 Klatt, D. H. (1987), “Review of text-to-speech conversion for English,” J. Acoust. Soc. Amer., 82(3), pp. 737-793.
【5】 Hamon, C., E. Moulines, and F. Charpentier (1989), “A diphone synthesis based on time-domain prosodic modifications of speech” in Proc. ICASSP, pp.238-241.
【6】 Chen, S. H., S. H. Hwang, and Y. R. Wang (1998), “An RNN-based prosodic information synthesizer for Mandarin text-to-speech,” IEEE Trans. on Speech and Audio Processing, Vol. 6, No. 3, pp. 226-239.
【7】 Chen, J. H. (1998) A Study on Synthesis Unit Selection and Prosodic Information Generation in a Chinese Text-to-Speech. Ph.D. Dissertation. National Cheng Kung University, Tainan, Taiwan, R.O.C.
【8】 Shih, C. L. and R. Sproat (1996), “Issues in text-to-speech conversion for Mandarin,” in Computational Linguistics and Chinese Language Processing, Vol. 1, Aug. 1996, pp. 37-86.
【9】 Iwahashi, N. and Y. Sagisaka (1995), “Speech segment network approach for optimization of synthesis unit set,” Computer Speech and Language, pp. 335-352.
【10】 Chiou, H. B., H. C. Wang, and Y. C. Chang (1991), “Synthesis of Mandarin speech based on hybrid concatenation,” Computer Processing of Chinese and Oriental Languages, Vol. 5, No. 3/4, pp. 217-231.
【11】 Chou, F. C. and C. Y. Tseng (1998), “Corpus-based Mandarin speech synthesis with contextual syllabic units based on phonetic properties,” in Proc. ICASSP, pp. 893-896.
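【12】 林立峰, “中文TTS系統與音合成之改進” [Improvements to a Chinese TTS system and sound synthesis], Master’s thesis, National Chiao Tung University, June 2004.
【13】 The HTK Book (for HTK Version 3.2).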
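【14】 魯弘茂, “中文語音合成技術之實作與分析” [Implementation and analysis of Chinese speech synthesis techniques], Master’s thesis, National Chiao Tung University, June 2002.
【15】 江振宇, “中文斷詞器之改進” [Improvement of a Chinese word segmenter], Master’s thesis, National Chiao Tung University, June 2004.
【16】 W. J. Wang, W. N. Campbell, N. Iwahashi, and Y. Sagisaka, “Tree-based unit selection in speech synthesis,” in Proc. of the Int’l Conf. on Acoustics, Speech, and Signal Processing, Vol. II, pp. 191-194, 1993.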
【17】 A. J. Hunt and A. W. Black, “Unit selection in a concatenative speech synthesis system using a large speech database,” in Proc. ICASSP, Atlanta, pp. 373-376, 1996.
【18】 H. Peng, Y. Zhao, and M. Chu, “Perceptually optimizing the cost function for unit selection in a TTS system with one single run of MOS evaluation,” in Proc. ICSLP, Denver, USA, 2002.
【19】 T. Toda, H. Kawai, M. Tsuzaki, and K. Shikano, “Unit Selection Algorithm for Japanese Speech Synthesis Based on Both Phoneme Unit and Diphone Unit,” in Proc. of IEEE-ICASSP 2002, pp.465-468, May 2002.
【20】 Chou, F. C., C. Y. Tseng, and L. S. Lee, “A Set of Corpus-Based Text-to-Speech Synthesis Technologies for Mandarin Chinese,” IEEE Trans. on Speech and Audio Processing, Vol. 10, pp. 481-494, 2002.
【21】 Min Chu and Hu Peng, “An Objective Measure for Estimating MOS of Synthesized Speech” in EuroSpeech 2001.