[1]蔡松?, Improvement of a GMM-based voice conversion method, Master's thesis, Dept. of Computer Science and Information Engineering, National Taiwan University of Science and Technology, 2009.
[2]王讚緯, A voice conversion system using histogram equalization and target frame selection, Master's thesis, Dept. of Computer Science and Information Engineering, National Taiwan University of Science and Technology, 2014.
[3]張家維, A voice conversion method using principal component vector projection and least-mean-square mapping, Master's thesis, Dept. of Computer Science and Information Engineering, National Taiwan University of Science and Technology, 2012.
[4]H. Valbret, E. Moulines, and J.P. Tubach, “Voice transformation using PSOLA technique,” in 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-92), vol. 1, San Francisco, CA, USA, 23-26 Mar. 1992, pp. 145-148.
[5]D. Erro, E. Navas, and I. Hernaez, “Parametric voice conversion based on bilinear frequency warping plus amplitude scaling,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 3, pp. 556-566, 2013.
[6]X. Tian, Z. Wu, S.W. Lee, and E.S. Chng, “Correlation-based frequency warping for voice conversion,” in 2014 9th International Symposium on Chinese Spoken Language Processing (ISCSLP), Singapore, 12-14 Sept. 2014, pp. 211-215.
[7]M. Narendranath, H.A. Murthy, S. Rajendran, and B. Yegnanarayana, “Transformation of formants for voice conversion using artificial neural networks,” Speech Communication, vol. 16, no. 2, pp. 207-216, 1995.
[8]F.L. Xie, Y. Qian, Y. Fan, F.K. Soong, and H. Li, “Sequence error (SE) minimization training of neural network for voice conversion,” in Interspeech 2014, pp. 2283-2287.
[9]Y. Stylianou, O. Cappe, and E. Moulines, “Continuous probabilistic transform for voice conversion,” IEEE Transactions on Speech and Audio Processing, vol. 6, no. 2, pp. 131-142, 1998.
[10]T. Toda, A.W. Black, and K. Tokuda, “Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 8, pp. 2222-2235, 2007.
[11]T. Dutoit, A. Holzapfel, M. Jottrand, A. Moinet, J. Perez, and Y. Stylianou, “Toward a voice conversion system based on frame selection,” in 2007 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP, vol. 4, Honolulu, HI, 15-20 Apr. 2007, pp. 513-516.
[12]H.Y. Gu and S.F. Tsai, “A voice conversion method combining segmental GMM mapping with target frame selection,” Journal of Information Science and Engineering, vol. 31, no. 2, pp. 609-626, 2015.
[13]蔡仲明, Language identification of Mandarin, Min-Nan, and Hakka speech based on GMM and PPM models, Master's thesis, Dept. of Computer Science and Information Engineering, National Taiwan University of Science and Technology, 2007.
[14]S. Young, G. Evermann, T. Hain, D. Kershaw, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The HTK Book (for HTK Version 3.2.1), Cambridge University Engineering Department, 2002.
[15]K. Sjolander and J. Beskow, WaveSurfer, Centre for Speech Technology, KTH. Available: http://www.speech.kth.se/wavesurfer/.
[16]吳昌益, A study of Mandarin speech synthesis using a spectrum progression model, Master's thesis, Dept. of Computer Science and Information Engineering, National Taiwan University of Science and Technology, 2007.
[17]K. Sayood, Introduction to Data Compression, 2nd ed., San Francisco, CA: Morgan Kaufmann Publishers, 2000.
[18]W.J. Teahan, “Probability estimation for PPM,” in New Zealand Computer Science Research Student Conference, NZCSRSC'95, Apr. 1995.
[19]L. Rabiner and B.H. Juang, Fundamentals of Speech Recognition, NJ, USA: Prentice Hall, 1993.
[20]T. Caliński and J. Harabasz, “A dendrite method for cluster analysis,” Communications in Statistics, vol. 3, no. 1, pp. 1-27, 1974.
[21]R.A. Redner and H.F. Walker, “Mixture densities, maximum likelihood and the EM algorithm,” SIAM Review, vol. 26, no. 2, pp. 195-239, 1984.
[22]A. Kain, High resolution voice transformation, PhD dissertation, Oregon Health & Science University, 2001.
[23]T. Toda, A.W. Black, and K. Tokuda, “Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 8, pp. 2222-2235, 2007.
[24]洪尉翔, A method for improving the quality of synthesized speech using MGE-trained HMM models and global variance matching, Master's thesis, Dept. of Computer Science and Information Engineering, National Taiwan University of Science and Technology, 2015.
[25]E. Godoy, O. Rosec, and T. Chonavel, “Voice conversion using dynamic frequency warping with amplitude scaling, for parallel or nonparallel corpora,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, pp. 1313-1323, 2012.
[26]Y. Stylianou, Harmonic plus noise models for speech, combined with statistical methods, for speech and speaker modification, PhD thesis, Ecole Nationale Superieure des Telecommunications, 1996.