|
This thesis studies several topics in signal processing for Mandarin speech and singing signals. First, we study how to process the utterances of a female speaker in order to add her voice to a Mandarin TTS system. In this work, we first automatically segment all utterances into syllable segments and then manually extract waveform templates for the 411 base-syllable synthesis units. Prosodic parameters are then extracted, and their first- and second-order statistics are calculated in order to adapt the synthesis of prosodic information to the style of the new speaker. Second, an LP model-based speech processing technique is proposed. The input speech signal is processed to shift the key, change the speed, simulate an echo effect, and apply a spectral transformation. Last, a time-domain processing scheme for singing signals is discussed. The input singing signal is processed to shift the key, change the speed, and simulate an echo effect. Informal listening tests confirmed that all of the proposed methods function well.
|