|
Because the recognition rate for Mandarin consonants is low, this thesis proposes a new recognition network, the Dynamic Programming Bayesian Neural Network, to improve it. The network combines a dynamic model with a Bayesian network in order to handle the dynamic nature of the speech signal and to tolerate variation in spectral patterns. Because feature selection is especially important for consonants, we represent each consonant with cepstrum coefficients, delta cepstrum coefficients, and time-domain features such as energy and zero-crossing rate. A hierarchical recognition scheme is used to reduce the number of matching operations and thus speed up recognition. As a result, the number of candidates is reduced from 408 to 64, i.e., 4 lexicals, 38 vowels, and 22 consonants. A segmentation algorithm divides connected speech into syllables, and each syllable is then segmented into its vowel and consonant parts. For experimental evaluation, we build a database from two male speakers, each of whom utters the 408 Mandarin syllables seven times. Four of the seven utterances are used as training patterns and the other three as testing patterns. After a series of experiments, a top-one recognition rate of 87.59% and a top-five recognition rate of 98.51% are achieved on average.
|
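As an illustration of the time-domain features named above, the following is a minimal sketch of frame-wise short-time energy and zero-crossing rate. It is not the implementation used in this thesis: the function names, the frame length of 256 samples, and the hop of 128 samples are assumptions chosen only for the example, and the energy is taken as the mean squared amplitude per frame.

```python
def frame_signal(samples, frame_len, hop):
    """Split a sample sequence into overlapping frames.

    A trailing partial frame is dropped (an assumption of this sketch).
    """
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ.

    Zero is treated as non-negative (an assumption of this sketch).
    """
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def time_domain_features(samples, frame_len=256, hop=128):
    """Return one (energy, zero-crossing rate) pair per frame."""
    return [(short_time_energy(f), zero_crossing_rate(f))
            for f in frame_signal(samples, frame_len, hop)]
```

In general, unvoiced consonants such as fricatives tend to show a high zero-crossing rate with low energy, whereas vowels tend to show the opposite, which is one reason these two features are commonly paired with cepstral features.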