|
A text-independent speaker identification system for telephone speech, based on long-term spectral feature averaging (LTA) and the Karhunen-Loeve transform (KLT), is proposed. The system uses basis functions derived from the KLT to reduce the data volume effectively while preserving most of the identifying information for each speaker. A database of 61 male and 72 female Mandarin speakers, recorded from a telephone answering system, was collected for system evaluation. Using only the first 28 features from the 128 basis functions, the correct classification rate reaches 88% when our special frame selection criterion is used, but only 73% when it is not. At the same time, the classification time is reduced to about 3% of the classification time required with 128 features.
|
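The LTA and KLT steps summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the nearest-neighbour Euclidean classifier is an assumption (the abstract does not name the classifier), the special frame selection criterion is omitted because it is not described here, and all function names are hypothetical. Only the dimensions, 128 spectral features reduced to the first 28 KLT features, are taken from the abstract.

```python
import numpy as np

N_BINS = 128  # spectral features per frame (128 basis functions in the abstract)
N_KEEP = 28   # leading KLT features retained, as in the abstract


def long_term_average(frames):
    """Average per-frame spectral vectors (n_frames, n_bins) into one LTA vector."""
    return np.asarray(frames, dtype=float).mean(axis=0)


def klt_basis(lta_vectors, n_keep):
    """Return the training mean and the n_keep leading KLT basis functions (as rows)."""
    x = np.asarray(lta_vectors, dtype=float)
    mean = x.mean(axis=0)
    cov = np.cov(x - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_keep]   # indices of the largest eigenvalues
    return mean, eigvecs[:, order].T


def project(lta, mean, basis):
    """Map an LTA vector to its reduced KLT feature vector."""
    return basis @ (np.asarray(lta, dtype=float) - mean)


def identify(feature, enrolled):
    """Assumed classifier: nearest enrolled speaker by Euclidean distance."""
    names = list(enrolled)
    dists = [np.linalg.norm(feature - enrolled[name]) for name in names]
    return names[int(np.argmin(dists))]


# Synthetic stand-in for real telephone speech: each hypothetical speaker has
# a fixed spectral template, and every frame is that template plus noise.
rng = np.random.default_rng(0)
templates = {f"spk{i:02d}": rng.normal(size=N_BINS) for i in range(40)}


def fake_frames(template, n_frames=200):
    return template + rng.normal(scale=0.5, size=(n_frames, N_BINS))


# Enrollment: one LTA vector per speaker, then a shared KLT basis.
train_lta = {name: long_term_average(fake_frames(t)) for name, t in templates.items()}
mean, basis = klt_basis(list(train_lta.values()), N_KEEP)
enrolled = {name: project(v, mean, basis) for name, v in train_lta.items()}

# Identification on fresh frames from each speaker.
hits = sum(
    identify(project(long_term_average(fake_frames(t)), mean, basis), enrolled) == name
    for name, t in templates.items()
)
print(f"correct: {hits}/{len(templates)}")
```

In this sketch each speaker is represented by 28 numbers instead of 128, so every distance computation during identification operates on the shorter vector. The accuracy printed for the synthetic data says nothing about the rates reported for the telephone database.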