Graduate student (English name): Teng-Fu Shih
Thesis title (English): The Mandarin monosyllable recognition by using the method of K-nearest neighbor with different weights
Keywords (English): K-nearest neighbor with different weights; MFCC
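This thesis investigates speaker-independent recognition rates for 1,391 Mandarin monosyllables, both with and without tone distinction, as well as for their vowels and consonants. The experiment has three main parts: first, the recorded speech data are preprocessed; second, features are extracted using Mel-frequency cepstral coefficients (MFCC); third, speech models are built and a weighted K-nearest neighbor method is applied, and the combination with the highest recognition rate is taken as the best result. The speech database was recorded by twenty different speakers and contains 278,200 speech samples in total. The feature dimension is fixed at 39, the number of sampling points at 256, and the numbers of consonant and vowel frames at 20 and 25, respectively. The results show that, within same-vowel groups, the best vowel recognition rate is 95.10%, obtained with vowel and consonant weights of 7:3. Within groups sharing both the same vowel and the same consonant, the vowel recognition rate is 94.40% with weights of 7:3. When the consonant is recognized after the vowel, the monosyllable recognition rate given a correctly recognized vowel is 87.4%. Within groups sharing both the same vowel and the same consonant, the tone-independent monosyllable recognition rate is 80.73% with vowel and consonant weights of 5:5.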
This thesis investigates speaker-independent recognition rates for 1,391 Mandarin monosyllables, evaluated without tone distinction, with tone distinction, and separately for vowels and consonants.
The recognition process has three main parts: first, the recorded speech data are preprocessed; second, features are extracted using the Mel-frequency cepstrum; third, speech models are built and the K-nearest neighbor method with different weights is applied, and the combination with the highest recognition rate is taken as the best result. The speech database consists of 278,200 recordings from twenty different speakers. The feature dimension is fixed at 39, the number of sampling points at 256, the number of consonant frames at 20, and the number of vowel frames at 25. The experimental results show that, when the best consonant error and vowel error are found within each vowel group and multiplied by their weights, the vowel recognition rate is 95.10% with vowel and consonant weights of 7:3. When the best consonant and vowel errors are instead found for each Mandarin monosyllable and multiplied by their weights, the vowel recognition rate is 94.40% with weights of 7:3, and the best tone-independent monosyllable recognition rate is 80.73% with weights of 5:5.
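The abstract does not give the exact scoring formula, so the following is only a minimal sketch of how a K-nearest neighbor classifier with separate vowel and consonant weights might combine the two errors. It assumes (these are illustrative choices, not taken from the thesis) that each sample is a pair of vowel and consonant feature vectors, that the "error" is a Euclidean distance, that a 7:3 ratio maps to weights 0.7 and 0.3, and that the final label is chosen by majority vote among the k closest samples. The function name `weighted_knn` and its parameters are hypothetical.

```python
import math
from collections import Counter


def weighted_knn(query, training, w_vowel=0.7, w_consonant=0.3, k=3):
    """Classify a syllable by a weighted sum of vowel and consonant distances.

    query:    (vowel_vector, consonant_vector)
    training: list of (label, vowel_vector, consonant_vector)

    Assumption: the combined error is
        w_vowel * vowel_distance + w_consonant * consonant_distance
    with Euclidean distances; the thesis may define it differently.
    """
    query_vowel, query_consonant = query
    scored = []
    for label, vowel_vec, consonant_vec in training:
        vowel_err = math.dist(query_vowel, vowel_vec)
        consonant_err = math.dist(query_consonant, consonant_vec)
        combined = w_vowel * vowel_err + w_consonant * consonant_err
        scored.append((combined, label))

    # Keep the k samples with the smallest combined error.
    scored.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in scored[:k])

    # Majority vote; ties fall to the label seen first, i.e. the nearest one.
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data: two hypothetical classes with 1-dimensional features.
    toy_training = [
        ("A", [0.0], [10.0]),  # vowel matches the query, consonant is far
        ("B", [3.0], [0.0]),   # consonant matches the query, vowel is far
    ]
    toy_query = ([0.0], [0.0])
    for weights in [(0.7, 0.3), (0.9, 0.1), (0.5, 0.5)]:
        label = weighted_knn(toy_query, toy_training, *weights, k=1)
        print(weights, label)
```

In this toy setup the chosen label changes with the weight ratio, which illustrates why the abstract reports recognition rates per weight setting (7:3 versus 5:5). In a real system the vectors would be the MFCC features of the vowel and consonant segments rather than single numbers.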
