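Graduate student: Tzu-Chieh Lin (林子傑)
Thesis title: Using the Method of Multiple Common Vector and Dynamic Time Warping to Recognize Isolated Mandarin Word for Speaker-Dependent System
Thesis title (Chinese): 利用Multiple Common Vector及Dynamic Time Warping於特定語者中文單音辨識
Advisor: 李宗寶
Degree: Master's
Institution: National Chung Hsing University (國立中興大學)
Department: Department of Applied Mathematics
Discipline: Mathematics and Statistics
Field: Mathematics
Thesis type: Academic thesis
Year of publication: 2008
Academic year of graduation: 96 (ROC calendar)
Language: Chinese
Number of pages: 40
Keywords (Chinese): 共同向量; 主成份分析; 動態時間軸校正法
Keywords (English): Common Vector; Principal Component Analysis; Dynamic Time Warping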
Usage counts:
  • Cited by: 3
  • Views: 110
  • Downloads: 0
  • Bookmarked: 0
This thesis studies the recognition of 50 isolated Mandarin monosyllabic words. We first build the speech models by using the relationship between principal component analysis and the common vector. In the matching stage we take the time factor into account by adding dynamic time warping, and we examine whether it improves the recognition rate. Besides dynamic time warping, we consider four other experimental factors: the number of frames, the number of clusters, the number of eigenvectors, and the speech feature parameters. The aim is to find the conditions under which the 50 words can be discriminated well. In our experiments, the highest recognition rate on the 50 words is 97.33%.
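The dynamic time warping step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names `dtw_distance` and `recognize`, the Euclidean frame distance, and the unconstrained symmetric step pattern are all assumptions made for illustration. The thesis lists global and local path constraints in its table of contents, and those are not modeled here.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Return the basic DTW cost between two sequences of feature vectors.

    Assumption: each frame is a tuple of floats and frames are compared
    with the Euclidean distance. No path constraints are applied.
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")

    # cost[i][j] = minimal accumulated cost of aligning the first i frames
    # of seq_a with the first j frames of seq_b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            # Allowed predecessors: advance in seq_a only, advance in
            # seq_b only, or advance in both.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the template with the smallest DTW cost.

    `templates` is a hypothetical dict mapping a word label to one
    reference sequence of feature vectors.
    """
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))
```

With one reference sequence per word, `recognize(utterance, templates)` picks the word whose template aligns most cheaply with the test utterance, so an utterance spoken more slowly than its template can still match it.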
Table of Contents
Chinese Abstract
English Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1-1 Research Motivation and Objectives
  1-2 Research Direction
  1-3 Overview of the Recognition Procedure
    1-3-1 Speech Preprocessing
    1-3-2 Feature Parameter Extraction
    1-3-3 Training the Speech Model
    1-3-4 Matching and Recognition
  1-4 Thesis Organization
Chapter 2 Speech Signal Preprocessing and Feature Extraction
  2-1 Introduction
  2-2 Speech Preprocessing
    2-2-1 Digital Sampling
    2-2-2 Normalization
    2-2-3 Speech Endpoint Detection
    2-2-4 Frame Segmentation
    2-2-5 Pre-emphasis
    2-2-6 Windowing
  2-3 Feature Extraction
Chapter 3 Speech Model Construction and Recognition Methods
  3-1 Introduction
  3-2 Frame Compression and Expansion
  3-3 K-means Clustering
  3-4 Multiple Common Vector
    3-4-1 Common Vector
    3-4-2 Relationship between the Common Vector and Principal Component Analysis
  3-5 Recognition Method
    3-5-1 Processing of the Test Utterance
    3-5-2 Matching Method
    3-5-3 Dynamic Time Warping
Chapter 4 Experimental Procedure and Results
  4-1 User Interface
  4-2 Experimental Procedure
    4-2-1 Speech Sources
    4-2-2 Factors That May Affect the Recognition Rate
    4-2-3 Recognition Results
Chapter 5 Conclusions and Suggestions
References
Appendix


List of Figures
Figure 1.1 Flowchart of speech signal preprocessing
Figure 1.2 Flowchart of feature parameter extraction
Figure 1.3 Flowchart of speech model construction
Figure 1.4 Flowchart of recognition
Figure 2.1 Analog speech signal
Figure 2.2 Digital speech signal
Figure 2.3 Waveform of the original utterance "光"
Figure 2.4 Waveform of the utterance "光" after normalization
Figure 2.5 Illustration of frame segmentation
Figure 2.6 Comparison of window functions
Figure 3.1 Frame compression process
Figure 3.2 Frame expansion process
Figure 3.3 Distribution of the original data
Figure 3.4 Data after K-means clustering
Figure 3.5 Flowchart of the multiple common vector method
Figure 3.6 Dynamic programming under ideal conditions
Figure 3.7 Global search path constraint
Figure 3.8 Local path constraint 1
Figure 4.1 Comparison of recognition rates under different factors, with cepstral plus delta-cepstral parameters as features
Figure 4.2 Comparison of recognition rates under different factors, with cepstral parameters as features
Figure 4.3 Comparison of recognition rates under different feature parameters


List of Tables
Table 3.1 Local path constraints
Table 4.1 Recognition rate with number of frames = 10, number of clusters = 1
Table 4.2 Recognition rate with number of frames = 10, number of clusters = 2
Table 4.3 Recognition rate with number of frames = 20, number of clusters = 1
Table 4.4 Recognition rate with number of frames = 20, number of clusters = 2
Table 4.5 Recognition rate with number of frames = 10, number of clusters = 1
Table 4.6 Recognition rate with number of frames = 10, number of clusters = 2
Table 4.7 Recognition rate with number of frames = 20, number of clusters = 1
Table 4.8 Recognition rate with number of frames = 20, number of clusters = 2
Table 4.9 Recognition rate with the expanded vocabulary