National Digital Library of Theses and Dissertations in Taiwan

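Graduate student: 劉民洋
Graduate student (English name): Ming-Yang Liu
Thesis title: 語音環境控制輔具系統之單字切割研究
Thesis title (English): Word Boundary Detection Analysis of Speech Environment Control Auxiliary System
Advisor: 陶金旭
Advisor (English name): Jin-Shiuh Taur
Degree: Master's
Institution: National Chung Hsing University
Department: Department of Electrical Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2004
Graduation academic year: 92 (ROC calendar)
Language: Chinese
Number of pages: 87
Keywords (translated from Chinese): speech recognition; word boundary detection; Mel-frequency cepstral coefficients; hidden Markov model
Keywords (English): speech recognition; word boundary; MFCC; hidden Markov model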
Speech recognition has been studied extensively in recent decades. This work is part of an assistive-device collaboration among our university, the National Science Council, and Chung Shan Medical University, which has produced a menu-driven environmental control system operated by spoken Chinese commands. Through this human-machine interface, patients with limited mobility can control household appliances by voice, which makes the appliances easier to operate and increases the patients' independence at home.
This thesis studies the segmentation of continuous Chinese speech into words. It compares an Elman neural network against the widely used energy and zero-crossing-rate method to determine whether word boundary detection improves in noisy environments.
The segmented words are then passed through MFCC analysis to extract speech feature parameters, and a continuous hidden Markov model is trained for each Chinese word. Finally, the system was tested in real environments. Recognition accuracy is about 95% without background noise; in noisy conditions it varies with the environment and is about 80%.
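The energy and zero-crossing-rate baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, frame sizes, threshold rules, and the assumption that the first few frames contain only background are all hypothetical choices made for illustration. Frames whose short-time energy rises well above the background level form the core of a segment, and each boundary is then pushed outward by a few frames while the zero-crossing rate stays unusually high, which is a common way to recover weak unvoiced sounds.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames; a trailing partial frame is dropped."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = hop * np.arange(n_frames)[:, None] + np.arange(frame_len)[None, :]
    return x[idx]


def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.where(frames >= 0, 1, -1)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)


def detect_segments(x, frame_len=256, hop=128, noise_frames=10,
                    energy_ratio=4.0, zcr_k=3.0, max_extend=5):
    """Return (start_sample, end_sample) pairs for detected segments.

    Illustrative assumptions: the first `noise_frames` frames hold only
    background, and all default parameter values are arbitrary.
    """
    x = np.asarray(x, dtype=float)
    frames = frame_signal(x, frame_len, hop)
    energy = np.sum(frames ** 2, axis=1)   # short-time energy per frame
    zcr = zero_crossing_rate(frames)

    # Thresholds estimated from the leading background frames.
    e_thr = energy_ratio * energy[:noise_frames].mean()
    z_thr = zcr[:noise_frames].mean() + zcr_k * zcr[:noise_frames].std()

    segments, last_end, i, n = [], -1, 0, len(energy)
    while i < n:
        if energy[i] <= e_thr:
            i += 1
            continue
        # Core of the segment: a run of consecutive high-energy frames.
        start = i
        while i < n and energy[i] > e_thr:
            i += 1
        end = i - 1
        # Extend the start backward while the zero-crossing rate is high.
        steps = 0
        while steps < max_extend and start - 1 > last_end and zcr[start - 1] > z_thr:
            start -= 1
            steps += 1
        # Extend the end forward in the same way.
        steps = 0
        while steps < max_extend and end + 1 < n and zcr[end + 1] > z_thr:
            end += 1
            steps += 1
        segments.append((start * hop, end * hop + frame_len))
        last_end = end
        i = end + 1
    return segments
```

Calling `detect_segments(samples)` on a mono recording returns sample-index pairs that can be used to cut out each word before feature extraction. Because both thresholds are derived from the background frames, a louder or less stationary background raises them and can hide weak speech, which is the kind of degradation that motivates comparing this baseline with other detectors in noisy conditions.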
Table of Contents
Chapter 1 Introduction
1.1 Research Motivation and Objectives
1.2 Overview of the Speech Recognition System in This Thesis
1.3 Thesis Organization
Chapter 2 Word Boundary Detection
2.1 Introduction to Word Boundary Detection
2.2 Word Boundary Detection Without Background Noise
2.2.1 Energy Parameter
2.2.2 Zero-Crossing Rate
2.2.3 Examples of Applying Energy and Zero-Crossing Rate
2.3 Word Boundary Detection in Noisy Environments
2.3.1 RTF Parameter
2.3.2 Introduction to the Elman Neural Network
Chapter 3 Feature Extraction
3.1 Introduction to Sound
3.2 Speech Feature Extraction
3.3 MFCC
3.3.1 Signal Preprocessing (Pre-emphasis)
3.3.2 Windowing
3.3.3 Spectral Analysis
3.3.4 Filter Bank Processing
3.3.5 Log Energy Computation
3.3.6 Discrete Cosine Transform
3.3.7 Energy and Delta Coefficients
Chapter 4 Speech Recognition Models and Markov Model Theory
4.1 Introduction
4.2 Building the Hidden Markov Model
4.3 Probability Computation
4.3.1 The Forward Procedure
4.3.2 The Backward Procedure
4.3.3 Forward and Backward Recursion
4.4 Viterbi Algorithm
4.5 Parameter Reestimation
4.6 Building the Speech Models
Chapter 5 Experimental Results and Comparison
5.1 System Overview
5.1.1 Recognition of Home Appliance Control Commands
5.1.2 Human-Machine Interface Window Program Design
5.1.3 Word Boundary Detection for Continuous Chinese Speech
5.1.4 Training, Testing, and Validation of the Elman Neural Network
5.1.5 Experimental Results and Comparison
5.2 Training the Hidden Markov Model Parameters
5.2.1 Speech Recognition Procedure
5.2.2 Discussion of Recognition Rates in a Noise-Free Environment
5.2.3 Discussion of Recognition Rates in Noisy Environments
Chapter 6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
References