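Graduate student: 黃浩銘
Thesis title: 語音及其小波重建訊號的特徵參數散佈之觀察
Thesis title (English): The Observation of the Feature Parameter Distribution of the Speech Utterance Signals and the Wavelet Reconstructed Sub-signals
Advisor: 蕭肇殷
Oral defense committee: 汪島軍, 林忠逸
Oral defense date: 2014-06-23
Degree: Master's
Institution: Feng Chia University (逢甲大學)
Department: Department of Mechanical and Computer-Aided Engineering
Discipline: Engineering
Field: Mechanical Engineering
Thesis type: Academic thesis
Publication year: 2013
Graduation academic year: 102 (ROC calendar)
Language: Chinese
Pages: 121
Chinese keywords: 小波轉換, 特徵參數, 語音識別
English keywords: Wavelet transform, Feature parameters, Speech recognition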
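Abstract (translated from the Chinese abstract): In this study, speech signals are processed by wavelet transform and data compression and organized into thirteen feature types. These are used to observe how the thirteen groups of feature vectors are distributed in a high-dimensional space and to assess their discriminative power for speech recognition. Specific utterances from specific speakers are recorded, and each speech signal is pre-processed. A four-level wavelet transform then decomposes each signal and reconstructs it into four approximation sub-signals and four detail sub-signals. These eight sub-signals, together with the original speech signal, are compressed by linear predictive coding (LPC) into nine groups of feature vectors. The feature vector of each approximation sub-signal is then subtracted from the feature vector of the original speech signal, which adds four difference groups and brings the total to thirteen feature types. For a given speaker and utterance, the distribution of each feature type is treated as a cluster scattered in a high-dimensional space. For each feature type, the mean and the covariance matrix are computed, and the eigenvalues and eigenvectors of the covariance matrix are used to construct a hyperellipsoid. The center distances and the standard-deviation distances between the hyperellipsoids are then used to assess how the clusters of the thirteen feature types are distributed.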
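The feature construction described above (nine LPC feature vectors plus four differences) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the wavelet decomposition and reconstruction are assumed to have been done already, so the function takes the four approximation and four detail sub-signals as inputs; the LPC order of 10 is an assumed value; one LPC vector is computed per whole signal rather than per frame; and the names `lpc` and `thirteen_feature_vectors` are hypothetical.

```python
import numpy as np

def lpc(signal, order=10):
    """LPC coefficients a[1..order] via the autocorrelation method
    (Levinson-Durbin recursion). The order is an assumed value."""
    x = np.asarray(signal, dtype=float)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]  # prediction error energy at order 0
    for i in range(1, order + 1):
        # Reflection coefficient for stage i
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1:0:-1]
        a[i] = k
        err *= (1.0 - k * k)
    return a[1:]

def thirteen_feature_vectors(original, approximations, details, order=10):
    """Stack 13 feature vectors for one utterance (hypothetical helper).

    original:       1-D array, the pre-processed speech signal
    approximations: list of four reconstructed approximation sub-signals
    details:        list of four reconstructed detail sub-signals
    Row order: original, 4 approximations, 4 details, 4 differences.
    """
    f_orig = lpc(original, order)
    f_app = [lpc(s, order) for s in approximations]
    f_det = [lpc(s, order) for s in details]
    # Original minus each approximation gives the four extra feature types
    f_diff = [f_orig - fa for fa in f_app]
    return np.vstack([f_orig] + f_app + f_det + f_diff)  # shape (13, order)
```

Collecting the rows returned for many repetitions of the same utterance gives, for each of the thirteen feature types, one cluster of points in an `order`-dimensional space.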
Abstract (English): In this study, speech signals are transformed into thirteen feature types by wavelet transform and data compression. These are used to observe the distribution of the feature vectors in a high-dimensional space and to assess their discriminative power for speech recognition. After pre-processing the recorded speaker-dependent speech signals, we apply a level-four Db4 wavelet transform to decompose each original speech signal into eight reconstructed sub-signals, then compress the original signal and the eight sub-signals with linear predictive coding (LPC) to form nine groups of feature vectors. We then subtract the feature vectors of the approximation sub-signals from those of the original speech signal, which brings the total to 13 groups of feature vectors. These are used to observe the distribution of the clusters of feature vectors. We regroup the speaker-dependent speech words into several sets. For each set, we transform the speech signals and the sub-signals into 13 feature vectors and finally regroup those into 13 subsets, treating each subset as a cluster scattered in the high-dimensional pattern space. For each cluster, we calculate the mean and the covariance matrix and use the eigenvalues and eigenvectors of the covariance matrix to construct a hyperellipsoid. We then calculate the center distances and the contour distances between these hyperellipsoids and use them to assess the distribution of the clusters of feature vectors.
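The cluster assessment step can be illustrated with a short sketch. It fits a one-standard-deviation hyperellipsoid to each cluster from the mean and the eigen-decomposition of the covariance matrix, then computes the distance between centers. The abstract does not define the contour distance, so the version here is an assumption: the center distance minus each ellipsoid's radius along the line joining the two centers. The function names are hypothetical, and each cluster is assumed to have more samples than dimensions so that its covariance matrix is full rank.

```python
import numpy as np

def fit_hyperellipsoid(points):
    """points: (n_samples, n_dims). Returns mean, semi-axis lengths, axis directions."""
    pts = np.asarray(points, dtype=float)
    mean = pts.mean(axis=0)
    cov = np.cov(pts, rowvar=False)
    # eigh is for symmetric matrices: real eigenvalues, orthonormal eigenvectors
    eigvals, eigvecs = np.linalg.eigh(cov)
    # One standard deviation along each principal axis
    semi_axes = np.sqrt(np.clip(eigvals, 0.0, None))
    return mean, semi_axes, eigvecs

def radius_along(semi_axes, eigvecs, direction):
    """Distance from the ellipsoid center to its surface along `direction`."""
    u = direction / np.linalg.norm(direction)
    coords = eigvecs.T @ u  # unit direction expressed in the principal axes
    return 1.0 / np.sqrt(np.sum((coords / semi_axes) ** 2))

def cluster_distances(points_a, points_b):
    """Center distance and an ASSUMED contour distance between two clusters."""
    mean_a, ax_a, vec_a = fit_hyperellipsoid(points_a)
    mean_b, ax_b, vec_b = fit_hyperellipsoid(points_b)
    delta = mean_b - mean_a
    center = np.linalg.norm(delta)
    # Assumption: gap between the two surfaces along the center-to-center line
    contour = (center
               - radius_along(ax_a, vec_a, delta)
               - radius_along(ax_b, vec_b, delta))
    return center, contour
```

Under this assumed definition, a negative contour distance means the two one-standard-deviation ellipsoids overlap along the line between their centers, which would suggest that the two feature clusters are hard to separate.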
Acknowledgments
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Nomenclature
Chapter 1 Introduction
1.1 Preface
1.2 Literature Review
1.3 Research Motivation
1.4 Thesis Organization
Chapter 2 Signal Pre-processing Flowchart
2.1 Speech Signal Pre-processing
2.3 Time-Scale Adjustment
2.4 Wavelet Transform
2.5 Amplitude
2.6 Frames
2.7 Linear Predictive Coding
Chapter 3 Cluster Distribution
3.1 Stacked Covariance Matrix
3.2 Cluster Distribution and Hyperellipsoids
3.2-1 Cluster Distribution
3.2-2 Distances Between Clusters
Chapter 4 Experimental Results and Discussion
4.1 Experimental Method and Equipment
4.2 Cluster Distribution
4.2-1 Cluster Distribution for Speaker 1
4.2-2 Cluster Distribution for Speaker 2
4.2-3 Cluster Distribution for Speaker 3
4.2-4 Cluster Distribution for Speaker 4
4.2-5 Cluster Distribution for Speaker 5
4.2-6 Cluster Distribution for Speaker 6
4.3 Experimental Results and Discussion
Chapter 5 Conclusion
References