National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

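Author: Tai-Lung Wang (王泰隆)
Thesis title: A Study on Pitch Detection of Speech Signals (語音基頻偵測之研究)
Advisor: Ching-Huang Wei (魏清煌)
Degree: Master's
Institution: National Kaohsiung First University of Science and Technology (國立高雄第一科技大學)
Department: Institute of Computer and Communication Engineering (電腦與通訊工程所)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2002
Academic year of graduation: 90
Language: English
Number of pages: 73
Keywords (Chinese): 基頻偵測; 小波轉換
Keywords (English): Pitch Detection; Wavelet Transform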
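The main purpose of this thesis is to study pitch detection methods for speech. Pitch is an important feature of voiced speech, and many speech processing applications, such as speech coding and decoding, speaker recognition, pronunciation correction, and the diagnosis of disorders of the vocal organs, require accurate pitch features. Although many pitch detection methods have been proposed over the past several decades, their accuracy and robustness to noise still need improvement. Recent pitch detection techniques use the properties of the wavelet transform to extract the pitch of speech signals, and this is the main research direction of this thesis.

First, we simulate the wavelet-based pitch detection method proposed by Chen and Wang, and then propose an improved solution, called method I, for voiced speech whose pitch that method cannot determine correctly. In our study we found that, for some speech, using the negative portion of the signal yields the pitch more accurately than the conventionally used positive portion. In addition, we improve the method proposed by Shelby et al.; the improved version is called method II.

To evaluate the accuracy and robustness of the proposed methods, noise at different signal-to-noise ratios was added to the test speech data, and the detection error rates obtained under each condition were compared with those of the original methods. The experimental data show that the pitch detection error rates of method I and method II are lower than those of Chen and Wang's method and Shelby's method. The results confirm that our improvements to the original algorithms do help to detect the pitch of speech signals correctly.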
The objective of this thesis is to study pitch detection methods for speech signals. Pitch is a very important feature of voiced speech and is widely used in speech processing applications such as speech coding, speaker recognition, phonetics, and voice disease diagnostics. Although many pitch detection methods have been proposed, their accuracy and robustness against noise still need further improvement. Recently, the wavelet transform has been widely applied to pitch detection of speech signals, so wavelet-based pitch detection is our research direction.

First, we simulate Chen and Wang's wavelet-based pitch detection method. We find that, for some voiced speech, detecting the pitch from the negative portion of the signal is more accurate than detecting it from the positive portion. In addition, we simulate the method proposed by Shelby et al., which is an improved version of the first wavelet-based method, proposed by Kadambe. In this thesis, the method modified from Chen and Wang's method is called method I, and the method improved from Shelby's method is called method II.

To evaluate the accuracy and noise robustness of the pitch detection methods, we add white Gaussian noise to our test data at different signal-to-noise ratios and compare the pitch detection error rates with those of the original works. The experimental results show that the error rates of method I and method II are lower than those of Chen and Wang's method and Shelby's method, respectively. This verifies that our modifications to the original methods are beneficial for accurate pitch detection of speech signals.
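The noise-robustness evaluation described above relies on adding white Gaussian noise at a chosen signal-to-noise ratio. The following is a minimal sketch of one common way to do that, not the procedure used in the thesis, which this record does not describe. The function names, the 100 Hz test tone, the 8 kHz sampling rate, and the 10 dB target are all illustrative assumptions.

```python
import math
import random


def add_white_gaussian_noise(signal, snr_db, rng=None):
    """Return a noisy copy of `signal` with roughly the requested SNR in dB.

    Assumption: SNR is defined as average signal power over average noise
    power across the whole utterance; the thesis may define it differently.
    """
    rng = rng or random.Random()
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)  # standard deviation of the zero-mean noise
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR actually achieved, given the clean and noisy signals."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical stand-in for voiced speech: a 100 Hz tone sampled at 8 kHz.
fs = 8000
clean = [math.sin(2 * math.pi * 100 * n / fs) for n in range(fs)]
noisy = add_white_gaussian_noise(clean, snr_db=10.0, rng=random.Random(0))
```

Because the noise is random, the achieved SNR only approximates the target; `measured_snr_db(clean, noisy)` can be used to check how close a given realization is before running a pitch detector on it.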
Abstract (in Chinese)
Abstract (in English)
Acknowledgment (in Chinese)
Contents
List of Abbreviations
List of Figures
List of Tables

Chapter 1 Introduction
1.1 Motivation
1.2 Related Researches
1.3 Thesis Organizations

Chapter 2 Concepts of Speech Processing
2.1 Introduction
2.2 Model of Speech Production
2.3 Properties of Speech Signals
2.4 Short-time Analysis of Speech Signals

Chapter 3 Review of Wavelet Theory
3.1 Introduction
3.2 Properties of Wavelet Analysis
3.3 Introduction to Wavelet Transform
3.3.1 Continuous Wavelet Transform
3.3.2 Discrete Wavelet Transform
3.4 Multi-Resolution Analysis

Chapter 4 Pitch Detection Algorithms
4.1 Introduction
4.2 The Wavelet-based Methods
4.2.1 The Chen's Method
4.2.2 The Shelby's Method
4.3 Our Proposed Methods
4.3.1 Our Method I
4.3.2 Our Method II
4.4 Survey of Some Methods
4.4.1 Auto-correlation Method
4.4.2 AMDF Method
4.4.3 Cepstrum Method

Chapter 5 Experimental Results and Discussions
5.1 Introduction
5.2 Experimental Results
5.2.1 Experiments for Chen's Method
5.2.2 Experiments for Our Method I
5.2.3 Experiments for Shelby's Method
5.2.4 Experiments for Our Method II
5.3 Comparisons and Discussions

Chapter 6 Conclusions and Future Studies
6.1 Conclusions
6.2 Future Studies
References
Vita
[1] S. Kadambe and G. F. Boudreaux-Bartels, “Application of the wavelet transform for pitch detection of speech signals,” IEEE Trans. Information Theory, vol. 38, no. 2, pp. 917-924, Mar. 1992.
[2] G. A. Shelby, C. M. Copper, and R. Adhami, “A wavelet based speech pitch detector for tone languages,” in Proc. IEEE International Symposium on Time-Frequency and Time-Scale Analysis, Oct. 1994, pp. 596-599.
[3] S. H. Chen and J. F. Wang, “A pyramid-structured wavelet algorithm for detecting pitch period of speech signal,” 1998 International Computer Symposium, Dec. 1998, pp. 50-56.
[4] A. M. Noll, “Cepstrum pitch determination,” J. Acoust. Soc. Amer., vol. 41, pp. 293-309, Feb. 1967.
[5] L. R. Rabiner, “On the use of autocorrelation analysis for pitch detection,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 25, no. 1, pp. 24-33, Feb. 1977.
[6] M. J. Ross, H. L. Shaffer, A. Cohen, R. Freudberg, and H. J. Manley, “Average magnitude difference function pitch extractor,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 22, no. 5, pp. 353-362, Oct. 1974.
[7] L. R. Rabiner, M. J. Cheng, A. E. Rosenberg, and C. A. McGonegal, “A comparative performance study of several pitch detection algorithms,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 24, no. 5, pp. 399-417, Oct. 1976.
[8] T. V. Ananthapadmanabha and B. Yegnanarayana, “Epoch extraction from linear prediction residual for identification of closed glottis interval,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 27, no. 4, pp. 309-319, Aug. 1979.
[9] Y. M. Cheng and D. O’Shaughnessy, “Automatic and reliable estimation of glottal source instant and period,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, no. 12, pp. 1805-1815, Dec. 1989.
[10] L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals. Englewood Cliffs, NJ: Prentice-Hall, 1978.
[11] J. R. Deller, J. G. Proakis, and J. H. L. Hansen, Discrete-time Processing of Speech Signals. New York: Maxwell Macmillan, 1993.
[12] S. Mallat, “A theory for multiresolution signal decomposition: the wavelet representation,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 674-693, July 1989.
[13] G. Strang and T. Nguyen, Wavelets and Filter Banks. Wellesley, Massachusetts: Wellesley-Cambridge Press, 1996.
[14] S. Mallat, A Wavelet Tour of Signal Processing. Academic Press, 1999.
[15] M. Misiti, Y. Misiti, G. Oppenheim, and J. M. Poggi, Wavelet Toolbox User's Guides. The MathWorks, Inc., 1997.