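Graduate student: 羅安然 (An-Jan Lo)
Thesis title: The Source Excitation Extraction of Speech Signal Using ZZT Method (ZZT演算法之應用於語音激發信號擷取)
Advisor: 貝蘇章 (Soo-Chang Pei)
Oral defense committee: 李琳山 (Lin-Shan Li), 丁建均 (Jian-Jiun Ding)
Oral defense date: 2014-07-25
Degree: Master's
Institution: National Taiwan University (國立臺灣大學)
Department: Graduate Institute of Communication Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2014
Graduation academic year: 102 (ROC calendar)
Language: English
Number of pages: 62
Keywords: ZZT, source-tract separation, group delay function, formant extraction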
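Chinese abstract (translated): Speech processing occupies an important place in signal processing research. Human speech production can be modeled as the convolution of two parts: the excitation signal emitted by the glottis and the filter formed by the physical structure of the oral cavity. Studying the details of this process and obtaining speech feature parameters has applications in many areas, such as speech synthesis and conversion. In this thesis, we discuss the zeros of the z-transform (ZZT) algorithm and its application to extracting the speech excitation signal. After the z-transform, the zeros of a speech signal reveal its mixed-phase property on the z-plane (zeros lie both inside and outside the circle of radius 1), which can be used to separate the speech excitation signal from the vocal tract filter. In addition, based on study of the ZZT plot, the phase information obtained from the group delay function can be greatly improved by the chirp group delay method, so as to obtain the formants of the vocal tract filter. The results are compared with existing speech processing tools, and the improvement that the Attenuated Main Excitation (AME) method brings to formant extraction is tested.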

Speech processing is one of the major topics in signal processing. Speech production can be modeled as the convolution of a glottal source excitation with a vocal tract filter. Studying the details of this process and extracting its characteristic parameters has applications in fields such as speech synthesis and voice transformation. In this thesis, we discuss the zeros of the z-transform (ZZT) algorithm developed by Dr. Baris Bozkurt [1] and its application to extracting the excitation pulse in the source-tract model of human speech signals. When the zeros of the z-transform of a speech signal are plotted on the z-plane, the mixed-phase property of the signal is revealed (zeros lie both inside and outside the unit circle), and this property is used for source-tract separation. In addition, study of the ZZT plot shows that the phase information obtained from the group delay spectrum can be substantially improved by using the chirp group delay. We further examine formant tracking with ZZT, compare its performance with that of other speech signal processing tools, and apply Attenuated Main Excitation (AME) for further improvement.
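The two ideas named in the abstract, splitting the zeros of the z-transform by the unit circle and evaluating the group delay on a circle of radius other than 1, can be sketched as follows. This is a minimal illustration assuming NumPy, not the thesis's implementation: the function names (`zzt`, `split_by_unit_circle`, `chirp_group_delay`), the parameter `rho`, and the toy zero locations are all hypothetical, and the windowing and glottal-closure-instant synchronization covered in Chapter 2 are omitted.

```python
import numpy as np


def zzt(frame):
    """Zeros of the z-transform of a finite frame x[0..N-1].

    X(z) = sum_n x[n] z^(-n) has the same nonzero roots as the polynomial
    x[0] z^(N-1) + ... + x[N-1], which is the coefficient order np.roots uses.
    """
    return np.roots(np.asarray(frame, dtype=float))


def split_by_unit_circle(zeros):
    """Return (zeros with |z| > 1, zeros with |z| <= 1)."""
    radius = np.abs(zeros)
    return zeros[radius > 1.0], zeros[radius <= 1.0]


def chirp_group_delay(frame, rho, n_fft=512):
    """Group delay (in samples) evaluated on the circle |z| = rho.

    X(rho * e^{jw}) is the DFT of x[n] * rho^(-n), and the group delay of
    that sequence is Re{ DFT(n * x~[n]) / DFT(x~[n]) }.
    """
    x = np.asarray(frame, dtype=float)
    n = np.arange(len(x), dtype=float)
    x_tilde = x * float(rho) ** (-n)
    spectrum = np.fft.rfft(x_tilde, n_fft)
    weighted = np.fft.rfft(n * x_tilde, n_fft)
    return np.real(weighted / spectrum)


# Toy mixed-phase frame built from known zeros (illustrative values only).
inside = [0.5, 0.8 * np.exp(1j * np.pi / 4), 0.8 * np.exp(-1j * np.pi / 4)]
outside = [1.25 * np.exp(1j * np.pi / 3), 1.25 * np.exp(-1j * np.pi / 3)]
frame = np.real(np.poly(inside + outside))

outside_zeros, inside_zeros = split_by_unit_circle(zzt(frame))

# A pure 3-sample delay has a constant group delay of 3 on any circle.
delay = np.zeros(8)
delay[3] = 1.0
cgd = chirp_group_delay(delay, rho=1.1, n_fft=64)
```

In the source-tract setting, the zeros outside the unit circle are associated with the anticausal (maximum-phase) part of the signal and the remaining zeros with the causal part. For real speech frames with hundreds of samples, root finding becomes costly and numerically delicate, so the toy frame here is kept deliberately short.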

Chinese Abstract

Abstract

List of Figures

List of Tables

Chapter 1 Introduction
1.1 Motivation
1.2 Background
1.3 LF model of speech voice
1.4 ZZT representation of speech signals

Chapter 2 ZZT Algorithm
2.1 Definition
2.2 ZZT of Glottal Signal
2.3 Windowing effects on ZZT
2.4 Glottal Closure Instant (GCI) Detection
2.5 Conclusion

Chapter 3 The Source Excitation Extraction using ZZT
3.1 Introduction
3.2 ZZT Decomposition
3.3 Complex Cepstrum
3.4 Test with noise
3.5 Chirp decomposition
3.6 Conclusion

Chapter 4 Applications of ZZT: Chirp Group Delay and Formant Tracking
4.1 Definition
4.2 Application in formant tracking
4.2.1 Spectrogram
4.2.2 Hilbert-Huang Transform
4.2.3 Chirp Group Delay of Zero-Phase Version Signal (CGDZP)
4.2.4 Disturbance of High-Pitched Frequency to Formant Tracking
4.2.5 Advanced Test for CGDZP and Praat
4.2.6 The Effect of Attenuated Main Excitation (AME) on CGDZP
4.3 Conclusion

Chapter 5 Conclusion and Future Works

References


[1] B. Bozkurt, “Zeros of the z-transform (ZZT) representation and chirp group delay processing for the analysis of source and filter characteristics of speech signals”, PhD thesis supervised by Prof. T. Dutoit, 2005.
[2] G. Fant, J. Liljencrants, and Q. Lin, “A four-parameter model of glottal flow”, STL-QPSR, vol. 26, no. 4, pp. 1-13, 1985.
[3] T. Drugman, T. Dutoit, “Glottal Closure and Opening Instant Detection from Speech Signals”, Interspeech, 2009.
[4] P. Birkholz, Vocal Tract Lab, http://www.vocaltractlab.de/index.php?page=birkholz-contact
[5] T. Drugman, B. Bozkurt, T. Dutoit, “Complex Cepstrum-based Decomposition of Speech for Glottal Source Estimation”, Interspeech, 2009.
[6] T. Drugman, B. Bozkurt, T. Dutoit, “Causal–anticausal decomposition of speech using complex cepstrum for glottal source estimation”, Speech Communication, 2011.
[7] B. Yegnanarayana, “Design of recursive group-delay filters by autoregressive modeling”, IEEE Transactions on Acoustics, Speech, and Signal Processing, 1982.
[8] P. Alku and J. Pohjalainen, “Formant frequency estimation of high-pitched vowels using weighted linear prediction”, Interspeech, 2012.
[9] G. Degottex, “Glottal source and vocal-tract separation: estimation of glottal parameters, voice transformation and synthesis using a glottal model”, UPMC, 2010.
[10] B. Yegnanarayana, D. K. Saikia, and T. R. Krishnan, “Significance of Group Delay Functions in Signal Reconstruction from Spectral Magnitude or Phase”, IEEE Transactions on Acoustics, Speech, and Signal Processing, 1984.
[11] T. Drugman, B. Bozkurt, T. Dutoit, “Chirp Decomposition of Speech Signals for Glottal Source Estimation”, 2009.
[12] T. Drugman, B. Bozkurt, T. Dutoit, “A Comparative Study of Glottal Source Estimation Techniques”, Computer Speech & Language, 2012.

