National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

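Graduate student: Ren-Jie Huang (黃仁杰)
Thesis title: A STUDY ON SPEECH SIGNAL PROCESSING USING WAVELET TRANSFORMS (小波轉換應用於語音訊號處理之研究)
Advisor: Ching-Kuen Lee (李清坤)
Degree: Master's
Institution: Tatung University (大同大學)
Department: Graduate Institute of Communication Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2006
Academic year of graduation: 94 (ROC calendar)
Language: English
Number of pages: 77
Keywords (Chinese): 小波, 語音
Keywords (English): wavelet, speech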
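The wavelet transform has been one of the most popular research topics in recent years, and wavelet theory provides a unified framework for many different signal processing applications. Wavelet transforms are now widely used in communication systems, signal processing, image processing, and audio processing. Because the wavelet transform offers excellent time-domain and frequency-domain analysis together with multiresolution properties, it is well suited to highly time-varying speech signals. This thesis studies applications of the wavelet transform to speech signal processing in three parts: voice activity detection, consonant/vowel segmentation, and pitch period estimation.
First, for voice activity detection, a short-time (frame-based) decision method is developed on the basis of the wavelet transform. Experimental results show that the proposed method maintains a very high speech detection rate even in a high-noise environment (SNR = 0 dB) and outperforms the voice activity detector of the GSM Enhanced Full Rate system. Second, for consonant/vowel segmentation, the thesis builds on the algorithm of Chen and Wang, which combines the wavelet transform with a product function, and adds a new algorithm for deciding the segmentation point. Compared with the conventional method, the proposed decision algorithm improves segmentation accuracy and also yields accurate segmentation points in noisy environments. Finally, for pitch period estimation, to strengthen the noise robustness of conventional algorithms, the thesis combines the wavelet transform with the circular average magnitude difference function (CAMDF). Experimental results show that the proposed algorithm performs well in both clean and noisy environments.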
The wavelet transform is one of the most exciting developments of the last decade. Wavelet theory provides a unified framework for a number of techniques that had been developed independently for various signal processing applications. Because the wavelet representation offers efficient time-frequency localization and multi-resolution analysis, wavelet transforms are well suited to processing non-stationary signals such as speech. Based on the wavelet framework, this thesis develops three wavelet-based speech signal processing algorithms: voice activity detection (VAD), consonant/vowel (C/V) segmentation, and pitch detection.
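As a rough illustration of multi-resolution analysis, the sketch below repeatedly splits a signal into a coarse approximation band and a detail band and reports the energy in each band. It assumes the orthonormal Haar wavelet purely for simplicity; the abstract does not state which wavelet the thesis uses, and the function names `haar_step`, `haar_decompose`, and `band_energies` are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    # Pairwise sums give the low-pass (approximation) band,
    # pairwise differences give the high-pass (detail) band.
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Multi-level decomposition: keep splitting the approximation band."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def band_energies(signal, levels):
    """Energy per band: detail bands from finest to coarsest, then the approximation."""
    approx, details = haar_decompose(signal, levels)
    energies = [sum(c * c for c in band) for band in details]
    energies.append(sum(c * c for c in approx))
    return energies


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    print(band_energies(x, 3))
```

Because the transform is orthonormal, the band energies sum to the energy of the input signal, which is what makes per-band energy a convenient feature for frame-level decisions.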
The first part is a wavelet-based voice activity detection algorithm that operates frame by frame. Experimental results show that the proposed VAD algorithm outperforms the VAD of the GSM Enhanced Full Rate (EFR) system and operates reliably in noisy environments (SNR = 0 dB). The second part uses the wavelet transform and an energy profile to locate the C/V segmentation point without requiring any predetermined threshold. The results show that the C/V segmentation point can be located accurately with low computational complexity. Finally, drawing on the properties of the wavelet transform and the circular average magnitude difference function (CAMDF), a new pitch detection algorithm is proposed. Simulation results show that the new method detects the pitch period accurately at an SNR of 0 dB, where other methods fail.
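For readers unfamiliar with the CAMDF, the sketch below shows only the standard circular average magnitude difference function and a simple minimum search over a lag range. It is not the thesis's algorithm: the wavelet front end is omitted, and the function names, the lag bounds, and the synthetic test frame are assumptions made for illustration.

```python
import math


def camdf(frame):
    """Circular AMDF: D(lag) = sum over n of |x[(n + lag) mod N] - x[n]|."""
    n = len(frame)
    return [
        sum(abs(frame[(i + lag) % n] - frame[i]) for i in range(n))
        for lag in range(n)
    ]


def estimate_pitch_period(frame, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] where the CAMDF is smallest."""
    if not 0 < min_lag <= max_lag < len(frame):
        raise ValueError("lag range must lie inside the frame")
    d = camdf(frame)
    return min(range(min_lag, max_lag + 1), key=lambda lag: d[lag])


if __name__ == "__main__":
    # Hypothetical test frame: a pure sinusoid with a period of 40 samples.
    period = 40
    frame = [math.sin(2.0 * math.pi * i / period) for i in range(4 * period)]
    print(estimate_pitch_period(frame, 20, 60))
```

Unlike the plain AMDF, the circular version wraps the index around the frame, so every lag is evaluated over the same number of samples and the function does not decay toward large lags.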
ABSTRACT IN CHINESE I
ABSTRACT IN ENGLISH II
ACKNOWLEDGEMENTS III
CONTENTS IV
LIST OF FIGURES VI
LIST OF TABLES VIII

CHAPTER 1 INTRODUCTION 1
1.1 Introduction 1
1.2 Motivation 1
1.3 Thesis Focus 2

CHAPTER 2 REVIEW OF WAVELET TRANSFORM 4
2.1 History of Wavelets 4
2.2 Properties of Wavelet Analysis 5
2.3 Theory of Wavelets 8
2.3.1 Continuous Wavelet Transform 8
2.3.2 Discrete Wavelet Transform 11
2.3.3 Multi-Resolution Analysis 12

CHAPTER 3 VOICE ACTIVITY DETECTION USING WAVELET PACKET TRANSFORM 18
3.1 Introduction 18
3.2 VAD for Enhanced Full Rate (EFR) Speech Traffic Channels in GSM System 20
3.2.1 Overview and Principles of Operation 20
3.2.2 Block Diagram Description 21
3.2.3 The Experiments and Result of GSM EFR VAD 22
3.3 The proposed VAD algorithm 24
3.3.1 Wavelet Packet Transform 24
3.3.2 The Structure of Proposed VAD Algorithm 27
3.4 Experimental Results 29
3.5 Summary 30

CHAPTER 4 APPLICATION OF WAVELET TRANSFORMS ON C/V SEGMENTATION FOR MANDARIN SPEECH SIGNAL 31
4.1 Introduction 32
4.2 Implementation of the Wavelet-Based C/V Segmentation Algorithm 35
4.3 The Modified C/V Segmentation Algorithm 40
4.4 Experimental Results 44
4.5 Summary 48

CHAPTER 5 A NOISE-ROBUST PITCH DETECTION METHOD USING WAVELET TRANSFORM 49
5.1 Introduction 50
5.2 The Pitch Detection Methods 51
5.2.1 The Autocorrelation Method 51
5.2.2 The Average Magnitude Difference Function 53
5.2.3 The Circular Average Magnitude Difference Function 56
5.2.4 The Proposed Pitch Detection Method 58
5.3 Experimental Results and Performance Evaluation 60
5.4 Summary 62

CHAPTER 6 CONCLUSIONS 63

REFERENCES 65
[1] M. Misiti, Y. Misiti, G. Oppenheim, and J. M. Poggi, Wavelet Toolbox User's Guide, The MathWorks, Inc., 1997.
[2] L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals, Englewood Cliffs, NJ: Prentice-Hall, 1978.
[3] A. Grossmann and J. Morlet, “Decomposition of Hardy Functions into Square Integrable Wavelets of Constant Shape,” SIAM Journal on Mathematical Analysis, vol. 15, no. 4, pp. 723-736, Jul. 1984.
[4] S. Mallat, “A Theory for Multiresolution Signal Decomposition: The Wavelet Representation,” IEEE Trans., Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 674-693, Jul. 1989.
[5] I. Daubechies, “Orthonormal bases of compactly supported wavelets,” Comm., Pure and Applied Math, vol. 41, no. 7, pp. 909-996, 1988.
[6] S. H. Chen, H. T. Wu, C. H. Chen, J. C. Ruan, and T. K. Truong, “Robust voice activity detection algorithm based on the perceptual wavelet packet transform,” Proc., ISPACS, Dec. 2005, pp. 45-48.
[7] GSM “Voice Activity Detector (VAD) for Enhanced Full Rate (EFR) speech traffic channels,” 3GPP TS 46.082 v6.0.0, Dec. 2004.
[8] J. C. Junqua, B. Reaves, and B. Mak, “A study of endpoint detection algorithms in adverse conditions: Incidence on a DTW and HMM recognizer,” Proc., Eurospeech, 1991, pp. 1371-1374.
[9] J. A. Haigh and J. S. Mason, “Robust voice activity detection using cepstral features,” Proc., IEEE TENCON, 1993, pp. 321-324.
[10] R. Tucker, “Voice activity detection using a periodicity measure,” IEE Trans., Communications, Speech and Vision, vol. 139, no. 4, pp. 377-380, Aug. 1992.
[11] D. K. Freeman, G. Cosier, C. B. Southcott, and I. Boyd, “The voice activity detector for the pan European digital cellular mobile telephone service,” Proc., ICASSP, May 1989, pp. 369-372.
[12] B. Yegnanarayana and R. N. J. Veldhuis, “Extraction of vocal-tract system characteristics from speech signals,” IEEE Trans., Speech and Audio Processing, vol. 6, no. 4, pp. 313-327, July 1998.
[13] M. Jiang, B. Yuan, and B. Lin, “The consonant/vowel (C/V) speech classification using high-rank function neural network (HRFNN),” Proc., 3rd International Conference on Signal Processing, 1996, pp. 1469-1472.
[14] J. F. Wang, S. H. Chen and J. S. Shyuu, “Wavelet Transforms for Speech Signal Processing,” Journal of The Chinese Institute of Engineers, vol. 22, no. 5, pp. 549-560, Sept. 1999.
[15] J. F. Wang and S. H. Chen, “A C/V Segmentation Algorithm for Mandarin Speech Signal Based on Wavelet Transforms,” Proc., ICASSP, March 1999, vol. 1, pp. 417-420.
[16] J. F. Wang, C. H. Wu, S. H. Chang, and J. Y. Lee, “A Hierarchical Neural Network Model Based on a C/V Segmentation Algorithm for Isolated Mandarin Speech Recognition,” IEEE Trans., Signal Processing, vol. 39, no. 9, pp. 2141-2146, Sept. 1991.
[17] S. W. K. Fu, C. H. Lee, and O. L. Clubb, “A Robust C/V Segmentation Algorithm for Cantonese,” Proc., IEEE TENCON, Nov. 1996, vol. 1, pp.42-45.
[18] L. S. Lee, C. Y. Tseng, and H. Y. Gu, “Golden Mandarin - A Real-Time Mandarin Speech Dictation Machine for Chinese Language with Very Large Vocabulary,” IEEE Trans., Speech and Audio Processing, vol. 1, no. 2, pp. 158-179, April 1993.
[19] L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals, Englewood Cliffs, NJ: Prentice-Hall, 1978.
[20] S. Kadambe and G. F. Boudreaux-Bartels, “Application of the wavelet transform for pitch detection of speech signals,” IEEE Trans., Information Theory, vol. 38, no. 2, pp. 917-924, Mar. 1992.
[21] G. A. Shelby, C. M. Copper, and R. Adhami, “A wavelet based speech pitch detector for tone languages,” Proc., IEEE International Symposium on Time-Frequency and Time-Scale Analysis, Oct. 1994, pp. 596-599.
[22] S. H. Chen and J. F. Wang, “A pyramid-structured wavelet algorithm for detecting pitch period of speech signal,” Proc., International Computer Symposium, Dec. 1998, pp. 50-56.
[23] S. Ahmadi and A. S. Spanias, “Cepstrum-based pitch detection using a new statistical V/UV classification algorithm,” IEEE Trans., Speech and Audio Processing, vol. 7, no. 3, pp. 333-338, May 1999.
[24] G. S. Ying, L. H. Jamieson, and C. D. Michell, “A probabilistic approach to AMDF pitch detection,” IEEE Trans., Digital Object Identifier, vol. 2, pp. 1201-1204, Oct. 1996.
[25] G. Xu and L. R. Tang, “Speech pitch period estimation using circular AMDF,” IEEE Trans., Personal, Indoor and Mobile Radio Communications, vol. 3, pp. 2452-2455, 2003.