National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

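Author: 陳威廷
Author (English): Wei-Ting Chen
Thesis title: 多層卡爾曼濾波器於雙麥克風語音降噪之應用
Thesis title (English): Dual-Microphone Applications of Multi-Layer Kalman Filter in Speech Enhancement
Advisor: 陳永裕
Advisor (English): Yung-Yu Chen
Degree: Master's
Institution: National Cheng Kung University
Department: Department of Systems and Naval Mechatronic Engineering
Discipline: Engineering
Field: Mechanical Engineering
Thesis type: Academic thesis
Year of publication: 2016
Graduation academic year: 104
Language: English
Pages: 154
Keywords (translated from the Chinese): Kalman filter; adaptive filter; speech noise reduction; dual-microphone noise reduction
Keywords (English): Kalman filters; adaptive filter; background noise cancellation; dual-microphone noise reduction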
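Abstract (translated from the Chinese):
Digital speech signals are now widely used in telecommunications, video conferencing, artificial intelligence systems, and many other fields. In real environments, however, a speech signal is inevitably disturbed by surrounding sound sources during the microphone's acoustic-to-electric conversion, which often makes the speech hard to recognize at the receiving end. Reducing ambient noise while preserving a low-distortion speech signal has therefore become a major topic in modern speech signal processing.
This study proposes a multi-layer Kalman filter, based on two microphones with different pickup characteristics (an omnidirectional microphone and a directional microphone), that estimates a low-noise, low-distortion speech signal. First, the directional microphone is placed facing away from the main speech source to collect the ambient background noise, an extended state-space model of that noise is built, and the first-layer Kalman filter is applied to it to obtain a more accurate estimate of the background noise. The estimate is then used to whiten the total sound (main speech source plus ambient background noise) collected by the omnidirectional microphone, which faces the main speech source, and an extended state-space model of the main speech source, still containing slight noise, is built. The second-layer Kalman filter is applied to this model to estimate a cleaner main speech source. By repeating similar steps, each additional layer remodels the previously estimated main speech signal and estimates a still cleaner one, thereby reducing ambient noise while preserving low-distortion speech.
In the study, several signals with SNRs below 0 dB are used as input, and the quality of the estimated main speech signal is assessed with improved SNR, cross-correlation, PESQ, and spectrograms. With practical feasibility in mind, the number of filter layers and the filter order that give the best main speech quality are identified under the constraint that processing can run in real time. The results show that the proposed method does substantially enhance the quality of the main speech signal.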

Digital speech signals are widely used in many fields, such as telecommunications, video conferencing, and artificial intelligence systems. In real communication environments, however, the speech signal is inevitably disturbed by surrounding sound sources during the microphone's acoustic-to-electric conversion, which often causes recognition problems at the receiving end. Reducing environmental noise while maintaining a low-distortion speech signal is therefore a major issue in speech signal processing and has attracted considerable attention.
This study proposes a multilayer Kalman filter design that uses two microphones with different pickup characteristics (an omnidirectional microphone and a unidirectional microphone) to estimate the speech signal and remove background noise. First, the unidirectional microphone is placed at the rear to collect the ambient background noise, and the first-layer Kalman filter is applied to a state-space model of that noise to estimate it accurately. The resulting noise estimate is then used to whiten the ambient sound (main sound source plus background noise) collected by the omnidirectional microphone, which directly faces the main sound source, yielding a rough model of the main sound source. The second-layer Kalman filter is applied to this rough model to extract a cleaner main sound source. Repeating this filtering process, the proposed multilayer Kalman filter reduces ambient background noise and keeps the distortion of the main source low as layers are added.
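The two-layer chain described in this paragraph can be sketched in a few lines of Python. This is a minimal illustration under strong assumptions, not the thesis's implementation: each source is assumed to follow an autoregressive (AR) model with known coefficients, the rear microphone is assumed to observe the same noise that reaches the front microphone, and the whitening step is replaced by a plain subtraction of the noise estimate. The names `kalman_ar` and `two_layer_enhance` and all default variances are hypothetical.

```python
import numpy as np


def kalman_ar(y, a, q, r):
    """Kalman filter for an AR(p) signal observed in white noise.

    Assumed model (illustrative only):
        x[k] = a[0]*x[k-1] + ... + a[p-1]*x[k-p] + w[k],  var(w) = q
        y[k] = x[k] + v[k],                               var(v) = r
    """
    y = np.asarray(y, dtype=float)
    p = len(a)
    F = np.zeros((p, p))            # companion-form transition matrix
    F[0, :] = a
    F[1:, :-1] = np.eye(p - 1)
    H = np.zeros((1, p))            # only the newest sample is observed
    H[0, 0] = 1.0
    Q = np.zeros((p, p))
    Q[0, 0] = q
    x = np.zeros((p, 1))            # state estimate
    P = np.eye(p)                   # estimate covariance
    out = np.empty(len(y))
    for k, yk in enumerate(y):
        # Predict step
        x = F @ x
        P = F @ P @ F.T + Q
        # Update step
        S = (H @ P @ H.T).item() + r
        K = P @ H.T / S
        x = x + K * (yk - (H @ x).item())
        P = (np.eye(p) - K @ H) @ P
        out[k] = x[0, 0]
    return out


def two_layer_enhance(primary, reference, a_noise, a_speech,
                      q_noise=1.0, r_noise=0.01,
                      q_speech=1.0, r_speech=0.02):
    """Layer 1 estimates the noise; layer 2 filters the residual speech."""
    primary = np.asarray(primary, dtype=float)
    # Layer 1: estimate background noise from the rear (reference) microphone.
    noise_hat = kalman_ar(reference, a_noise, q_noise, r_noise)
    # Stand-in for the whitening step: subtract the noise estimate (assumption).
    rough_speech = primary - noise_hat
    # Layer 2: filter the rough speech with its own AR model.
    return kalman_ar(rough_speech, a_speech, q_speech, r_speech)
```

Further layers would repeat the last step on the previous output with a refreshed model; the abstract does not specify how each additional layer is re-modeled, so that part is left out here.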
To test the robustness of the proposed method, several test signals with initial SNRs below 0 dB are created, and the quality of the estimated results is verified using improved SNR, cross-correlation, PESQ, and spectrograms. The number of filter layers and the filter order are chosen to give the best speech quality subject to practical real-time processing capability. The results show that the proposed method can significantly enhance the quality of the speech signal.
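Two of the assessment tools named here, SNR improvement and cross-correlation, are simple to compute when a clean reference signal is available. The sketch below uses common textbook definitions; the abstract does not give the thesis's exact formulas, so the definitions and the function names `snr_db`, `snr_improvement_db`, and `normalized_xcorr` are assumptions. PESQ is defined by a separate standard and is not reproduced here.

```python
import numpy as np


def snr_db(clean, estimate):
    """SNR in dB, treating (estimate - clean) as the noise."""
    clean = np.asarray(clean, dtype=float)
    residual = np.asarray(estimate, dtype=float) - clean
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(residual ** 2))


def snr_improvement_db(clean, noisy, enhanced):
    """Output SNR minus input SNR, in dB (assumed meaning of 'improved SNR')."""
    return snr_db(clean, enhanced) - snr_db(clean, noisy)


def normalized_xcorr(clean, estimate):
    """Zero-lag normalized cross-correlation; 1.0 means identical shape."""
    c = np.asarray(clean, dtype=float)
    e = np.asarray(estimate, dtype=float)
    c = c - c.mean()
    e = e - e.mean()
    return float(np.sum(c * e) / np.sqrt(np.sum(c ** 2) * np.sum(e ** 2)))
```

An input SNR below 0 dB, as in the test signals described here, means the interfering sound carries more energy than the speech itself.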

CHINESE ABSTRACT I
ABSTRACT II
ACKNOWLEDGEMENTS IV
CONTENTS V
LIST OF FIGURES VII
LIST OF TABLES XVII
NOMENCLATURES XIX
CHAPTER 1 INTRODUCTION 1
1.1 Research Motivation 1
1.2 Literature Review 1
1.3 Research Method 3
CHAPTER 2 THEORETICAL DERIVATION AND SYSTEM PARAMETERS 5
2.1 Multi-Layer Kalman Filter Design 5
2.2 Specifications and Assessment Tools 11
CHAPTER 3 SIMULATION RESULTS 15
3.1 Simulation Tests of Multi-Layer Kalman Filter with Different System Orders 15
3.1.1 Scenario 1: The female speech is mixed with music: Initial SNR is -7.2562 dB 15
3.1.2 Scenario 2: The male speech is mixed with music: Initial SNR is -17.1578 dB 39
3.1.3 Scenario 3: The male speech is mixed with temple fair noises: Initial SNR is -10.5059 dB 63
3.1.4 Scenario 4: The male speech is mixed with construction site noises: Initial SNR is -8.4902 dB 81
3.1.5 Scenario 5: The female speech is mixed with subway noises: Initial SNR is -1.6616 dB 99
CHAPTER 4 PRACTICAL IMPLEMENTATION 118
4.1 Arrangement of Microphones 118
4.2 Feasibility of Real-Time Multi-Layer Kalman Filter 119
4.2.1 Scenario 1: The female speech is mixed with music: Initial SNR is -7.2562 dB 120
4.2.2 Scenario 2: The male speech is mixed with music: Initial SNR is -17.1578 dB 123
4.2.3 Scenario 3: The male speech is mixed with temple fair noises: Initial SNR is -10.5059 dB 125
4.2.4 Scenario 4: The male speech is mixed with construction site noises: Initial SNR is -8.4902 dB 128
4.2.5 Scenario 5: The female speech is mixed with subway noises: Initial SNR is -1.6616 dB 130
4.2.6 Summary of real-time implementation 133
4.3 Comparisons of Multi-Layer Kalman Filter and Multi-Layer LMS Filter 144
4.3.1 Scenario 1: The female speech is mixed with music: Initial SNR is -7.2562 dB 145
4.3.2 Scenario 2: The male speech is mixed with music: Initial SNR is -17.1578 dB 146
CHAPTER 5 CONCLUSIONS 149
REFERENCES 151

