National Digital Library of Theses and Dissertations in Taiwan

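Graduate student: Ye, Shin-Wun (葉新文)
Thesis title: A Robust Voice Activity Detection Method Using Microphone Array and Target-to-Jammer Ratio (利用麥克風陣列與目標干擾比之強健型語音活動偵測方法)
Advisor: Hu, Jwu-Sheng (胡竹生)
Degree: Master's
Institution: National Chiao Tung University
Department: Master Program of Sound and Music Innovative Technologies, College of Engineering
Discipline: Arts
Field: Music
Thesis type: Academic thesis
Publication year: 2011
Graduation academic year: 100 (ROC calendar)
Language: Chinese
Pages: 68
Keywords: Microphone Array; Voice Activity Detection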
This thesis proposes two robust voice activity detection (VAD) methods based on a microphone array. The first combines the target-to-jammer ratio (TJR) feature with minima controlled recursive averaging (MCRA); the second combines the TJR feature with a Gaussian mixture model (GMM). Both methods are compared with detectors based on signal energy and on long-term signal variability (LTSV). In most conditions, detection using TJR achieves higher accuracy than the other features, and its advantage becomes more pronounced as the signal-to-noise ratio (SNR) decreases.
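As a rough illustration of the GMM-based decision mentioned in the abstract, the sketch below fits a two-component, one-dimensional Gaussian mixture to a per-frame scalar feature and treats the higher-mean component as speech. It is not the thesis's implementation: the feature values are synthetic stand-ins rather than TJR values computed from a microphone array, and the function names, the EM initialization, and the 0.5 posterior threshold are all assumptions made for illustration.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_gmm(values, iterations=50, var_floor=1e-6):
    """Fit a two-component 1-D GMM with EM; returns (weights, means, variances)."""
    lo, hi = min(values), max(values)
    spread = max((hi - lo) ** 2 / 16.0, var_floor)
    weights = [0.5, 0.5]
    # Assumed initialization: place the two means inside the observed range.
    means = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    variances = [spread, spread]
    n = len(values)
    for _ in range(iterations):
        # E-step: responsibility of component 1 for every frame.
        resp = []
        for x in values:
            p0 = weights[0] * gaussian_pdf(x, means[0], variances[0])
            p1 = weights[1] * gaussian_pdf(x, means[1], variances[1])
            total = p0 + p1
            resp.append(p1 / total if total > 0.0 else 0.5)
        # M-step: re-estimate weights, means, and variances.
        n1 = sum(resp)
        n0 = n - n1
        if n0 < 1e-9 or n1 < 1e-9:
            break  # one component collapsed; keep the previous estimate
        m0 = sum((1.0 - r) * x for r, x in zip(resp, values)) / n0
        m1 = sum(r * x for r, x in zip(resp, values)) / n1
        v0 = sum((1.0 - r) * (x - m0) ** 2 for r, x in zip(resp, values)) / n0
        v1 = sum(r * (x - m1) ** 2 for r, x in zip(resp, values)) / n1
        means = [m0, m1]
        variances = [max(v0, var_floor), max(v1, var_floor)]
        weights = [n0 / n, n1 / n]
    return weights, means, variances


def speech_posteriors(values, weights, means, variances):
    """Posterior probability that each frame belongs to the higher-mean component."""
    speech = 0 if means[0] > means[1] else 1
    noise = 1 - speech
    posteriors = []
    for x in values:
        ps = weights[speech] * gaussian_pdf(x, means[speech], variances[speech])
        pn = weights[noise] * gaussian_pdf(x, means[noise], variances[noise])
        total = ps + pn
        posteriors.append(ps / total if total > 0.0 else 0.5)
    return posteriors


if __name__ == "__main__":
    random.seed(0)
    # Synthetic feature: low values for noise-only frames, high values for speech.
    frames = [random.gauss(0.0, 1.0) for _ in range(200)]
    frames += [random.gauss(8.0, 1.5) for _ in range(200)]
    w, m, v = fit_two_component_gmm(frames)
    decisions = [p > 0.5 for p in speech_posteriors(frames, w, m, v)]
    print("estimated means:", m)
    print("frames marked as speech:", sum(decisions))
```

Fitting the mixture to the observed feature values avoids hand-picking a fixed threshold, since the decision boundary follows the two estimated components. A practical detector would typically also smooth the frame decisions, for example with a hang-over scheme.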
Abstract (Chinese)
Abstract (English)
Acknowledgments
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation
1.2 Research Objectives
1.3 Literature Review
1.4 Thesis Organization
Chapter 2 Adaptive Array Signal Processing
2.1 Array Signal Processing
2.2 Adaptive Signal Processing
2.3 Adaptive Spatial Filter: Transfer Functions Generalized Sidelobe Canceler (GSC)
Chapter 3 Voice Activity Detection
3.1 Voice Activity Detection
3.2 Target-to-Jammer Ratio
3.3 Long-Term Signal Variability
3.4 Minima Controlled Recursive Averaging (MCRA)
3.5 Two-Component Gaussian Mixture Model
3.6 Hang-Over Scheme
Chapter 4 Experimental Results and Analysis
4.1 TJR and LTSV Experimental Results and Analysis
Chapter 5 Conclusion
5.1 Research Results
5.2 Future Work
References
