



Graduate student (English name): Jheng-Jie Zeng
Thesis title (English): Sound Source Localization Based on Multiple Signal Classification
Advisor (English name): Hung-Yan Gu
Keywords (English): Multiple Signal Classification; Microphone Array; Voice Activity Detection; Time-Stretched Pulse; Maximum Likelihood Beamformer
This thesis studies the MUSIC (Multiple Signal Classification) algorithm and uses it to process the signals acquired from a four-channel microphone array in order to build a system that detects the direction of a sound source. On the hardware side, we designed a preamplifier circuit for the microphones and used a USB data acquisition (DAQ) device to transfer the signals to a computer for processing. On the software side, a VAD (Voice Activity Detection) module first classifies the input signal as speech or non-speech. The MUSIC module then estimates the direction of the sound source, and this estimate is used to update the coefficients of the Maximum Likelihood Adaptive Filter. Because the MUSIC algorithm requires impulse frequency response coefficients, these must first be obtained by analyzing TSP (Time-Stretched Pulse) signals, which is a difficult step. After integrating the hardware and software modules, we ran experiments to evaluate the detection of the source's direction angle. Preliminary results show that accurate angle detection is achieved only in a semi-anechoic chamber; in other environments the detection accuracy is unsatisfactory and leaves room for improvement.
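The MUSIC step named in the abstract can be illustrated with a minimal narrowband sketch. It is not the thesis's implementation. It assumes a single far-field source, a uniform linear array of four microphones with ideal free-field steering vectors (the thesis instead derives its responses from TSP measurements), and simulated snapshots at one frequency. All function names, the 0.05 m spacing, and the 2 kHz frequency are hypothetical choices for illustration. With one source, the signal subspace is the dominant eigenvector of the spatial covariance matrix, found here by power iteration, and the noise-subspace projector is I - u u^H.

```python
import cmath
import math
import random

SPEED_OF_SOUND = 343.0  # m/s (assumed room-temperature value)


def steering_vector(angle_deg, n_mics, spacing, freq):
    """Ideal far-field steering vector of a uniform linear array (assumption)."""
    delay = spacing * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND
    return [cmath.exp(-2j * math.pi * freq * m * delay) for m in range(n_mics)]


def simulate(angle_deg, n_snapshots, n_mics, spacing, freq, noise_std, seed=0):
    """Simulated narrowband snapshots: one random source plus sensor noise."""
    rng = random.Random(seed)
    a = steering_vector(angle_deg, n_mics, spacing, freq)
    snapshots = []
    for _ in range(n_snapshots):
        s = complex(rng.gauss(0, 1), rng.gauss(0, 1))
        snapshots.append([
            s * ai + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
            for ai in a
        ])
    return snapshots


def covariance(snapshots):
    """Sample spatial covariance R = mean(x x^H)."""
    n = len(snapshots[0])
    R = [[0j] * n for _ in range(n)]
    for x in snapshots:
        for i in range(n):
            for j in range(n):
                R[i][j] += x[i] * x[j].conjugate()
    return [[v / len(snapshots) for v in row] for row in R]


def principal_eigenvector(R, iters=50):
    """Power iteration for the dominant eigenvector of a Hermitian matrix."""
    n = len(R)
    u = [complex(1.0, 0.1 * i) for i in range(n)]
    for _ in range(iters):
        v = [sum(R[i][j] * u[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(abs(c) ** 2 for c in v))
        u = [c / norm for c in v]
    return u


def music_spectrum(R, angles, n_mics, spacing, freq):
    """MUSIC pseudospectrum 1 / (a^H (I - u u^H) a) for a single source."""
    u = principal_eigenvector(R)
    spectrum = []
    for ang in angles:
        a = steering_vector(ang, n_mics, spacing, freq)
        proj = abs(sum(ui.conjugate() * ai for ui, ai in zip(u, a))) ** 2
        denom = sum(abs(ai) ** 2 for ai in a) - proj
        spectrum.append(1.0 / max(denom, 1e-12))  # guard against rounding
    return spectrum


angles = list(range(-90, 91))  # scan grid in degrees
snapshots = simulate(20.0, 200, 4, 0.05, 2000.0, 0.05)
R = covariance(snapshots)
spectrum = music_spectrum(R, angles, 4, 0.05, 2000.0)
estimate = angles[spectrum.index(max(spectrum))]
print("estimated angle (degrees):", estimate)
```

The peak of the pseudospectrum marks the scan angle whose steering vector is most nearly orthogonal to the noise subspace. For speech, a practical system would repeat this per frequency bin and combine the results, and would handle more than one source with a full eigendecomposition rather than power iteration.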
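Abstract
Acknowledgments
Table of Contents
List of Figures and Tables
Chapter 1 Introduction
1.1 Research Motivation and Objectives
1.2 Literature Review
1.3 Research Method
1.4 Thesis Organization
Chapter 2 Sound Source Angle Detection with Multiple Signal Classification
2.1 Architecture for Sound Source Angle Detection and Speech Enhancement
2.2 Sound Source Localization with Multiple Signal Classification
2.2.1 Data Model
2.2.2 The Multiple Signal Classification Method
2.3 Analysis of Microphone Impulse Responses
2.3.1 Introduction to Time-Stretched Pulse Signals
2.3.2 Generation of Time-Stretched Pulse Signals
2.3.3 Analysis Procedure for Microphone Frequency Responses
Chapter 3 Voice Activity Detection and Speech Enhancement System
3.1 Voice Activity Detection Methods
3.1.1 Introduction to Voice Activity Detection
3.1.2 Voice Activity Detection Methods
3.2 Adaptive Signal Processing
3.2.1 Introduction to Adaptive Filters
3.2.2 Adaptive Filter Based on Maximum Likelihood Beamforming
Chapter 4 System Implementation
4.1 Hardware Components
4.1.1 Microphones
4.1.2 Signal Amplifier Circuit
4.1.3 DAQ
4.2 Software Implementation and Parameter Settings
4.2.1 Measurement and Analysis of the Microphone Array Frequency Response
4.2.2 Implementation of Voice Activity Detection
4.2.3 Improvement and Implementation of MUSIC Sound Source Angle Detection
4.2.4 Implementation of the Maximum Likelihood Beamforming Adaptive Filter
4.2.5 Software Interface of the System
Chapter 5 Test Experiments
5.1 Offline System Tests
5.2 Online System Tests
Chapter 6 Conclusions
References
About the Author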