Graduate student: 劉維瀚 (Wei-Han Liu)
Thesis title (Chinese): 雙耳特徵差異分佈模版於非靜態聲源之定位研究
Thesis title (English): Binaural room distribution pattern for nonstationary sound source localization
Advisor: 胡竹生 (Jwu-Sheng Hu)
Degree: Doctoral
Institution: National Chiao Tung University
Department: Department of Electrical and Control Engineering
Discipline: Engineering
Field: Electrical and Information Engineering
Document type: Academic dissertation
Year of publication: 2007
Graduation academic year: 96 (ROC calendar)
Language: Chinese
Pages: 84
Keywords (Chinese): 聲源定位 (sound source localization); 雙耳聽覺 (binaural hearing); 非靜態聲源 (nonstationary sound source)
Keywords (English): sound source localization; DOA; nonstationary sound source; HRTF; IPD; ILD
Abstract (translated from the Chinese): In real-world sound source localization applications, the statistical characteristics of natural sound sources are usually nonstationary, and the environment introduces complex reverberation. Localizing nonstationary sound sources in reverberant environments is therefore an important engineering research topic. This dissertation investigates the relationship between nonstationary sound sources and binaural feature differences (IPD and ILD). Adopting the concept of a moving pole model, it models the level fluctuation of nonstationary sound sources with an exponential of polynomials. Based on this model, the dissertation proposes a sufficient condition for using the distribution patterns of IPDs and ILDs for sound source localization, and explains why multiple peaks appear in the distribution patterns. In addition, a Gaussian-mixture binaural room distribution model (GMBRDM) is proposed as an algorithm for nonstationary sound source localization. The proposed theory and algorithms are discussed and verified with simulations or experiments.
Furthermore, the studied nonstationary sound source localization method is applied to indoor robot localization, yielding a novel system for detecting a robot's location and orientation. The system is suitable for environments with highly complex reverberation and is robust to noise. Experimental results show that it works in both near-field and far-field conditions, and even when there is no direct propagation path between the robot and the microphones. Since the system performs global localization of the robot, it is well suited for integration with other localization methods, providing initial conditions or compensation.
Natural sound sources are usually nonstationary, and real environments contain complex reverberation. Nonstationary sound source localization in a reverberant environment is therefore an important research topic. This dissertation examines the relationship between the nonstationarity of sound sources and the distribution patterns of interaural phase differences (IPDs) and interaural level differences (ILDs), based on short-term frequency analysis. The level fluctuation of nonstationary sound sources is modeled by an exponential of polynomials, following the concept of the moving pole model. From this model, a sufficient condition for using the distribution patterns of IPDs and ILDs to localize a nonstationary sound source is derived, and the phenomenon of multiple peaks in the distribution pattern is explained. Simulations verify the proposed analysis. Furthermore, a Gaussian-mixture binaural room distribution model (GMBRDM) is proposed to model the distribution patterns of IPDs and ILDs for nonstationary sound source localization. The effectiveness and performance of the proposed GMBRDM are demonstrated by experimental results.
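The short-term frequency analysis of IPDs and ILDs described above can be sketched as follows. This is a minimal illustration of the general technique, not the dissertation's actual implementation; the frame length, hop size, and function name are assumptions.

```python
import numpy as np

def binaural_features(left, right, frame_len=512, hop=256, eps=1e-12):
    """Per-frame, per-frequency IPDs and ILDs from a binaural signal pair.

    Illustrative sketch only: framing parameters are assumed, not taken
    from the dissertation.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(left) - frame_len) // hop
    ipds, ilds = [], []
    for i in range(n_frames):
        seg = slice(i * hop, i * hop + frame_len)
        L = np.fft.rfft(left[seg] * window)
        R = np.fft.rfft(right[seg] * window)
        # IPD: phase of the cross-spectrum between the two ears
        ipds.append(np.angle(L * np.conj(R)))
        # ILD: log level ratio in dB (eps avoids log of zero)
        ilds.append(20.0 * np.log10((np.abs(L) + eps) / (np.abs(R) + eps)))
    return np.array(ipds), np.array(ilds)
```

Collecting these per-frame values over many frames yields the kind of IPD/ILD distribution pattern the abstract refers to, in which peak structure carries location information.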
The proposed nonstationary sound source localization algorithm is then applied to robot localization. A novel and robust method for detecting a robot's location and orientation based on sound field features is proposed. Unlike conventional methods, it neither explicitly uses information about the direct sound propagation path from the source to the microphones nor attempts to suppress reverberation and noise. Instead, it exploits the sound field features observed when the robot is at different locations and orientations in an indoor environment. Experimental results show that the method, using only two microphones, can detect the robot's location and orientation under both line-of-sight and non-line-of-sight conditions, and applies to both near-field and far-field cases. Since it provides global location and orientation detection, it is well suited for fusion with other localization methods, supplying initial conditions that reduce the search effort or compensating at locations that other methods cannot detect.
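The matching idea above (train a statistical model of sound field features per candidate pose, then pick the best-scoring pose at test time) can be sketched with a deliberately simplified stand-in: a single diagonal Gaussian per pose rather than the full Gaussian mixture, with an assumed data layout. None of the names or parameters below come from the dissertation.

```python
import numpy as np

def fit_pose_models(training_features):
    """Fit one diagonal Gaussian per candidate location/orientation.

    training_features: dict mapping pose label -> (n_samples, n_dims)
    array of sound field feature vectors (layout assumed for illustration).
    """
    models = {}
    for label, X in training_features.items():
        mu = X.mean(axis=0)
        var = X.var(axis=0) + 1e-6  # variance floor for numerical stability
        models[label] = (mu, var)
    return models

def classify_pose(models, x):
    """Return the pose whose Gaussian gives the highest log-likelihood."""
    def loglik(mu, var):
        return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
    return max(models, key=lambda label: loglik(*models[label]))
```

A full mixture model (e.g. trained with EM) would replace the single Gaussian per pose, but the train-then-score structure is the same.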
Chapter 1 Introduction ... 1
1.1 Sound Source Localization Using Binaural Information ... 1
1.1.1 Azimuth Localization Using Binaural Localization Cues ... 2
1.1.2 Elevation Localization Using Binaural Localization Cues ... 2
1.1.3 Distance Localization Using Binaural Localization Cues ... 3
1.2 An Overview of Microphone-Array-Based Direction-of-Arrival Estimation ... 3
1.2.1 Steered-Beamformer-Based Algorithms ... 4
1.2.2 Eigen-Structure-Based DOA Estimation Algorithms ... 5
1.2.3 Time-Delay-of-Arrival-Based Algorithms ... 6
1.3 Known Problems in Sound Source Localization ... 7
1.4 Contribution of this Dissertation ... 9
1.5 Organization of this Dissertation ... 10
Chapter 2 Nonstationary Sound Source Localization Using Binaural Room Distribution Pattern ... 11
2.1 Introduction ... 11
2.2 The Relation between the Nonstationary Sound Source and the BRDP ... 16
2.2.1 IPDs and ILDs of a Stationary Sound Source ... 16
2.2.2 IPDs and ILDs of a Nonstationary Sound Source ... 18
2.2.3 Modeling the Nonstationary Sound Source Using the Moving Pole Model ... 19
2.3 Simulation Verification and Discussion of the Proposed Model ... 22
2.3.1 Content Dependency of BRDPs Obtained from a Nonstationary Sound Source ... 22
2.3.2 The Formation of Peaks in the Distribution Patterns of IPDs ... 27
2.3.3 The Formation of Peaks in the Distribution Patterns of ILDs ... 30
2.3.4 Localization of a Nonstationary Sound Source Using BRDPs ... 31
2.4 GMBRDM for Nonstationary Sound Source Localization ... 32
2.4.1 The Training Procedure of the Proposed GMBRDM ... 32
2.4.2 The Testing Procedure of the Proposed GMBRDM ... 37
2.5 Summary ... 38
Appendix ... 39
Chapter 3 Indoor Sound Field Feature Matching for Robot's Location and Orientation Detection ... 41
3.1 Introduction ... 41
3.1.1 Traditional Sound-Based Robot Localization Methods and Known Problems ... 42
3.1.2 The Proposed Method ... 43
3.2 System Architecture ... 44
3.3 Directional Sound Pattern Design for Robot Orientation Detection ... 47
3.4 Robot Localization Model (RLM) and Robot Orientation Model (ROM) ... 50
3.4.1 A Description of the Proposed RLM and ROM ... 50
3.4.2 Location and Orientation Detection ... 55
3.5 Summary ... 56
Chapter 4 Experimental Results ... 58
4.1 Experimental Results of the Proposed GMBRDM ... 58
4.1.1 The Experimental Environment ... 58
4.1.2 The Experimental Results ... 61
4.2 Experimental Results of the Proposed Robot Localization and Orientation Detection Method ... 63
4.2.1 The Experimental Environment ... 63
4.2.2 The Experimental Results ... 67
Chapter 5 Conclusions and Potential Research Topics ... 75
5.1 Conclusions ... 75
5.2 Potential Research Topics ... 76
5.2.1 The Prediction, Interpolation, or Extrapolation of BRDPs ... 76
5.2.2 The Influence of Environmental Change on the BRDPs ... 77
5.2.3 Robot Location and Orientation Detection Using a Hidden Markov Model ... 77
References ... 79
[1]M. Brandstein and H. Silverman, “A Practical Methodology for Speech Source Localization with Microphone Arrays,” Computer Speech and Language, vol. 11, no. 2, pp. 91-126, Apr. 1997.
[2]F. L. Wightman and D. J. Kistler, “Headphone simulation of free-field listening. I: Stimulus synthesis,” Journal of Acoustical Society of America, vol. 85, pp. 858-867, Feb. 1989.
[3]F. L. Wightman and D. J. Kistler, “Headphone simulation of free-field listening. II: Psychophysical validation,” Journal of Acoustical Society of America, vol. 85, pp. 868-878, Feb. 1989.
[4]S. Carlile, Virtual auditory space: Generation and application, New York: Chapman and Hall, 1996.
[5]W. G. Gardner and K. D. Martin, “HRTF measurements of a KEMAR,” Journal of Acoustical Society of America, vol. 97, no. 6, pp. 3907-3909, June 1995.
[6]H. S. Colburn and A. Kulkarni, “Models of sound localization,” in Sound Source Localization, R. Fay and T. Popper, Eds., Springer Handbook of Auditory Research, Springer-Verlag, 2005.
[7]J. C. Middlebrooks and D. M. Green, “Sound localization by human listeners,” Annu. Rev. Psychol., vol. 42, pp. 135-159, Jan. 1991.
[8]D. S. Brungart, N. I. Durlach, and W. M. Rabinowitz, “Auditory localization of nearby sources. II. Localization of a broadband source,” Journal of Acoustical Society of America, vol. 106, no. 4, pp. 1956-1968, Oct. 1999.
[9]C. Trahiotis, L. R. Bernstein, R. M. Stern, and T. N. Buell, “Interaural correlation as the basis of a working model of binaural processing: an introduction,” in Sound Source Localization, R. Fay and T. Popper, Eds., Springer Handbook of Auditory Research, Springer-Verlag, 2005.
[10]C. H. Knapp, and G. C. Carter, “The generalized correlation method for estimation of time delay,” IEEE Transactions on Acoustic, Speech, Signal Processing, vol. 24, pp. 320-327, Aug. 1976.
[11]G. C. Carter, A. H. Nuttall, and P. G. Cable, “The smoothed coherence transform,” IEEE Signal Processing Letters, vol. 61, pp. 1497-1498, Oct. 1973.
[12]R. S. Woodworth, Experimental Psychology, New York: Holt, 1938.
[13]P. M. Hofman and A. J. von Opstal, “Spectro-temporal factors in two-dimensional human sound localization,” Journal of Acoustical Society of America, vol. 103, no. 5, pp. 2634-2648, May 1998.
[14]J. P. Blauert, Spatial Hearing, MIT Press, Cambridge, MA, 1983.
[15]V. R. Algazi, C. Avendano, and R. O. Duda, “Elevation localization and head-related transfer function analysis at low frequencies,” Journal of Acoustical Society of America, vol. 109, no. 3, pp. 1110-1122, Mar. 2001.
[16]P. Zakarouskas and M. S. Cynader, “A computational theory of spectral cue localization,” Journal of Acoustical Society of America, vol. 94, no. 3, pp. 1323-1331, Sept. 1993.
[17]J. C. Middlebrooks, “Narrow-band sound localization related to external ear acoustics,” Journal of Acoustical Society of America, vol. 92, no. 5, pp. 2607-2624, Nov. 1992.
[18]B. G. Shinn-Cunningham, “Distance cues for virtual auditory space,” Proceedings of IEEE Conference on Multimedia, pp. 227-230, Dec. 2000.
[19]J. G. Ryan and R. A. Goubran, “Near-field beamforming for microphone arrays,” IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 21-24, Apr. 1997.
[20]M. Wax and T. Kailath, “Optimal localization of multiple sources by passive arrays,” IEEE Transactions on Acoustic, Speech, Signal Processing, vol. ASSP-31, pp. 1210-1217, Oct. 1983.
[21]H. F. Silverman and S. E. Kirtman, “A two-stage algorithm for determining talker location from linear microphone-array data,” Computer, Speech, and Language, vol. 6, pp. 129-152, Apr. 1992.
[22]D. B. Ward, and R. C. Williamson, “Particle filter beamforming for acoustic source localization in a reverberant environment,” IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, pp. 1777-1780, May 2002.
[23]R. V. Balan and J. Rosca, “Apparatus and method for estimating the direction of arrival of a source signal using a microphone array,” Patent Application No. US2004013275, 2004.
[24]M. Wax, T. J. Shan, and T. Kailath, “Spatio-temporal spectral analysis by eigenstructure methods,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-32, pp. 817-827, Aug. 1984.
[25]R. O. Schmidt, “Multiple emitter location and signal parameter estimation,” IEEE Transactions on Antennas and Propagation, vol. 34, pp. 276-280. Mar. 1986.
[26]H. Wang and M. Kaveh, “Coherent signal subspace processing for detection and estimation of angle of arrival of multiple wideband sources,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-33, pp. 823-831, Aug. 1985.
[27]G. Bienvenu, “Eigensystem properties of the sampled space correlation matrix,” IEEE International Conference on Acoustic, Speech, and Signal Processing, pp. 332–335, Apr. 1983.
[28]K. M. Buckley and L. J. Griffiths, “Eigenstructure based broadband source location estimation,” IEEE International Conference on Acoustic, Speech, and Signal Processing, pp. 1869–1872, Apr. 1986.
[29]M. A. Doron, A. J. Weiss, and H. Messer, “Maximum likelihood direction finding of wideband sources,” IEEE Transactions on Signal Processing, vol. 41, pp. 411–414, Jan. 1993.
[30]M. Agarwal and S. Prasad, “DOA estimation of wideband sources using a harmonic source model and uniform linear array,” IEEE Transactions on Signal Processing, vol. 47, pp. 619-629, Mar. 1999.
[31]H. Messer, “The potential performance gain in using spectral information in passive detection/localization of wideband sources,” IEEE Transactions on Signal Processing, vol. 43, pp. 2964-2974, Dec. 1995.
[32]M. Agrawal and S. Prasad, “Broadband DOA estimation using spatial-only modeling of array data,” IEEE Transactions on Signal Processing, vol. 48, pp. 663-670, Mar. 2000.
[33]J. H. Lee, Y. M. Chen, and C. C. Yeh, “A covariance approximation method for near-field direction finding using a uniform linear array,” IEEE Transactions on Signal Processing, vol. 43, pp. 1293-1298, May 1995.
[34]K. Buckley and L. Griffiths, “Broad-band signal-subspace spatial-spectrum (BASS-ALE) estimation,” IEEE Transactions on Acoustic, Speech, Signal Processing, vol. 36, pp. 953-964, July 1988.
[35]N. Strobel and R. Rabenstein, “Classification of time delay estimates for robust speaker localization,” IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 6, pp. 15-19, Mar. 1999.
[36]J. S. Hu, T. M. Su, C. C. Cheng, W. H. Liu, and T. I. Wu, “A self-calibrated speaker tracking system using both audio and video data,” IEEE Conference on Control Applications, vol.2, pp. 731-735, Sept. 2002.
[37]J. S. Hu, C. C. Cheng, W. H. Liu, and T. M. Su, “A speaker tracking system with distance estimation using microphone array,” IEEE/ASME International Conference on Advanced Manufacturing Technologies and Education, Aug. 2002.
[38]S. Mavandadi, P. Aarabi, “Multichannel nonlinear phase analysis for time-frequency data fusion,” Proceedings of the SPIE, Architectures, Algorithms, and Applications VII (AeroSense 2003), vol. 5099, pp. 222-231, Apr. 2003.
[39]P. Aarabi and S. Mavandadi, “Robust sound localization using conditional time–frequency histograms,” Information Fusion, vol. 4, pp. 111-122, June 2003.
[40]D. D. Rife and J. Vanderkooy, “Transfer-function measurement with maximum-length sequences,” Journal of the Audio Engineering Society, vol. 37, no. 6, pp. 419-444, June 1989.
[41]W. M. Hartmann, “Localization of sound in rooms,” Journal of Acoustical Society of America, vol. 74, no. 5, pp. 1380-1391, Nov. 1983.
[42]H. Kuttruff, Room Acoustics, London: Elsevier, 1991, chapter 3, p. 56.
[43]T. Gustafsson, B. D. Rao, and M. Trivedi, “Source localization in reverberant environments: modeling and statistical analysis,” IEEE Transactions on Speech and Audio Processing, vol. 11, no. 6, pp. 791-803, Aug. 2003.
[44]B. G. Shinn-Cunningham, N. Kopčo, and T. J. Martin, “Localizing nearby sound sources in a classroom: binaural room impulse response,” Journal of Acoustical Society of America, vol. 117, no. 5, pp. 3100-3115, May 2005.
[45]B. G. Shinn-Cunningham, “Localizing sound in rooms,” in Proceedings of the ACM SIGGRAPH and EUROGRAPHICS Campfire: Rendering for Virtual Environments, pp. 17-22, May 2001.
[46]J. Huang, N. Ohnishi, and N. Sugie, “Sound localization in reverberant environment based on the model of the precedence effect,” IEEE Transactions on Instrumentation and Measurement, vol. 46, no. 4, pp. 842-846, Aug. 1997.
[47]J. B. Allen and D. A. Berkley, “Image method for efficiently simulating small-room acoustics,” Journal of Acoustic Society America vol. 65, Issue 4, pp. 943-950, Apr. 1979.
[48]J. Nix and V. Hohmann, “Sound source localization in real sound fields based on empirical statistics of interaural parameters,” Journal of Acoustical Society of America, vol. 119, no. 1, pp. 463-479, Jan. 2006.
[49]P. Smaragdis and P. Boufounos, “Position and trajectory learning for microphone arrays,” IEEE Transactions on Audio, Speech and Language Processing, vol. 15, no. 1, pp. 358-368, Jan. 2007.
[50]Y. H. Tsao, “Tests for nonstationarity,” Journal of Acoustic Society America, vol. 75, Issue 2, pp. 486-498, Feb. 1984.
[51]D. H. Friedman, “Estimation of formant parameters by sum-of-poles modeling,” in Proceedings of ICASSP, pp. 351-354, Apr. 1981.
[52]F. Casacuberta and E. Vidal, “A nonstationary model for the analysis of transient speech signals,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-35, no. 2, pp. 226-228, Feb. 1987.
[53]G. Xuan, W. Zhang, and P. Chai, “EM algorithms of Gaussian mixture model and hidden Markov model,” IEEE International Conference on Image Processing, pp. 145-148, Oct. 2001.
[54]J. B. MacQueen, “Some methods for classification and analysis of multivariate observations”, Proceedings of 5-th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281-297, 1967.
[55]C. Elkan, “Using the triangle inequality to accelerate k-means,” Proceedings of the Twentieth International Conference on Machine Learning, pp. 147-153, 2003.
[56]J. S. Hu, W. H. Liu, and C. C. Cheng, “Indoor sound field feature matching for robot’s location and orientation detection,” submitted to Pattern Recognition Letters.
[57]J. Borenstein, H. R. Everett, and L. Feng, Navigating Mobile Robots: Sensors and Techniques, Wellesley, MA: A.K. Peters, 1996.
[58]A. Georgiev, and P. K. Allen, “Localization methods for a mobile robot in urban environments,” IEEE Transactions on Robotics, vol. 20, pp. 851-864, Oct. 2004.
[59]C. D. McGillem and T. S. Rappaport, “Infra-red location system for navigation of autonomous vehicles,” IEEE International Conference on Robotics and Automation, pp. 1236-1238, Apr. 1988.
[60]I. Ohya, A. Kosaka, and A. Kak, “Vision-based navigation by a mobile robot with obstacle avoidance using single-camera vision and ultrasonic sensing,” IEEE Transactions on Robotics and Automation, vol. 14, no. 6, pp. 969-978, Dec. 1998.
[61]J. M. Lee, K. Son, M. C. Lee, J. W. Choi, S. H. Han, and M. H. Lee, “Localization of a mobile robot using the image of a moving robot,” IEEE Transactions on Industrial Electronics, vol. 50, no. 3, pp. 612-619, June 2003.
[62]R. Gutierrez-Osuna, J. A. Janet, and R. C. Luo, “Modeling of ultrasonic range sensors for localization of autonomous mobile robots,” IEEE Transactions on Industrial Electronics, vol. 45, no. 4, pp. 654-662, Aug. 1998.
[63]U. Larsson, J. Frosberg, and A. Wernersson, “Mobile robot localization: integrating measurements from a time-of-flight laser,” IEEE Transactions on Industrial Electronics, vol. 43, no. 3, pp. 422-431, June 1996.
[64]A. M. Ladd, K. E. Bekris, A. P. Rudys, D. S. Wallach, and L. E. Kavraki, “On the feasibility of using wireless ethernet for indoor localization,” IEEE Transactions on Robotics and Automation, vol. 20, no. 3, pp. 555-559, June 2004.
[65]Q. H. Wang, T. Ivanov, and P. Aarabi, “Acoustic robot navigation using distributed microphone arrays,” Information Fusion, vol. 5, pp. 131-140, June 2004.
[66]Y. Tamai, S. Kagami, H. Mizoguchi, Y. Amemiya, K. Nagashima and T. Takano, “Real-time 2 dimensional sound source localization by 128-channel huge microphone array,” IEEE International workshop on Robot and Human Interactive Communication, pp. 65-70, Sept. 2004.
[67]M. S. Brandstein and H. F. Silverman, “A robust method for speech signal time-delay estimation in reverberant rooms,” IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 375-378, Apr. 1997.
[68]C. L. Nikias and M. Shao, Signal Processing with Alpha-Stable Distributions and Applications, New York: Wiley, 1995.
[69]Y. Tamai, S. Kagami, H. Mizoguchi, Y. Amemiya, K. Nagashima, and T. Takano, “Sound spot generation by 128-channel surround speaker array,” IEEE International workshop on Sensor array and multichannel signal processing, pp. 542-546, July 2004.
[70]M. Yamada, N. Itsuki, and Y. Kinouchi, “Adaptive directivity control of speaker array,” Control, Automation, Robotics and Vision Conference, pp. 1143-1148, Dec. 2004.
[71]S. P. Parker, Acoustic Source Book, McGraw-Hill, 1988.
[72]M. Omura, M. Yada, H. Saruwatari, S. Kajata, K. Takeda, and F. Itakura, “Compensation of room acoustic transfer function affected by change of room temperature,” IEEE International Conference on Acoustic, Speech, and Signal Processing, pp. 941-944, Mar. 1999.