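Graduate student: Li, Cheng-Ting (李政庭)
Thesis title: Performance Improvement of Speaker Identification under Clipping Distortion (削波失真下之語者識別的效能改善)
Advisor: Tsai, Wei-Ho (蔡偉和)
Oral defense committee: Wang, Jia-Ching (王家慶); Chiang, Chen-Yu (江振宇); Huang, Shih-Chia (黃士嘉)
Oral defense date: 2016-07-18
Degree: Master's
Institution: National Taipei University of Technology (國立臺北科技大學)
Department: Graduate Institute, Department of Electronic Engineering (電子工程系研究所)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Graduation academic year: 104 (ROC calendar, 2015-2016)
Language: Chinese
Keywords: speaker identification, clipping distortion, recurrent neural network, Gaussian mixture model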
Clipping distortion, also called quantization saturation, arises when the input exceeds the maximum or minimum quantizer output while a sound is being recorded and digitized, for example when the speaker is too close to the microphone or speaks too loudly. It is not only unpleasant to hear but also likely to cause errors in automatic recognition systems. This study investigates the effect of clipping distortion on a speaker identification system based on Gaussian mixture models and proposes a method for improving identification under such distortion. The basic idea is to learn the spectral relationship between known distorted signals and their undistorted counterparts, parameterizing that relationship with a recurrent neural network, so that an unseen distorted signal can be transformed by the network into an estimate of the undistorted signal. Experiments show that when 50% of the signal samples are clipped, this transformation improves speaker identification accuracy from 21% to 86%.
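To make the "50% of samples clipped" condition concrete, the following is a minimal sketch of how such a test signal might be simulated. It is not the thesis's procedure, which this record does not describe: the function names `clip_to_fraction` and `clipped_ratio`, the symmetric hard-clipping model, and the 440 Hz test tone are all assumptions made for illustration.

```python
import math


def clip_to_fraction(samples, fraction):
    """Hard-clip `samples` so that roughly `fraction` of them saturate.

    Assumption: clipping is symmetric, and the clipping level is chosen as
    the (1 - fraction) quantile of the absolute sample values.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    magnitudes = sorted(abs(s) for s in samples)
    index = int((1.0 - fraction) * (len(magnitudes) - 1))
    threshold = magnitudes[index]
    # Every sample whose magnitude exceeds the threshold is saturated to it.
    clipped = [max(-threshold, min(threshold, s)) for s in samples]
    return clipped, threshold


def clipped_ratio(samples, threshold):
    """Fraction of samples sitting at (or beyond) the clipping level."""
    return sum(1 for s in samples if abs(s) >= threshold) / len(samples)


# Hypothetical test signal: one second of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
clipped, threshold = clip_to_fraction(tone, 0.5)
```

Choosing the clipping level from a quantile of the magnitudes, rather than fixing it in advance, keeps the clipped proportion comparable across recordings with different loudness.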
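Table of Contents

Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Introduction
1.2 Literature Review
1.2.1 Speaker Identification
1.2.2 Background on Signal Restoration Techniques
1.3 Thesis Organization
Chapter 2 Problem Definition and Corpus Design
2.1 Problem Definition
2.2 Experimental Corpus Design
2.3 Effect of Clipping Distortion on Speaker Identification
2.4 Identification Procedure
2.5 Feature Extraction
2.5.1 Pre-emphasis
2.5.2 Framing
2.5.3 Hamming Window
2.5.4 Fast Fourier Transform (FFT)
2.5.5 Mel-Scale Triangular Band-Pass Filters
2.6 Speaker Model Construction
2.6.1 Gaussian Mixture Model
2.6.2 Speaker Model Training Procedure
2.6.3 Vector Quantization
2.6.4 EM Algorithm
2.7 Speaker Identification
Chapter 3 Research Method
3.1 System Architecture
3.2 Neural Networks
3.2.1 Recurrent Neural Network
Chapter 4 Experimental Results and Evaluation
4.1 Evaluation of Speaker Identification Results Based on Gaussian Mixture Models
4.1.1 Experiment 1: Identification of Clipped Audio with Gaussian Mixture Models
4.1.2 Experiment 2: Identification with Gaussian Mixture Models after Restoration with Adobe Audition
4.1.3 Experiment 3: Identification with Gaussian Mixture Models after RNN Restoration
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References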
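The first three front-end steps named in Sections 2.5.1 to 2.5.3 of the outline (pre-emphasis, framing, Hamming window) can be sketched in a few lines. This is a generic illustration and not the thesis's code: the pre-emphasis coefficient 0.97, the frame length of 256 samples, the hop of 128 samples, and all function names are assumptions, since this record gives no parameter values. The FFT and mel filter bank stages are omitted.

```python
import math


def pre_emphasis(x, alpha=0.97):
    """y[n] = x[n] - alpha * x[n-1]; alpha = 0.97 is an assumed value."""
    if not x:
        return []
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]


def frame_signal(x, frame_len, hop):
    """Split x into overlapping frames, dropping any incomplete tail."""
    return [x[s:s + frame_len] for s in range(0, len(x) - frame_len + 1, hop)]


def hamming(n_points):
    """Hamming window: w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1))."""
    if n_points == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def windowed_frames(x, frame_len=256, hop=128, alpha=0.97):
    """Pre-emphasize, frame, and window a signal (assumed parameters)."""
    window = hamming(frame_len)
    emphasized = pre_emphasis(x, alpha)
    return [[s * w for s, w in zip(frame, window)]
            for frame in frame_signal(emphasized, frame_len, hop)]
```

Each windowed frame would then be passed to an FFT and a mel-scale filter bank to obtain the spectral features used by the speaker models.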
Electronic full text (public Internet release date: 2021-08-09)