National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

Author: 林季暉
Author (English): Chi-hui Lin
Title: 以類神經網路深度學習的語音增強方法
Title (English): A Modified Deep Neural Network Speech Enhancement Model
Advisor: 王益文
Committee: 王壘, 陳德生
Defense date: 2016-07-18
Degree: Master's
Institution: 逢甲大學 (Feng Chia University)
Department: 資訊工程學系 (Information Engineering)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2016
Academic year of graduation: 104
Language: Chinese
Pages: 48
Keywords (Chinese): 深度類神經網路, 語音增強, 彈性傳播演算法
Keywords (English): Deep Neural Networks, Speech Enhancement, Resilient Propagation
Statistics:
  • Cited by: 1
  • Views: 475
  • Downloads: 95
  • Bookmarked: 1
Abstract (translated from the Chinese): Because modern mobile devices place ever-higher demands on speech quality, algorithms that improve speech intelligibility are urgently needed. This thesis develops a speech-enhancement system based on deep multi-layer neural networks, building on the work of Y. Xu and modifying both the DNN architecture and the MLP learning algorithm so that the DNNs learn more efficiently. The DNN architecture consists of a deep MLP formed by unrolling a stack of RBMs, followed by one additional MLP layer; in this final layer every neuron has a linear activation function and the weights are initialized to the identity matrix. To train the DNN, conventional back-propagation is abandoned and replaced with resilient propagation. Experiments on the NOIZEUS speech dataset verify that these three adjustments accelerate DNN learning. Four characteristic domains of the speech data are analyzed: noise intensity, noise type, sentence, and speaker gender. Identifying the key characteristics in each domain and the relationships among them allows the training set to be reduced and the learning burden lightened. Finally, to control the parameters of the DNN speech-learning model effectively, the relationship between the DNNs' learning results and the quality of the enhanced speech is analyzed.
Speech intelligibility is more essential than before because more and more mobile devices need to improve speech quality. This thesis develops a more efficient deep neural network (DNN) speech-enhancement model based on Y. Xu's DNN speech-enhancement model, modifying both the structure of the DNNs and the learning algorithm of the MLPs to achieve efficient DNN learning. A deep MLP is composed by unrolling a stack of RBMs and appending one layer to the last stage of the MLP. In this last layer, each neuron has a linear activation function, and the weights are initialized to the identity matrix. Instead of back-propagation, resilient propagation is employed to train the DNN. Several experiments on the NOIZEUS speech dataset verify that these three modifications speed up DNN learning. The key characteristics of, and mutual relationships among, noise intensities, noise types, sentences, and speaker gender are also identified to reduce the size of the training dataset. Finally, to control the parameters of the DNN speech-enhancement model effectively, correlations between the DNNs' learning results and the quality of the enhanced speech are analyzed.
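Two of the modifications named in the abstract, a linear output layer initialized to the identity matrix and resilient propagation (RPROP, Riedmiller and Braun [17]) in place of plain back-propagation, can be sketched as follows. This is a minimal toy illustration, not the thesis code: the dimensions, hyper-parameters, and synthetic data are assumptions, and the RPROP variant shown omits weight backtracking.

```python
import numpy as np

# Sketch of two modifications from the abstract (toy setting, assumed shapes):
#  1. the final linear layer starts as an identity mapping;
#  2. weights are updated with RPROP, which uses only the SIGN of the
#     gradient together with a per-weight adaptive step size.

rng = np.random.default_rng(0)
d = 8                            # feature dimension (assumed)
W = np.eye(d)                    # final linear layer, identity-initialized

X = rng.normal(size=(200, d))    # stand-in "noisy" features (toy data)
T = X @ (0.9 * np.eye(d))        # stand-in "clean" targets (toy mapping)

step = np.full((d, d), 0.01)     # per-weight step sizes (Delta)
prev_sign = np.zeros((d, d))
eta_plus, eta_minus = 1.2, 0.5   # classic RPROP step-scaling factors
step_min, step_max = 1e-6, 1.0

for epoch in range(200):
    err = X @ W - T                      # forward pass of the linear layer
    grad = X.T @ err / len(X)            # gradient of 0.5 * MSE w.r.t. W
    sign = np.sign(grad)
    same = prev_sign * sign              # >0: same direction, <0: flipped
    step = np.where(same > 0, np.minimum(step * eta_plus, step_max), step)
    step = np.where(same < 0, np.maximum(step * eta_minus, step_min), step)
    sign = np.where(same < 0, 0.0, sign) # skip the update after a sign flip
    W -= sign * step                     # update ignores gradient magnitude
    prev_sign = sign

print(float(np.mean((X @ W - T) ** 2)))  # mean-squared error, small after training
```

Because the update direction depends only on the gradient's sign, RPROP is insensitive to the vanishing gradient magnitudes that slow plain back-propagation in deep nets, which is presumably why the thesis adopts it; the identity initialization lets the appended layer start as a harmless pass-through.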
Acknowledgements i
Abstract (Chinese) ii
Abstract iii
Contents iv
List of Figures vi
List of Tables vii
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Thesis Organization 2
Chapter 2 Related Work on Neural Networks 3
2.1 Multilayer Perceptrons and the Gradient Descent Algorithm 4
2.1.1 Resilient Propagation Algorithm 7
2.2 Restricted Boltzmann Machines and Contrastive Divergence Learning 8
Chapter 3 Methodology 12
3.1 DNN-Based Speech-Enhancement Learning System 13
3.1.1 Feature Extraction 14
3.1.2 Waveform Reconstruction 15
3.2 Deep Neural Network Model 16
3.2.1 Pretraining DNNs with Noisy Features 17
3.2.2 Training DNNs with Noisy and Clean Features 17
3.3 Speech-Enhancement System 19
Chapter 4 Experiments 20
4.1 Implementation Details 20
4.2 Verification that the Modified DNNs Learn More Efficiently 22
4.2.1 Improvement of the Gradient Descent Algorithm 22
4.2.2 Improvement of the Initial Final-Layer Weights of the MLPs 23
4.2.3 Improvement of the DNN Architecture 23
4.3 Effect of Different Training Speech Sets on Learning 25
4.3.1 Effect of Noise Intensity on Speech Learning 27
4.3.2 Effect of Noise Type on Speech Learning 29
4.3.3 Effect of Sentences on Speech Learning 31
4.3.4 Effect of Speakers on Speech Learning 34
4.4 Relationship Between DNN Learning and Speech Enhancement 35
4.4.1 Enhancement After DNNs Learn Magnitude Information 36
4.4.2 Hypothetical Enhancement If DNNs Learned Full Frequency-Domain Information 40
Chapter 5 Conclusion 44
References 46
[1] D. Burshtein and S. Gannot, “Speech enhancement using a mixture-maximum model,” IEEE Trans. on Speech and Audio Processing, vol. 10, no. 6, pp. 341-351, 2002.
[2] L. Bahl, P. Brown, P. de Souza, and R. Mercer, “Maximum mutual information estimation of hidden Markov model parameters for speech recognition,” Proceedings of the ICASSP, pp. 49-52, 1986.
[3] Y. Bengio, “Learning deep architectures for AI,” Found. Trends Mach. Learn., vol. 2, no. 1, pp. 1-127, 2009.
[4] I. Cohen and S. Gannot, “Spectral enhancement methods,” in Springer Handbook of Speech Processing, J. Benesty, M. M. Sondhi, and Y. Huang, Eds. Berlin, Germany: Springer, pp. 873-901, 2008.
[5] L. Deng, “Computational models for speech production,” in Computational Models of Speech Pattern Processing, pp. 199-213, Springer-Verlag, New York, 1999.
[6] D. Griffin and J. S. Lim, “Signal estimation from modified short-time Fourier transform,” IEEE Trans. on ASSP, vol. 32, no. 2, pp. 236-243, 1984.
[7] S. Haykin, Feedforward Neural Networks: An Introduction, Wiley, 1998.
[8] G. E. Hinton, “Training products of experts by minimizing contrastive divergence,” Neural Computation, vol. 14, no. 8, pp. 1771-1800, 2002.
[9] G. E. Hinton, S. Osindero, and Y. Teh, “A fast learning algorithm for deep belief nets,” Neural Computation, vol. 18, pp. 1527-1554, 2006.
[10] G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science, vol. 313, no. 5786, pp. 504-507, 2006.
[11] G. E. Hinton, L. Deng, D. Yu, and G. E. Dahl, “Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups,” IEEE Signal Process. Mag., vol. 29, no. 6, pp. 82-97, 2012.
[12] H. Hirsch and D. Pearce, “The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions,” ISCA ITRW ASR2000, Paris, France, September 18-20, 2000.
[13] J. J. Hopfield, “Neural networks and physical systems with emergent collective computational abilities,” Proceedings of the National Academy of Sciences, vol. 79, pp. 2554-2558, 1982.
[14] B.-K. Lee and J.-H. Chang, “Packet loss concealment based on deep neural networks for digital speech transmission,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 2, Feb. 2016.
[15] P. C. Loizou, Speech Enhancement: Theory and Practice, Taylor and Francis, 2007.
[16] Y. Hu and P. C. Loizou, “Subjective comparison and evaluation of speech enhancement algorithms,” Speech Communication, vol. 49, no. 7, pp. 588-601, 2007.
[17] M. Riedmiller and H. Braun, “A direct adaptive method for faster backpropagation learning: The RPROP algorithm,” Proceedings of the IEEE International Conference on Neural Networks, San Francisco, CA, April 1993.
[18] IEEE Subcommittee, “IEEE recommended practice for speech quality measurements,” IEEE Trans. Audio and Electroacoustics, vol. AU-17, no. 3, pp. 225-246, 1969.
[19] J. Tchorz and B. Kollmeier, “SNR estimation based on amplitude modulation analysis with applications to noise suppression,” IEEE Trans. Speech Audio Process., vol. 11, no. 3, pp. 184-192, May 2003.
[20] P. J. Werbos, “Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences,” PhD thesis, Harvard University, 1974.
[21] S. I. Tamura, “An analysis of a noise reduction neural network,” Proc. ICASSP, 1989, pp. 2001-2004.
[22] E. A. Wan and A. T. Nelson, “Networks for speech enhancement,” in Handbook of Neural Networks for Speech Processing, S. Katagiri, Ed. Norwell, MA, USA: Artech House, 1998.
[23] Y. Xu, J. Du, L.-R. Dai, and C.-H. Lee, “An experimental study on speech enhancement based on deep neural networks,” IEEE Signal Process. Lett., vol. 21, no. 1, pp. 65-68, Jan. 2014.