National Digital Library of Theses and Dissertations in Taiwan

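Graduate student: 洪意婷
Graduate student (English): Yi-Ting Hung
Thesis title: 自連續語音辨識特定情緒之研究
Thesis title (English): A Study on the Recognition of Specified Emotion from Continuous Speech Signals
Advisor: 包蒼龍
Advisor (English): Tsang-Long Pao
Oral examination committee: 包蒼龍
Oral examination committee (English): Tsang-Long Pao
Oral defense date: 2015-07-17
Degree: Master's
Institution: Tatung University
Department: Department of Computer Science and Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2015
Academic year of graduation: 103 (ROC calendar)
Language: Chinese
Pages: 52
Keywords (Chinese): continuous speech; emotion recognition; speech recognition
Keywords (English): Emotion Recognition; Speech Recognition; Continuous Speech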
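Speech emotion recognition has a computer take the role of the everyday listener and automatically determine the speaker's emotion. To come closer to ordinary person-to-person conversation, continuous speech emotion recognition from natural dialogue merits in-depth study. Recognizing specific emotions in continuous speech can support call center systems, crisis lifelines, and similar services by screening for emotions that need special attention.
To reduce computation time and improve efficiency in practical applications, we reduce the number of feature vectors used during recognition while maintaining an acceptable recognition rate. This study uses two corpora, the Berlin Database of Emotional Speech (Emo-DB) and the Mandarin Chinese Emotional Corpus 2010 (MCEC2010), and selects the two most representative of thirteen features to build models for recognizing specific emotions. The continuous speech is then segmented, a Deep Neural Network (DNN) is used as the classifier, and the output is compared against the specific-emotion models derived earlier. The experiments yield an average recognition rate of 83.78%; with conversational utterances, the average recognition rate is 92.6%.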
Speech emotion recognition is the process of recognizing a speaker's emotion from the uttered speech signal. To be practical, continuous speech emotion recognition needs natural dialogues as its training and testing corpus. Recognizing specific emotions in continuous speech can assist call center systems, lifelines, and other telephone services. To reduce computation time and improve efficiency in practical applications, we reduce the number of features used in the recognition process while maintaining an acceptable recognition rate. In this research, we use two corpora, the Berlin Database of Emotional Speech (Emo-DB) and the Mandarin Chinese Emotional Corpus 2010 (MCEC2010), and select the two most representative of thirteen features to establish specific emotion recognition models. In the experiment, we segment the continuous speech and then use a Deep Neural Network (DNN) as the classifier. The average recognition rate is 83.78%. In the experiment with conversational utterances from the MCEC2010 corpus, the average recognition rate is 92.6%.
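The abstract describes segmenting continuous speech and computing a small set of features per segment, but it does not say which segmentation method or which two features were chosen. The following is only a minimal sketch of that general idea under stated assumptions: it uses a simple log-energy threshold to find voiced runs and computes log energy and zero crossing rate, two of the features named in the thesis outline, for each frame. The sample rate, frame length, threshold, toy signal, and all function names are hypothetical, and the DNN classification stage is omitted.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def log_energy(frame, eps=1e-10):
    """Log of the summed squared amplitude; eps avoids log(0) on silence."""
    return math.log(sum(x * x for x in frame) + eps)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def segment_by_energy(energies, threshold):
    """Return (start, end) frame-index pairs, end exclusive, for runs above threshold."""
    segments, start = [], None
    for i, e in enumerate(energies):
        if e > threshold and start is None:
            start = i
        elif e <= threshold and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(energies)))
    return segments

# Toy input (assumed): silence, a 200 Hz tone, silence, at an assumed 8 kHz rate.
sr = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * n / sr) for n in range(1600)]
signal = [0.0] * 800 + tone + [0.0] * 800

frames = frame_signal(signal, frame_len=200, hop=200)
energies = [log_energy(f) for f in frames]
segments = segment_by_energy(energies, threshold=-5.0)  # threshold is an assumption
features = [(log_energy(f), zero_crossing_rate(f)) for f in frames]
```

In a fuller pipeline, the per-frame feature pairs inside each detected segment would be passed to a classifier. Keeping the feature vector this small reflects the abstract's stated goal of reducing computation while keeping the recognition rate acceptable.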
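Acknowledgments
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Preface
1.2 Motivation and Objectives
1.3 Thesis Organization
Chapter 2 Literature Review
2.1 Emotion Categories
2.2 Continuous Speech Segmentation
2.3 Speech Features
2.3.1 Formant, Shimmer, and Jitter
2.3.2 Linear Predictive Coefficients (LPC) and Linear Prediction Cepstral Coefficients (LPCC)
2.3.3 Mel-Frequency Cepstral Coefficients (MFCC), First Derivative of MFCC (dMFCC), and Second Derivative of MFCC (ddMFCC)
2.3.4 Log Frequency Power Coefficients
2.3.5 Perceptual Linear Prediction (PLP) and RelAtive SpecTrAl PLP (RastaPLP)
2.3.6 Log Energy and Zero Crossing Rate
2.4 Feature Selection
2.5 Classifiers
2.6 Corpora
2.6.1 Berlin Database of Emotional Speech
2.6.2 Mandarin Chinese Emotional Corpus 2010
Chapter 3 System Design and Architecture
3.1 System Architecture
3.2 Continuous Speech Segmentation
3.3 Speech Features and Feature Selection
3.4 Evaluation of Experimental Results
Chapter 4 Results
4.1 Classifiers and Corpora
4.2 Feature Selection for Specific Speech Emotions
4.3 Recognizing Specific Speech Emotions from Continuous Speech
Chapter 5 Conclusions and Future Work
References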