National Digital Library of Theses and Dissertations in Taiwan

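Graduate student: 林裕凱
Thesis title (Chinese): 階層式的人聲分類與鼾聲聲學特性分析中的特徵篩選
Thesis title (English): Feature Selection in Hierarchical Classification of Human Sounds and Acoustic Analysis of Snoring Signals
Advisor: 廖文宏
Degree: Master's
University: National Chengchi University (國立政治大學)
Department: Department of Computer Science (資訊科學學系)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2008
Academic year of graduation: 96 (ROC calendar)
Language: Chinese
Pages: 74
Keywords: human sound classification, acoustic feature selection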
Human sounds can be roughly divided into two categories: speech and non-speech. Traditional audio classification research has emphasized sorting audio signals into speech, music, and environmental sounds. This thesis takes a different perspective and focuses on non-speech human sounds, specifically laughter, screams, sneezes, and snores. Toward this goal, we investigate several commonly used acoustic features and use multivariate adaptive regression splines (MARS) and support vector machines (SVM) to select the features that are most representative for classifying non-speech human sounds. To evaluate the robustness of the selected features, we also run extensive simulations to observe the effect of noise on classification accuracy.
The second part of this thesis studies snoring. We compare the performance of an ordinary microphone, a medical snoring microphone, and a piezo sensor in detecting snores. In addition, we cluster ordinary snores and obstructive snores using two dissimilarity measures, KL divergence and Earth Mover's Distance (EMD). As in the first part, we add noise at different levels to the snoring signals to test how robust the two methods are to noise. The results show that both methods perform well, with EMD giving better results in most cases.
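The abstract names KL divergence and EMD as the dissimilarity measures used to cluster snores, and noise injection as the robustness test, but this record does not give the exact formulation. The following is a minimal sketch under stated assumptions: each snore is summarized as a histogram over equally spaced bins (for example a normalized magnitude spectrum), KL divergence is symmetrized by summing both directions, EMD uses unit distance between adjacent bins, and noise is white Gaussian at a chosen signal-to-noise ratio. The function names are hypothetical and are not taken from the thesis.

```python
import numpy as np


def normalize(hist, eps=1e-12):
    """Turn a non-negative histogram into a probability distribution.

    A small eps is added (an assumption of this sketch) so that empty
    bins do not produce log(0) or division by zero.
    """
    h = np.asarray(hist, dtype=float) + eps
    return h / h.sum()


def symmetric_kl(p, q):
    """Symmetrized KL divergence: KL(p||q) + KL(q||p)."""
    p, q = normalize(p), normalize(q)
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))


def emd_1d(p, q):
    """Earth Mover's Distance between two 1-D histograms on the same bins.

    With unit distance between adjacent bins, the 1-D EMD equals the
    sum of absolute differences between the cumulative distributions.
    """
    p, q = normalize(p), normalize(q)
    return float(np.sum(np.abs(np.cumsum(p) - np.cumsum(q))))


def add_white_noise(signal, snr_db, rng):
    """Add white Gaussian noise so the result has roughly the given SNR in dB."""
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return signal + rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)


if __name__ == "__main__":
    a = [0.0, 1.0, 0.0, 0.0]
    b = [0.0, 0.0, 0.0, 1.0]
    print("symmetric KL:", symmetric_kl(a, b))
    print("EMD:", emd_1d(a, b))
```

One design difference is visible in the toy histograms: KL divergence compares bins one by one, so it assigns the same large value to any two non-overlapping distributions, whereas EMD grows with how far the mass has to move along the bin axis.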
Table of Contents

Chapter 1 Introduction
1.1 Research Background and Objectives
1.2 Related Work
1.2.1 Human Sound Classification
1.2.2 Snoring Research
1.3 Thesis Organization
Chapter 2 Acoustic Feature Analysis
2.1 Fundamental Frequency
2.2 Spectral Centroid
2.3 Spectral Spread
2.4 Spectral Flatness
2.5 Entropy
2.6 Formant Frequency
2.7 Mel-Scale Frequency Cepstral Coefficients (MFCC)
Chapter 3 Classifiers
3.1 Multivariate Adaptive Regression Splines (MARS)
3.2 Support Vector Machine (SVM)
3.2.1 Linearly Separable Patterns
3.2.2 Non-Linearly Separable Patterns
Chapter 4 Human Sound Classification
4.1 Acoustic Feature Selection
4.2 Effect of Noise on Classification
Chapter 5 Snoring Research
5.1 Acoustic Features of Snoring
5.2 Snoring and Physiological Signals
5.3 Comparison of Snore Detection Instruments
5.3.1 Signal Endpoint Detection
5.3.2 Acoustic Signals and Vibration Signals
5.4 Clustering of Snores
5.4.1 KL Divergence
5.4.2 Effect of Noise on KL Divergence
5.4.3 Earth Mover's Distance
5.4.4 Effect of Noise on EMD
Chapter 6 Conclusion
References
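
Chapter 2 of the outline lists spectral centroid, spread, and flatness among the candidate features. This record does not reproduce the thesis's definitions, so the following is only a sketch using common textbook forms, computed on the power spectrum of one Hann-windowed frame: the power-weighted mean frequency, the power-weighted standard deviation around it, and the ratio of geometric to arithmetic mean power. The function name and the `eps` guard are assumptions made for illustration.

```python
import numpy as np


def spectral_descriptors(frame, sample_rate, eps=1e-12):
    """Return (centroid_hz, spread_hz, flatness) for one audio frame."""
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hanning(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    total = power.sum() + eps
    # Centroid: power-weighted mean frequency.
    centroid = float((freqs * power).sum() / total)
    # Spread: power-weighted standard deviation around the centroid.
    spread = float(np.sqrt((((freqs - centroid) ** 2) * power).sum() / total))
    # Flatness: geometric mean over arithmetic mean of the power spectrum.
    # Values near 0 indicate a tonal frame, larger values a noise-like one.
    flatness = float(np.exp(np.mean(np.log(power + eps))) / (np.mean(power) + eps))
    return centroid, spread, flatness


if __name__ == "__main__":
    sample_rate = 8000
    t = np.arange(1024) / sample_rate
    tone = np.sin(2 * np.pi * 1000.0 * t)
    print(spectral_descriptors(tone, sample_rate))
```

In a feature-selection setting such as the one the outline describes, values like these would typically be computed per frame and then summarized per sound clip before being passed to a classifier.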