Student: Whenty Ariyanti (黎亞媞)
Title: Ensemble and Multimodal Learning for Pathological Voice Classification
Advisors: Jia-Ching Wang (王家慶), Yu Tsao (曹昱)
Degree: Master's
Institution: National Central University
Department: Department of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Year of Publication: 2020
Academic Year: 108 (2019–2020)
Language: English
Pages: 63
Keywords: Pathological Voice, Acoustic Signal, Ensemble Learning, Binary Classification
Abstract: Voice disorders are among the most common medical conditions in modern society, especially for people with occupational voice demands. In this paper, we investigate a stacked ensemble learning method that classifies pathological voice disorders by combining acoustic signals and medical records. In the proposed ensemble learning framework, stacked support vector machines (SVMs) form a set of weak classifiers, and a deep neural network (DNN) serves as the meta-learner. Exploiting the modeling capacity of the DNN, acoustic features and medical records are combined to attain better classification performance. The proposed approach outperforms single SVM and DNN classifiers by a notable margin.
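To make the architecture described in the abstract more concrete, the sketch below shows one way a stacked ensemble with SVM base classifiers and a neural-network meta-learner could be wired together. It is a minimal illustration assuming scikit-learn; the synthetic feature arrays, kernel choices, layer sizes, and train/test split are hypothetical placeholders, not the configuration actually used in the thesis.

```python
# Minimal sketch of a stacked ensemble in the spirit of the abstract:
# SVMs act as weak (base) classifiers, and a small neural network serves
# as the meta-learner that fuses their outputs. All data and
# hyperparameters below are illustrative placeholders.
import numpy as np
from sklearn.ensemble import StackingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400
acoustic = rng.normal(size=(n, 39))   # MFCC-like acoustic features (synthetic placeholder)
records = rng.normal(size=(n, 10))    # encoded medical-record fields (synthetic placeholder)
X = np.hstack([acoustic, records])    # simple fusion of the two modalities
y = rng.integers(0, 2, size=n)        # binary label: pathological vs. normal (synthetic)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# Base level: several SVMs with different kernels as the "weak" classifiers.
base_learners = [
    ("svm_rbf", make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, probability=True))),
    ("svm_lin", make_pipeline(StandardScaler(), SVC(kernel="linear", C=0.5, probability=True))),
    ("svm_poly", make_pipeline(StandardScaler(), SVC(kernel="poly", degree=3, probability=True))),
]

# Meta level: a small fully connected network standing in for the DNN meta-learner,
# trained on the base classifiers' out-of-fold predicted probabilities.
meta_learner = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0)

stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=meta_learner,
    stack_method="predict_proba",
    cv=5,
)
stack.fit(X_tr, y_tr)
print("held-out accuracy:", stack.score(X_te, y_te))
```

Setting stack_method="predict_proba" feeds the base SVMs' class probabilities, computed out-of-fold via cross-validation, to the meta-learner, which mirrors the stacking idea of training the meta-level model on the weak classifiers' outputs rather than on the raw features.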
摘要 i
ABSTRACT ii
ACKNOWLEDGEMENT iii
TABLE OF CONTENTS iv
LIST OF FIGURES vi
LIST OF TABLES vii
LIST OF ABBREVIATIONS viii
CHAPTER 1 INTRODUCTION 1
CHAPTER 2 REVIEW OF LITERATURE 3
2.1. Pathological Voice Disorders 4
2.1.1 Classification of Voice Disorders 5
2.2. Support Vector Machine Classifier (SVM) 6
2.2.1 Multiclass Support Vector Machine 9
2.3. Deep Neural Network (DNN) 12
2.4. Ensemble Learning 15
2.4.1 Ensemble Learning Process 17
CHAPTER 3 IDENTIFICATION STRATEGIES 23
3.1 Overview 23
3.1 Classifier Design 25
3.2 Pre-Processor 27
3.2.1 Transformation 27
3.2.2 Feature Extraction 28
3.2.3 Normalization 32
3.2.4 Splitting 33
3.3 Performance Measures 33
CHAPTER 4 ENSEMBLE AND MULTIMODAL LEARNING FOR PATHOLOGICAL VOICE CLASSIFICATION 35
4.1 Overview of Dataset 35
4.1.1 Acoustic Signals 35
4.1.2 Medical Records 37
4.2 System Implementation 41
4.2.1 Pre-processing 41
4.2.2 Experiment Design 41
4.3 Experiment Results 43
4.3.1 Single Feature Results 43
4.3.2 Ensemble and Multimodal Learning Results 44
CHAPTER 5 CONCLUSIONS 46
5.1 Accomplishments 46
5.2 Limitations 46
5.3 Future Research Directions 47
BIBLIOGRAPHY 48
[1] S.-H. Fang, C.-T. Wang, J.-Y. Chen, Y. Tsao, and F.-C. Lin, "Combining acoustic signals and medical records to improve pathological voice classification," APSIPA Transactions on Signal and Information Processing, 2019.
[2] S. R. Schwartz, S. M. Cohen, S. H. Dailey, R. M. Rosenfeld, E. S. Deutsch, M. B. Gillespie, E. Granieri, E. R. Hapner, C. E. Kimball, H. J. Krouse et al., "Clinical practice guideline: Hoarseness (dysphonia)," Otolaryngology-Head and Neck Surgery, vol. 141, pp. 1–31, 2009.
[3] G. Vaziri, F. Almasganj, and R. Behroozmand, "Pathological assessment of patients' speech signals using nonlinear dynamical analysis," Computers in Biology and Medicine, vol. 40, no. 1, pp. 128–134, 2006.
[4] S. R. Savithri, "Clinical voice evaluation," http://docplayer.net/53758736-Clinical-voice-evaluation.html (last accessed March 20, 2020).
[5] H. Kasuya, S. Ogawa, Y. Kikuchi, and S. Ebihara, "An acoustic analysis of pathological voice and its application to the evaluation of laryngeal pathology," Speech Communication, vol. 5, no. 2, pp. 171–181, 1986.
[6] C. Maguire, P. de Chazal, R. B. Reilly, and P. D. Lacy, "Identification of voice pathology using automated speech analysis," in Third International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, 2003.
[7] R. Herbrich, Learning Kernel Classifiers: Theory and Algorithms, MIT Press, 2002.
[8] R. Schapire and Y. Singer, "Improved boosting algorithms using confidence-rated predictions," in COLT, 1998.
[9] E. Allwein, R. Schapire, and Y. Singer, "Reducing multiclass to binary: A unifying approach for margin classifiers," Journal of Machine Learning Research, pp. 113–141, 2000.
[10] H. Schwenk, "Using boosting to improve a hybrid HMM/neural network speech recognizer," in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1009–1012, 1999.
[11] G. Zweig, "Boosting Gaussian mixtures in an LVCSR system," in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1527–1530, 2000.
[12] T. Dietterich and G. Bakiri, "Solving multiclass learning, boosting and error-correcting codes," in COLT, pp. 145–155, 1999.
[13] D. Yu and L. Deng, Automatic Speech Recognition: A Deep Learning Approach, Signals and Communication Technology, Springer (Chapter 1), 2015.
[14] J. Li and L. Deng, Robust Automatic Speech Recognition: A Bridge to Practical Applications, Springer (Chapter 2), 2016.
[15] N. Roy, R. M. Merrill, S. Thibeault, R. A. Parsa, S. D. Gray, and E. M. Smith, "Prevalence of voice disorders in teachers and the general population," Journal of Speech, Language, and Hearing Research, vol. 47, no. 2, pp. 281–293, 2004.
[16] M. Bansal, Diseases of Ear, Nose, and Throat, Jaypee Brothers Medical Publishers, 2013.
[17] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, pp. 273–297, 1995.
[18] M. Mohammed, M. B. Khan, and E. B. M. Bashier, Machine Learning: Algorithms and Applications, CRC Press, Boca Raton, pp. 115–126, 2017.
[19] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, pp. 273–297, 1995.
[20] C.-W. Hsu and C.-J. Lin, "A comparison of methods for multi-class support vector machines," IEEE Transactions on Neural Networks, vol. 13, pp. 415–425, 2002.
[21] D. Ruta and B. Gabrys, "Classifier selection for majority voting," Information Fusion, vol. 6, no. 1, pp. 63–81, 2005.
[22] D. Yu and L. Deng, Automatic Speech Recognition: A Deep Learning Approach, Signals and Communication Technology, Springer (Chapter 4), 2015.
[23] P. Werbos, "Beyond regression: New tools for prediction and analysis in the behavioral sciences," Ph.D. thesis, Harvard University, Cambridge, MA, 1974.
[24] J. Mendes-Moreira, C. Soares, A. M. Jorge, and J. F. De Sousa, "Ensemble approaches for regression," ACM Computing Surveys, vol. 45, no. 1, pp. 1–40, 2012, DOI: 10.1145/2379776.2379786.
[25] A. Strehl and J. Ghosh, "Cluster ensembles — a knowledge reuse framework for combining multiple partitions," Journal of Machine Learning Research, vol. 3, pp. 583–617, 2003, DOI: 10.1162/153244303321897735.
[26] T. G. Dietterich, "Ensemble methods in machine learning," in First International Workshop on Multiple Classifier Systems, vol. 1857, pp. 1–15, 2000.
[27] L. K. Hansen and P. Salamon, "Neural network ensembles," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 10, pp. 993–1001, 1990.
[28] R. Polikar, "Ensemble learning," in Ensemble Machine Learning, Springer US, Boston, MA, pp. 1–34, 2012.
[29] T. Kinnunen and H. Li, "An overview of text-independent speaker recognition: From features to supervectors," Speech Communication, pp. 12–40, 2010.
[30] S. Davis and P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 28, no. 4, pp. 357–366, 1980.
[31] J. I. Godino-Llorente, P. Gomez-Vilda, and M. Blanco-Velasco, "Dimensionality reduction of a pathological voice quality assessment system based on Gaussian mixture models and short-term cepstral parameters," IEEE Transactions on Biomedical Engineering, vol. 53, no. 10, pp. 1943–1953, 2006.
[32] S.-H. Fang, Y. Tsao, M.-J. Hsiao, J.-Y. Chen, Y.-H. Lai, F.-C. Lin, and C.-T. Wang, "Detection of pathological voice using cepstrum vectors: A deep learning approach," Journal of Voice, pp. 634–641, 2019.
[33] B.-H. Juang, L. R. Rabiner, and J. G. Wilpon, "On the use of bandpass liftering in speech recognition," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 35, no. 7, pp. 947–954, 1987.
[34] D. Zhang, D. Gatica-Perez, S. Bengio, and I. McCowan, "Semi-supervised adapted HMMs for unusual event detection," in IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, pp. 611–618, 2005.
[35] K. Fukunaga and P. M. Narendra, "A branch and bound algorithm for computing k-nearest neighbors," IEEE Transactions on Computers, vol. C-24, no. 7, pp. 750–753, 1975.
[36] M. A. Friedl and C. E. Brodley, "Decision tree classification of land cover from remotely sensed data," Remote Sensing of Environment, vol. 61, no. 3, pp. 399–409, 1997.
[37] M. Varma and B. R. Babu, "More generality in efficient multiple kernel learning," in Proceedings of the 26th Annual International Conference on Machine Learning (ICML), 2009.
[38] T. Fawcett, "An introduction to ROC analysis," Pattern Recognition Letters, vol. 27, no. 8, pp. 861–874, 2006.