National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

Author: Chu-Chih Huang (黃炬智)
Title: Classification of Chinese Articulation Disorder based on Deep Learning Model (適用於分類中文構音障礙之深度學習模型)
Advisor: Shanq-Jang Ruan (阮聖彰)
Committee members: Po-Ling Kuo (郭柏齡), Tzu-Chen Tang (湯梓辰)
Oral defense date: 2019-08-15
Degree: Master's
Institution: National Taiwan University of Science and Technology (國立臺灣科技大學)
Department: Electronic Engineering (電子工程系)
Discipline: Engineering
Field: Electrical and Information Engineering
Thesis type: Academic thesis
Publication year: 2019
Graduation academic year: 107 (ROC calendar)
Language: Chinese
Keywords: Articulation disorder, Deep learning, Convolutional neural network, LeNet-5


Articulation disorder refers to errors or difficulties in producing speech sounds, which lead to incorrect pronunciation and unclear speech. It has long been a common speech problem in children. The medical community in Taiwan currently has no unified classification of articulation disorders, so hospital treatment generally requires a speech therapist to carry out both the assessment and the therapy. For each category of articulation disorder, the therapist designs a series of words that target the sounds the child has difficulty producing and asks the child to repeat them. After the child has pronounced the series of words, the therapist makes a judgment based on the child's pronunciation, and the child returns for follow-up visits over several months to improve. Because children can receive treatment and feedback only at the hospital, the treatment cycle is prolonged. This thesis aims to diagnose articulation disorders automatically using a convolutional neural network (CNN). The results show that LeNet-5, with the smallest model size, achieves 94.56 Top-1 accuracy and an average F1-score of 0.995, which makes it better suited to classifying articulation disorders on mobile devices.
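The two metrics reported in the abstract can be illustrated with a short sketch. This is not the thesis's evaluation code: the class names and predictions below are made up for illustration, and treating "average F1-score" as the unweighted (macro) average over classes is an assumption, since the abstract does not say which averaging was used.

```python
from collections import Counter


def top1_accuracy(y_true, y_pred):
    """Fraction of samples whose single predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed meaning of 'average F1')."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels, not data from the thesis.
y_true = ["stopping", "stopping", "backing", "backing", "normal", "normal"]
y_pred = ["stopping", "backing", "backing", "backing", "normal", "normal"]

print(f"Top-1 accuracy: {top1_accuracy(y_true, y_pred):.4f}")
print(f"Macro F1:       {macro_f1(y_true, y_pred):.4f}")
```

Top-1 accuracy counts every sample equally, while macro F1 gives each class the same weight regardless of how many samples it has, so the two can diverge when classes are imbalanced.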

Table of Contents
Recommendation Form
Committee Form
Chinese Abstract
English Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
Introduction
Related Works
Proposed Method
Experimental Results
Conclusions

