[1] S. Narayanan and P. G. Georgiou, "Behavioral signal processing: Deriving human behavioral informatics from speech and language," Proceedings of the IEEE, vol. 101, pp. 1203-1233, 2013.
[2] S.-W. Hsiao, H.-C. Sun, M.-C. Hsieh, M.-H. Tsai, H.-C. Lin, and C.-C. Lee, "A multimodal approach for automatic assessment of school principals' oral presentation during pre-service training program," in Proc. INTERSPEECH, 2015.
[3] M. P. Black, A. Katsamanis, B. R. Baucom, C.-C. Lee, A. C. Lammert, A. Christensen, et al., "Toward automating a human behavioral coding system for married couples' interactions using speech acoustic features," Speech Communication, vol. 55, pp. 1-21, 2013.
[4] H.-Y. Chen, Y.-H. Liao, H.-T. Jan, L.-W. Kuo, and C.-C. Lee, "A Gaussian mixture regression approach toward modeling the affective dynamics between acoustically-derived vocal arousal score (VC-AS) and internal brain fMRI BOLD signal response," in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016, pp. 5775-5779.
[5] W.-C. Chen, P.-T. Lai, Y. Tsao, and C.-C. Lee, "Multimodal arousal rating using unsupervised fusion technique," in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015, pp. 5296-5300.
[6] A. Metallinou, Z. Yang, C.-C. Lee, C. Busso, S. Carnicke, and S. Narayanan, "The USC CreativeIT database of multimodal dyadic interactions: From speech and full body motion capture to continuous emotional annotations," Language Resources and Evaluation, vol. 50, pp. 497-521, 2016.
[7] F.-S. Tsai, Y.-L. Hsu, W.-C. Chen, Y.-M. Weng, C.-J. Ng, and C.-C. Lee, "Toward development and evaluation of pain level-rating scale for emergency triage based on vocal characteristics and facial expressions," in Proc. INTERSPEECH, 2016, pp. 92-96.
[8] D. Bone, C.-C. Lee, A. Potamianos, and S. S. Narayanan, "An investigation of vocal arousal dynamics in child-psychologist interactions using synchrony measures and a conversation-based model," in Proc. INTERSPEECH, 2014.
[9] E. Delaherche, M. Chetouani, F. Bigouret, J. Xavier, M. Plaza, and D. Cohen, "Assessment of the communicative and coordination skills of children with autism spectrum disorders and typically developing children using social signal processing," Research in Autism Spectrum Disorders, vol. 7, pp. 741-756, 2013.
[10] D. Bone, C.-C. Lee, M. P. Black, M. E. Williams, S. Lee, P. Levitt, et al., "The psychologist as an interlocutor in autism spectrum disorder assessment: Insights from a study of spontaneous prosody," Journal of Speech, Language, and Hearing Research, vol. 57, pp. 1162-1177, 2014.
[11] R. L. Spitzer and J. B. Williams, Diagnostic and Statistical Manual of Mental Disorders. Washington, DC: American Psychiatric Association, 1980.
[12] E. K. Delinicolas and R. L. Young, "Joint attention, language, social relating, and stereotypical behaviours in children with autistic disorder," Autism, vol. 11, pp. 425-436, 2007.
[13] J. Baio, "Prevalence of autism spectrum disorders: Autism and Developmental Disabilities Monitoring Network, 14 sites, United States, 2008," Morbidity and Mortality Weekly Report: Surveillance Summaries, vol. 61, no. 3, Centers for Disease Control and Prevention, 2012.
[14] C. Lord, S. Risi, L. Lambrecht, E. H. Cook, B. L. Leventhal, P. C. DiLavore, et al., "The Autism Diagnostic Observation Schedule-Generic: A standard measure of social and communication deficits associated with the spectrum of autism," Journal of Autism and Developmental Disorders, vol. 30, pp. 205-223, 2000.
[15] C. Lord, M. Rutter, and A. Le Couteur, "Autism Diagnostic Interview-Revised: A revised version of a diagnostic interview for caregivers of individuals with possible pervasive developmental disorders," Journal of Autism and Developmental Disorders, vol. 24, pp. 659-685, 1994.
[16] P. Boersma and D. Weenink, "Praat: A system for doing phonetics by computer [Computer software]," Institute of Phonetic Sciences, University of Amsterdam, The Netherlands, 2003.
[17] R. Paul, A. Augustyn, A. Klin, and F. R. Volkmar, "Perception and production of prosody by speakers with autism spectrum disorders," Journal of Autism and Developmental Disorders, vol. 35, pp. 205-220, 2005.
[18] D. Ververidis and C. Kotropoulos, "Emotional speech recognition: Resources, features, and methods," Speech Communication, vol. 48, pp. 1162-1181, 2006.
[19] B. McFee, C. Raffel, D. Liang, D. P. Ellis, M. McVicar, E. Battenberg, et al., "librosa: Audio and music signal analysis in Python," in Proceedings of the 14th Python in Science Conference, 2015, pp. 18-25.
[20] P. Boersma, "Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound," in Proceedings of the Institute of Phonetic Sciences, 1993, pp. 97-110.
[21] M. K. Sönmez, L. Heck, M. Weintraub, and E. Shriberg, "A lognormal tied mixture model of pitch for prosody-based speaker recognition," 1997.
[22] M. Farrús, "Jitter and shimmer measurements for speaker recognition," in Proc. INTERSPEECH, Antwerp, Belgium, 2007, pp. 778-781.
[23] H. Wang, A. Kläser, C. Schmid, and C.-L. Liu, "Action recognition by dense trajectories," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011, pp. 3169-3176.
[24] H. Wang and C. Schmid, "Action recognition with improved trajectories," in Proc. IEEE International Conference on Computer Vision (ICCV), 2013, pp. 3551-3558.
[25] A. Tamrakar, S. Ali, Q. Yu, J. Liu, O. Javed, A. Divakaran, et al., "Evaluation of low-level features and their combinations for complex event detection in open source videos," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012, pp. 3681-3688.
[26] J. Sun, Y. Mu, S. Yan, and L.-F. Cheong, "Activity recognition using dense long-duration trajectories," in Proc. IEEE International Conference on Multimedia and Expo (ICME), 2010, pp. 322-327.
[27] L. Baraldi, F. Paci, G. Serra, L. Benini, and R. Cucchiara, "Gesture recognition in ego-centric videos using dense trajectories and hand segmentation," in Proc. IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2014, pp. 688-693.
[28] F. Rosenblatt, "The perceptron: A probabilistic model for information storage and organization in the brain," Psychological Review, vol. 65, pp. 386-408, 1958.
[29] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, pp. 533-536, 1986.
[30] G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, pp. 504-507, 2006.
[31] R. J. Williams and D. Zipser, "A learning algorithm for continually running fully recurrent neural networks," Neural Computation, vol. 1, pp. 270-280, 1989.
[32] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, pp. 1735-1780, 1997.
[33] Y. Bengio, P. Simard, and P. Frasconi, "Learning long-term dependencies with gradient descent is difficult," IEEE Transactions on Neural Networks, vol. 5, pp. 157-166, 1994.
[34] S. Petridis and M. Pantic, "Deep complementary bottleneck features for visual speech recognition," in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016, pp. 2304-2308.
[35] J. Sánchez, F. Perronnin, T. Mensink, and J. Verbeek, "Image classification with the Fisher vector: Theory and practice," International Journal of Computer Vision, vol. 105, pp. 222-245, 2013.
[36] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, pp. 273-297, 1995.
[37] B. Waske and J. A. Benediktsson, "Fusion of support vector machines for classification of multisensor data," IEEE Transactions on Geoscience and Remote Sensing, vol. 45, pp. 3858-3866, 2007.
[38] D. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.