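Graduate student: 游子璇
Graduate student (romanized): Tzu-Hsuen Yiu
Thesis title: 應用支援向量機、K個最鄰近法與邏輯斯迴歸於醫療診斷
Thesis title (English): Application of Support Vector Machine, K-Nearest Neighbor and Logistic Regression on Medical Diagnosis
Advisor: 黃美玲
Advisor (romanized): Mei-Ling Huang
Degree: Master's
Institution: National Chin-Yi University of Technology (國立勤益科技大學)
Department: Department of Industrial Engineering and Management
Discipline: Engineering
Field: Industrial Engineering
Thesis type: Academic thesis
Year of publication: 2011
Academic year of graduation: 99 (ROC calendar)
Language: Chinese
Number of pages: 53
Keywords (Chinese): 預防醫學、醫療診斷、特徵選取、支援向量機、K最鄰近法、邏輯斯迴歸
Keywords (English): Preventive Medicine; Medical Diagnosis; Feature Selection; Support Vector Machine; K-Nearest Neighbor; Logistic Regression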
Related counts:
  • Cited by: 4
  • Views: 400
  • Downloads: 58
  • Bookmarked: 0
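An effective predictive model can distill acceptable results from the large volumes of data held in medical databases and give physicians a reference for diagnosis. From the standpoint of preventive medicine, it can give patients appropriate advice and health education, or even support preventive treatment, thereby lowering the incidence of disease. This study examines feature selection combined with different types of machine learning models for analyzing medical diagnosis data, extracting the needed information, and evaluating the performance of the resulting models.
The data come from the UCI Machine Learning Repository; the Parkinson (Parkinson's disease) and Dermatology (skin disease) databases were used. CART was used for feature selection to filter the original data, and the selected features served as inputs to three classifiers: support vector machine (SVM), K-nearest neighbor (KNN), and logistic regression. Classification performance was evaluated by accuracy. CART feature selection reduced the Parkinson database from 22 variables to 15 and the Dermatology database from 34 variables to 24. The accuracies of the three classifiers were 92.82%, 95.38%, and 89.84% on the Parkinson database and 98.60%, 97.49%, and 97.49% on the Dermatology database, respectively.
The study suggests that offering such classification techniques to physicians as a diagnostic aid, or to patients as a self-check before seeking care, could avoid unnecessary use of medical resources and improve the quality of medical services.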
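As a rough illustration of the feature-selection step, the sketch below scores each feature by the largest Gini impurity decrease achievable with a single threshold split, which is the split criterion CART uses for classification. It is not the thesis's procedure: the function names, the toy data, and the 0.2 cutoff are all hypothetical, and the record does not say how the CART output was turned into a feature subset.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split_gain(values, labels):
    """Largest Gini impurity decrease over all threshold splits on one feature."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    # Candidate thresholds: samples with value <= t go left, the rest go right.
    for t in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def rank_features(rows, labels):
    """Return (feature_index, gain) pairs sorted by decreasing gain."""
    n_features = len(rows[0])
    gains = [
        (j, best_split_gain([row[j] for row in rows], labels))
        for j in range(n_features)
    ]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
rows = [(0.1, 5.0), (0.2, 1.0), (0.3, 4.0), (0.8, 5.0), (0.9, 1.0), (1.0, 4.0)]
labels = [0, 0, 0, 1, 1, 1]

ranking = rank_features(rows, labels)
# Arbitrary illustrative cutoff; a real study would choose this more carefully.
selected = [j for j, gain in ranking if gain > 0.2]
print(ranking, selected)
```

Scoring one split per feature is a simplification; a full CART tree splits recursively and would typically derive variable importance from all splits in the tree.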
A predictive model applied to a large medical database can provide a reference for medical diagnosis. From the perspective of preventive medicine, it can give patients appropriate advice and health education, and even support preventive treatment, with the aim of reducing the incidence of disease.
The data in this study were drawn from the UCI Machine Learning Repository, specifically the Parkinson and Dermatology databases. The study uses feature selection to extract important features and reports a comparative study of three methods on the two databases. We built three classifiers: Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Logistic Regression. Their accuracies were 92.82%, 95.38%, and 89.84% on the Parkinson database and 98.60%, 97.49%, and 97.49% on the Dermatology database, respectively. We compared the results with related research, and the comparison indicates that the proposed model is effective. The results can help reduce the time and cost of medical examinations.
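To make the accuracy figures concrete, here is a minimal sketch of one of the three classifiers, K-Nearest Neighbor, together with the accuracy measure. It is an illustration under stated assumptions rather than the thesis's setup: the toy samples, the class names, `k=3`, Euclidean distance, and leave-one-out evaluation are all hypothetical choices, since the record does not state the validation protocol or parameters used.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to query (Euclidean)."""
    nearest = sorted(
        range(len(train_x)), key=lambda i: math.dist(train_x[i], query)
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def leave_one_out_accuracy(xs, ys, k=3):
    """Fraction of samples classified correctly when each is held out in turn."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if knn_predict(train_x, train_y, xs[i], k) == ys[i]:
            correct += 1
    return correct / len(xs)


# Hypothetical toy data: two well-separated groups in a 2-feature space.
xs = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.3), (0.9, 0.8), (0.8, 0.9), (1.0, 1.0)]
ys = ["healthy", "healthy", "healthy", "patient", "patient", "patient"]

accuracy = leave_one_out_accuracy(xs, ys, k=3)
print(f"accuracy = {accuracy:.2%}")
```

Accuracy here is simply correct predictions divided by total predictions, the same measure the abstract reports as a percentage for each classifier.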
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Tables
List of Figures
1. Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Research Scope and Subjects
1.4 Research Process
1.5 Research Structure
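2. Literature Review
2.1 Parkinson
2.1.1 Parkinson Symptoms
2.1.2 Parkinson Disease Progression
2.1.3 Parkinson Prevalence
2.1.4 Literature Summary for the Parkinson Database
2.2 Dermatology
2.2.1 Overview of Common Skin Diseases
2.2.3 Literature Summary for the Dermatology Database
2.3 Data Mining
2.4 Feature Selection
2.4.1 Wrapper Model
2.4.2 Filter Model
2.4.3 Decision Trees
2.4.4 Applications of Decision Trees
2.5 Support Vector Machine
2.5.1 Linear Support Vector Machine
2.5.2 Nonlinear Support Vector Machine
2.5.3 Applications of SVM
2.6 K-Nearest Neighbor
2.6.1 Applications of K-Nearest Neighbor
2.7 Logistic Regression
2.7.1 Applications of Logistic Regression
2.8 Performance Evaluation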
3. Research Methods
3.1 Data Description
3.2 Building the SVM Classifier
3.3 Building the KNN Classifier
3.4 Building the Logistic Regression Classifier
3.5 Equipment
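4. Results and Discussion
4.1 Parkinson
4.1.1 Feature Selection Results
4.1.2 Classifier Results
4.1.3 Comparison of Results
4.2 Dermatology
4.2.1 Feature Selection Results
4.2.2 Classifier Results
4.2.3 Comparison of Results
5. Conclusions and Recommendations
References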