National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

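Graduate student: LIAO, BO-YU (廖柏昱)
Thesis title: Prediction of cardiovascular disease using ensemble algorithms based on machine learning (基於機器學習使用集成算法預測心血管疾病)
Advisor: CHIU, CHUI-YU (邱垂昱)
Oral defense committee: CHENG, CHEN-YANG (鄭辰仰); SHIH, PO-CHOU (施柏州); CHIU, CHUI-YU (邱垂昱)
Oral defense date: 2023-06-20
Degree: Master's
Institution: National Taipei University of Technology (國立臺北科技大學)
Department: Department of Industrial Engineering and Management (工業工程與管理系)
Discipline: Engineering
Field: Industrial Engineering
Thesis type: Academic thesis
Publication year: 2023
Graduation academic year: 111 (ROC calendar)
Language: Chinese
Number of pages: 50
Keywords (translated from the Chinese): cardiovascular disease; machine learning; prediction; classification algorithms; hyperparameter tuning; ensemble methods
Keywords (English): Cardiovascular disease; Machine learning; Prediction accuracy; Classification algorithms; Feature extraction; Ensemble methods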
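Abstract (translated from the Chinese):
Heart disease, which includes coronary artery disease, myocardial infarction, and heart failure, is one of the leading causes of death worldwide. In recent years, as populations age and unhealthy lifestyles spread, the incidence of heart disease has kept rising, and mortality in Asia remains persistently high. Prevention and early detection have therefore become crucial. Because the pathogenesis of heart disease is complex, however, traditional statistical methods and judgment based on experience often fail to produce accurate predictions, and machine learning offers a new way to predict heart disease. Traditional prediction methods are limited in that they cannot fully account for the many factors that influence heart disease, whereas machine learning can draw on large volumes of physiological and clinical data and achieve higher prediction accuracy. We therefore aim to use machine learning to build a reliable heart disease prediction model that improves prediction accuracy, helps doctors detect and treat heart disease in time, and improves patients' survival rates and treatment outcomes.
One of the most challenging tasks for clinicians is detecting the symptoms of cardiovascular disease as early as possible. Many people around the world die of cardiovascular disease every year. The many variables that affect health, such as high blood pressure, elevated cholesterol, and an irregular pulse rate, make heart disease difficult to diagnose. Artificial intelligence, however, can support early identification and treatment. This study proposes an ensemble-based approach that uses machine learning (ML) and ensemble learning (EL) models to predict the likelihood that a person has cardiovascular disease, employing six classification algorithms. The models are trained on a publicly available dataset of cardiovascular disease cases, random forest (RF) is used to extract the important cardiovascular disease features, and hyperparameter tuning keeps the comparison between models fair. Finally, an ensemble learning model makes the predictions, and performance metrics are used to validate the model. The experimental results show that the ML ensemble model achieves a best disease prediction accuracy of 99.47%, demonstrating that the proposed model is effective for heart disease prediction.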

Abstract (English, as supplied with the thesis):
Heart disease, which includes coronary artery disease, myocardial infarction, and heart failure, is one of the leading causes of death worldwide. In recent years, as the population has aged and unhealthy lifestyles have spread, the incidence of heart disease has continued to rise, and the death rate in Asia has remained high. Prevention and early detection of heart disease have become crucial. However, because the pathogenesis of heart disease is complex, traditional statistical methods and empirical judgment often fail to produce accurate predictions. Machine learning therefore offers a new approach to heart disease prediction. Traditional prediction methods have limitations and cannot fully consider the impact of the many factors involved in heart disease, whereas machine learning can use large amounts of physiological and clinical data to reach higher prediction accuracy. We therefore hope to use machine learning to establish a reliable heart disease prediction model that improves prediction accuracy, helps doctors detect and treat heart disease in time, and improves patients' survival rates and treatment outcomes.
One of the most challenging tasks for clinicians is the early detection of symptoms of cardiovascular disease. Every year, many people all over the world die from cardiovascular diseases. The multiple variables that affect health, such as high blood pressure, elevated cholesterol, and irregular pulse rate, make diagnosing heart disease challenging. Yet artificial intelligence can be used to identify and treat diseases early. This study presents an ensemble-based approach that uses machine learning (ML) and deep learning (DL) models to predict the likelihood of a person suffering from cardiovascular disease, employing seven classification algorithms. The models are trained on a publicly available dataset of cardiovascular disease cases, and Random Forest (RF) is used to extract the important cardiovascular disease features. Experimental results show that the ML ensemble model achieves the best disease prediction accuracy, 97.07%, demonstrating that the proposed model is effective for heart disease prediction.
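
The abstract does not say how the ensemble combines its base classifiers, so the sketch below assumes hard majority voting, which is one common ensemble scheme and may not be the one used in the thesis. The three threshold rules are hypothetical stand-ins for trained models; the feature order (systolic blood pressure, cholesterol, pulse rate), the cut-off values, and the four toy samples are invented for illustration and are not taken from the thesis or its dataset.

```python
from collections import Counter
from typing import Callable, Sequence

# Hypothetical stand-in for a trained classifier: maps a feature vector
# to a label (1 = disease, 0 = no disease).
Classifier = Callable[[Sequence[float]], int]


def majority_vote(classifiers: Sequence[Classifier], x: Sequence[float]) -> int:
    """Return the label predicted by most classifiers (hard voting).

    Ties are broken in favour of the larger label, i.e. the positive class.
    """
    votes = Counter(clf(x) for clf in classifiers)
    top = max(votes.values())
    return max(label for label, count in votes.items() if count == top)


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy rules over (systolic blood pressure, cholesterol, pulse rate).
# The thresholds are made up; real base models would be fitted to data.
toy_classifiers = [
    lambda x: int(x[0] > 140),
    lambda x: int(x[1] > 240),
    lambda x: int(x[2] > 100),
]

# Invented samples and labels, chosen so that one prediction is wrong.
samples = [(150, 250, 80), (120, 200, 70), (160, 180, 110), (130, 260, 90)]
labels = [1, 0, 1, 1]

predictions = [majority_vote(toy_classifiers, x) for x in samples]
print(predictions)                    # ensemble label for each sample
print(accuracy(labels, predictions))  # share of correct ensemble labels
```

In practice each lambda would be replaced by a fitted model's prediction function, and accuracy would be reported on held-out data alongside other performance metrics.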

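Chinese abstract
English abstract
Acknowledgements
Table of contents
List of tables
List of figures
Chapter 1 Introduction
1.1 Research background and motivation
1.2 Research objectives
1.3 Research scope and limitations
1.4 Research process
Chapter 2 Literature review
2.1 Cardiovascular disease
2.1.1 Common types and definitions of cardiovascular disease
2.1.2 Studies of cardiovascular disease risk factors
2.1.3 Diagnosis of heart disease by medical personnel
2.2 Artificial neural network classification
2.3 Handling imbalanced data
2.4 Classification models
2.4.1 Logistic Regression
2.4.2 Random Forest
2.4.3 Extreme Gradient Boosting
2.4.4 K-Nearest Neighbors (KNN)
2.4.5 Decision Tree
2.4.6 Support Vector Machine (SVM)
2.5 Applications of ensemble learning methods
2.6 Related literature
2.7 Summary
Chapter 3 Research methods
3.1 Research framework
3.2 Data sources and feature attributes
3.3 Data preprocessing
3.4 Feature selection
3.5 Hyperparameter tuning
3.6 Ensemble classifier
3.7 Model validation and evaluation
Chapter 4 Results and analysis
4.1 Environment setup
4.2 Data sample processing and model training
4.3 Experimental results
4.4 Comparative analysis
Chapter 5 Conclusions and recommendations
5.1 Conclusions
5.2 Future directions and recommendations
References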



Electronic full text (public Internet release date: 2026-08-01)