National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

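Graduate student: Yu-Sheng Yu (余育昇)
Thesis title: Predicting Clinical Outcomes of Critically Ill Patients in Intensive Care Units with Combination of Machine Learning Models
Thesis title (Chinese): 藉由機器學習模型的組合預測加護病房重症患者的臨床結果
Advisor: Fei-Pei Lai (賴飛羆)
Committee members: Yu-Chang Yeh (葉育彰), Lu-Cheng Kuo (郭律成), Shanq-Jang Ruan (阮聖彰), Dai-Lun Chiang (江岱倫)
Oral defense date: 2021-01-18
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Biomedical Electronics and Bioinformatics
Discipline: Engineering
Field: Biomedical Engineering
Thesis type: Academic thesis
Publication year: 2021
Graduation academic year: 109 (ROC calendar, 2020-2021)
Language: English
Pages: 42
Chinese keywords: 重症加護病房, 預測, 機器學習, 死亡率, 住院天數
Keywords: intensive care unit, prediction, machine learning, mortality, length of stay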
DOI:10.6342/NTU202100268
Background: Although various predictive scoring systems have been developed from statistical analysis of data collected from large numbers of patients, predicting clinical outcomes for patients in intensive care units (ICUs) remains an important and difficult challenge. In recent years, with the development of machine learning technology, various algorithms have provided more powerful model inference capabilities and have been used to analyze such data.

Objective: This study aimed to use a three-step strategy that combines three machine learning models to improve the sensitivity and precision of mortality prediction for critically ill patients, and to divide ICU length of stay (LOS) into four classes so that the outcome can be predicted with a classification model rather than a typical regression model.
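
Turning length of stay into a four-class label can be sketched as follows. This is only an illustration: the abstract does not state the class boundaries, so the cut points `(3, 7, 14)` and the function name `los_to_class` are hypothetical placeholders, not the values used in the thesis.

```python
from bisect import bisect_right

# Hypothetical cut points in days; the abstract does not give the real ones.
ILLUSTRATIVE_BOUNDARIES = (3, 7, 14)


def los_to_class(days: float, boundaries=ILLUSTRATIVE_BOUNDARIES) -> int:
    """Map an ICU length of stay in days to a class index 0..len(boundaries).

    With three boundaries this yields four classes:
    0: days < 3, 1: 3 <= days < 7, 2: 7 <= days < 14, 3: days >= 14.
    """
    return bisect_right(boundaries, days)


if __name__ == "__main__":
    stays = [1.5, 3, 6.9, 7, 20]
    print([los_to_class(d) for d in stays])
```

Treating the target as an ordinal class label lets any multi-class classifier return a probability per class, which a single regression estimate of days does not provide.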

Method: A total of 4,228 adult intensive care patients were extracted from the NTUH CORE database. For the mortality model, after data preprocessing, seven machine learning algorithms were trained on both the whole data and balanced data. Three models were then selected: those with the highest sensitivity, moderate precision, and the highest precision, respectively. For the LOS classification model, we analyzed the distribution of LOS in the whole data and divided the days into four classes for labeling. Multi-class prediction then yielded the predicted class and the corresponding predicted probabilities.
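
One way to read the three-step strategy is as a cascade in which each model only sees the patients that the previous model flagged as likely deaths. The sketch below assumes that reading. The abstract does not spell out the routing rules, so the ordering of the checks and the function name `three_step_triage` are assumptions; the inputs are plain 0/1 predictions (1 = predicted death) from the three selected models.

```python
from typing import List, Sequence


def three_step_triage(
    pred_high_sensitivity: Sequence[int],
    pred_moderate_precision: Sequence[int],
    pred_high_precision: Sequence[int],
) -> List[str]:
    """Assign each patient to a risk group from three binary predictions.

    Assumed routing (not confirmed by the abstract):
      step 1: highest-sensitivity model predicts survival -> "low"
      step 2: moderate-precision model predicts survival  -> "moderate"
      step 3: highest-precision model predicts survival   -> "high-moderate"
              otherwise                                    -> "high"
    """
    groups = []
    for s, m, p in zip(
        pred_high_sensitivity, pred_moderate_precision, pred_high_precision
    ):
        if s == 0:
            groups.append("low")
        elif m == 0:
            groups.append("moderate")
        elif p == 0:
            groups.append("high-moderate")
        else:
            groups.append("high")
    return groups


if __name__ == "__main__":
    # Four toy patients, one for each possible path through the cascade.
    print(three_step_triage([0, 1, 1, 1], [0, 0, 1, 1], [0, 0, 0, 1]))
```

Putting the most sensitive model first keeps missed deaths in the low risk group rare, while the later, more precise models concentrate true deaths in the higher risk groups.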

Result: In the mortality model, the highest-sensitivity model classified 588 of the 843 patients in the testing dataset into the low risk group, with a mortality rate of 2.6% (95% CI, 1.4 to 4.2%); the other 255 patients went on to the next prediction step. After processing with the moderate-precision and highest-precision models, these 255 patients were further classified into the moderate risk group (210 patients), the high-moderate risk group (26 patients), and the adjusted high risk group (19 patients), with mortality rates of 29% (95% CI, 23 to 35.7%), 73.1% (95% CI, 52.2 to 88.4%), and 94.7% (95% CI, 74 to 99.9%), respectively. In the LOS classification model, the weighted average F1-score was 0.604, and performance was better for patients with LOS shorter than 7 days than for those with LOS longer than 7 days.
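
The reported confidence intervals can be checked with an exact binomial (Clopper-Pearson) interval. The sketch below is a pure-Python version that solves for the bounds by bisection. Two things are assumptions here: the abstract does not say which interval method was used, and the death count of 18 in the 19-patient group is inferred from the reported 94.7% rather than stated.

```python
from math import comb
from typing import Callable, Tuple


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def _bisect(too_low: Callable[[float], bool]) -> float:
    """Find the point in (0, 1) where too_low switches from True to False."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if too_low(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(x: int, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Exact two-sided (1 - alpha) confidence interval for the proportion x / n."""
    # Lower bound solves P(X >= x | p) = alpha / 2; it is 0 when x == 0.
    lower = 0.0 if x == 0 else _bisect(
        lambda p: 1.0 - binom_cdf(x - 1, n, p) < alpha / 2
    )
    # Upper bound solves P(X <= x | p) = alpha / 2; it is 1 when x == n.
    upper = 1.0 if x == n else _bisect(
        lambda p: binom_cdf(x, n, p) > alpha / 2
    )
    return lower, upper


if __name__ == "__main__":
    # 18 deaths out of 19 patients (count inferred from the reported 94.7%).
    low, high = clopper_pearson(18, 19)
    print(f"{low:.3f} to {high:.3f}")
```

For 18 of 19, the bounds come out at roughly 0.74 and 0.999, in line with the reported 74 to 99.9% for the adjusted high risk group. The width of that interval reflects how few patients reach the final step.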

Conclusion: This study showed that a three-step strategy combining the highest-sensitivity, moderate-precision, and highest-precision models improved the predictability of 30-day mortality in critically ill patients.
Oral Examination Committee Certification
Acknowledgements
Chinese Abstract
ABSTRACT
CONTENTS
LIST OF FIGURES
LIST OF TABLES
Chapter 1. Introduction
1.1 Background
1.2 Related works
1.3 Objective
Chapter 2. Architecture
2.1 Workflow
2.2 Three-step strategy prediction
2.3 Four-class prediction on LOS classification model
Chapter 3. Methods
3.1 Data source
3.2 Patient Selection
3.3 Data distribution of physiological data
3.4 Data distribution of ICU length of stay
3.5 Feature Selection and Feature Engineering
3.6 Imbalanced Data
3.7 Missing Data
3.8 K-Fold Validation
3.9 Classification Model
3.9.1 Logistic Regression
3.9.2 K-nearest neighbors
3.9.3 Decision Tree
3.9.4 Random Forest
3.9.5 Linear Discriminant Analysis
3.9.6 AdaBoost
3.9.7 XGBoost
3.10 Model Assessment
Chapter 4. Results
4.1 Characteristics of input data
4.2 Mortality model
4.2.1 Influence of sample ratio
4.2.2 Model selection
4.2.3 Three-step strategy prediction
4.2.4 Feature Importance
4.3 LOS classification model
Chapter 5. Discussion
5.1 Mortality model
5.2 LOS classification model
Chapter 6. Limitation
Chapter 7. Conclusions and future work
REFERENCE