National Digital Library of Theses and Dissertations in Taiwan

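Graduate student: Wei-Hao He (何維皓)
Thesis title (translated from Chinese): Applying Machine Learning and the Cox Proportional Hazards Model to Liver Cancer Survival Analysis
Thesis title (official English): Survival analysis of liver cancer patients by using the Cox proportional hazards model and an XGBoost machining method
Advisor: Yen, Chen-wen (嚴成文)
Degree: Master's
Institution: National Sun Yat-sen University (國立中山大學)
Department: Department of Mechanical and Electro-Mechanical Engineering
Discipline: Engineering
Field: Mechanical Engineering
Thesis type: Academic thesis
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar, 2020-2021)
Language: Chinese
Number of pages: 110
Keywords (translated from Chinese): liver cancer; survival analysis; Cox; machine learning; XGBoost
Keywords (English): Hepatocellular Carcinoma; Survival analysis; Cox; Machine learning; XGBoost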
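Abstract (translated from Chinese):
Liver cancer (hepatocellular carcinoma, HCC) is cancer that originates in the liver or spreads to the liver from other organs. It is one of the most common cancers and has become one of the ten leading causes of cancer-related death worldwide.
Developing a treatment plan for liver cancer requires weighing many factors, including the cancer stage; the size, number, and location of the tumors; and the presence of cirrhosis, vascular invasion, or extrahepatic metastasis.
To address this challenge, this study aims to develop two decision support systems for liver cancer survival analysis.
The first system is based on the classical Cox proportional hazards model; the second is a machine learning model trained with the extreme gradient boosting (XGBoost) algorithm. Compared with the Cox model, a clear advantage of the machine learning model is that it can predict the survival time of individual cases. It can therefore support personalized treatment prediction by comparing the survival times predicted for patient groups receiving different therapies.
The dataset used to develop the two systems contains 8,502 cases, each with 33 demographic and clinical features. The concordance indices of the Cox model and the machine learning model are 0.803 and 0.814, respectively. The analyses and evaluations in this thesis show that the machine learning model has some difficulty with absolute prediction but performs well on relative prediction, reaching a concordance index of 0.814. This helps compare the efficacy of different treatments and thereby helps patients choose a more suitable treatment plan.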
Abstract (original English):
Hepatocellular carcinoma (HCC) is a cancer that starts in the liver itself or spreads to the liver from other organs. HCC is also the most common primary liver cancer and is one of the ten leading causes of cancer-related mortality worldwide.

Developing a treatment plan for HCC requires considering many factors, including the cancer stage; the growth rate, size, location, and number of tumors; and the presence of liver cirrhosis, vascular invasion, or extrahepatic metastasis. To tackle this challenge, this study aims to develop two decision support systems for HCC survival analysis.

The first system developed in this work is based on the classical Cox proportional hazards (PH) model. The second system is a machine learning model trained with the extreme gradient boosting (XGBoost) algorithm. Compared with the Cox PH model, a distinct advantage of the machine learning model is that it can directly predict the survival time of individual cases. As a result, it can support personalized treatment predictions by comparing the survival times associated with different treatment methods.
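
The contrast drawn above can be made concrete with the standard textbook forms of the two model families. The abstract gives no formulas, so the notation below ($h_0$, $\beta$, $f$, $\sigma$, $Z$) is an assumption for illustration and not necessarily the notation used in the thesis.

```latex
% Cox proportional hazards model: covariates x scale an unspecified
% baseline hazard h_0(t), so the model ranks patients by relative risk.
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right)

% Accelerated failure time (AFT) model: the log survival time T is modeled
% directly; f(x) is the learned predictor and Z is a noise term scaled by sigma.
\ln T = f(x) + \sigma Z
```

Because $h_0(t)$ is left unspecified, the Cox model's linear predictor $\beta^{\top} x$ only orders patients by risk, whereas an AFT-style model outputs an estimate on the time scale itself. In a gradient-boosted AFT model, $f(x)$ would be the output of the tree ensemble.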

The dataset used to develop these two systems consists of 8,502 cases, each characterized by 31 demographic and clinical features. The concordance indices of the Cox PH model and the machine learning model are 0.803 and 0.814, respectively. The analyses and evaluations in this research show that the machine learning model has some difficulty with absolute prediction but performs well on relative prediction, reaching a concordance index of 0.814. This helps compare the efficacy of different treatment methods and thereby helps patients choose a more suitable treatment plan.
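
The concordance index reported above measures relative prediction: among pairs of patients whose survival order is known, it is the fraction the model orders correctly. The record does not say how the thesis computed it, so the following is a minimal sketch of Harrell's definition with simplified tie handling; the function name and the toy data are hypothetical.

```python
from itertools import combinations


def concordance_index(times, events, predicted_times):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times: observed follow-up times
    events: 1 if death was observed, 0 if the case was censored
    predicted_times: model-predicted survival times (larger = longer survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: skip pairs tied on observed time
        shorter, longer = (i, j) if times[i] < times[j] else (j, i)
        if not events[shorter]:
            continue  # order unknown when the shorter time is censored
        comparable += 1
        if predicted_times[shorter] < predicted_times[longer]:
            concordant += 1.0
        elif predicted_times[shorter] == predicted_times[longer]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the second one censored.
times = [5, 8, 12, 20]
events = [1, 0, 1, 1]
predicted = [6, 9, 30, 25]
print(concordance_index(times, events, predicted))  # 0.75
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why a model can score well here while its absolute survival time estimates remain inaccurate.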
Thesis Approval Form
Acknowledgments
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Preface
1.2 Research Motivation and Objectives
1.3 Literature Review
1.4 Thesis Organization
Chapter 2 Background
2.1 Survival Analysis
2.2 Cox Proportional Hazards Model
2.3 Machine Learning
2.4 Ensemble Learning
2.5 XGBoost (Extreme Gradient Boosting)
2.6 Accelerated Failure Time Model for Survival Analysis
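Chapter 3 Dataset and Machine Learning Optimization Methods
3.1 Dataset
3.1.1 Dataset Description
3.1.2 Dataset Construction and Basic Training Procedure
3.2 Machine Learning Software and Hardware
3.2.1 Machine Learning Hardware
3.2.2 Machine Learning Software Architecture
3.3 Machine Learning Architecture and Performance Improvement Methods
3.3.1 Missing Value Handling
3.3.2 One-Hot Encoding
3.3.3 K-fold Cross-Validation
3.3.4 Hyperparameter Settings
3.3.5 Feature Selection
3.3.6 Survival Analysis Methods
3.3.7 Machine Learning Optimization Workflow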
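Chapter 4 Experimental Results and Discussion: Cox Proportional Hazards Model
4.1 Dataset Analysis
4.1.1 Statistical Analysis of Sample Features
4.1.2 Survival Time Information and Distribution
4.1.3 Data Preprocessing
4.2 Performance Evaluation and Analysis Metrics
4.2.1 Concordance Metric: Concordance Index
4.2.2 Statistical Estimation: Interval Estimation
4.2.3 Hazard Ratio
4.2.4 Survival Curves
4.3 Results and Discussion
4.3.1 Concordance Analysis
4.3.2 Feature Significance
4.3.3 Feature Analysis: HR Values
4.3.4 Feature Analysis: Survival Curves
4.3.5 Log-rank Test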
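Chapter 5 Experimental Results and Discussion: XGBoost Survival Regression Model
5.1 Performance Metrics
5.1.1 Basic Classification Performance Metrics
5.2 Survival Regression Results and Discussion for the Full Liver Cancer Sample
5.2.1 Feature Importance
5.2.2 Concordance Analysis of Predictions
5.2.3 Prediction Error Analysis for K-group Samples
5.2.4 Prediction Success Rate Analysis for U-group Samples
5.2.5 Kaplan-Meier Survival Curves
5.2.6 Survival Classification Results at Different Time Thresholds
5.3 Survival Regression Results and Discussion by Treatment Method
5.3.1 Concordance Analysis of Predictions by Treatment Method
5.3.2 Prediction Error Analysis for K-group Samples by Treatment Method
5.3.3 Prediction Success Rate Analysis for U-group Samples by Treatment Method
5.3.4 Kaplan-Meier Survival Curves by Treatment Method
5.3.5 Survival Classification Results at Different Time Thresholds by Treatment Method
5.3.6 Analysis of Model Outputs When the Treatment Method Is Changed
5.4 Survival Regression Results and Discussion by Liver Cancer Stage
5.4.1 Concordance Analysis of Predictions by Liver Cancer Stage
5.4.2 Prediction Error Analysis for K-group Samples by Liver Cancer Stage
5.4.3 Prediction Success Rate Analysis for U-group Samples by Liver Cancer Stage
5.4.4 Kaplan-Meier Survival Curves by Liver Cancer Stage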
Chapter 6 Discussion and Future Work
References
Electronic full text (Internet release date: 2024-07-14)