Graduate student (English name): Chang Chun Lien
Thesis title (English): A data mining approach to prediction of breast cancer relapse
Keywords (English): Breast cancer relapse; C4.5 decision tree; Support Vector Machine; Committee Machine
The incidence and mortality rates of breast cancer among Taiwanese women have risen gradually, owing to urban lifestyles and a Western-style diet. In the past five years, breast cancer has had the highest incidence of all cancers among Taiwanese women. Incidence peaks between 45 and 55 years of age. Early-stage breast cancer is almost asymptomatic, which keeps patients from seeking medical help. By the time breast cancer is diagnosed, many patients already have lymph node metastasis, which also raises the recurrence rate.
Thanks to advances in information technology and medical information systems, hospitals have accumulated large amounts of data in their information system databases, and much useful medical knowledge can be mined from these historical data. Predicting breast cancer relapse is very helpful for post-operative treatment and follow-up. Statistical methods have previously been applied to predict breast cancer relapse. This study instead employs data mining techniques, namely the C4.5 decision tree and the support vector machine (SVM), to construct recurrence prediction models for breast cancer. To improve predictive performance, the study also applies committee machine methods, namely AdaBoost and Bagging, to increase relapse prediction accuracy. The empirical results show that the AdaBoost mechanism can enhance the prognostic accuracy of the C4.5 and SVM models for breast cancer relapse.
Keywords: Breast cancer relapse, C4.5 decision tree, Support Vector Machine, Committee Machine
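The committee machine idea named in the abstract, AdaBoost combining several weak classifiers into a stronger one, can be illustrated with a minimal sketch. This is not the thesis's implementation: the base learner here is a one-level decision stump rather than C4.5 or SVM, the six-point data set is invented purely for illustration, and all function names are hypothetical.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump.

    A stump predicts `polarity` when row[feature] <= threshold, else -polarity.
    """
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] <= thr else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best

def stump_predict(stump, row):
    _, j, thr, polarity = stump
    return polarity if row[j] <= thr else -polarity

def adaboost_fit(X, y, rounds=3):
    """Train an AdaBoost ensemble of stumps; labels in y must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)         # avoid division by zero
        if err >= 0.5:                     # no better than chance: stop
            break
        alpha = 0.5 * math.log((1 - err) / err)   # vote weight of this stump
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    """Weighted majority vote of all stumps in the ensemble."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Invented toy data: one feature, labels that no single threshold can separate.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]

for rounds in (1, 3):
    model = adaboost_fit(X, y, rounds=rounds)
    correct = sum(adaboost_predict(model, r) == t for r, t in zip(X, y))
    print(f"rounds={rounds}: {correct}/{len(X)} training samples correct")
```

The toy labels are chosen so that a single threshold cannot separate them, which is the situation in which reweighting misclassified samples and voting across rounds helps. Training accuracy on six invented points says nothing about the relapse prediction accuracy reported in the thesis.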
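Abstract
List of Tables
List of Figures
Chapter 1 Introduction
Section 1 Research Background
Section 2 Research Motivation and Objectives
Section 3 Thesis Organization
Chapter 2 Literature Review
Section 1 Prognostic Factors for Breast Cancer Relapse
Section 2 Classification Analysis Techniques
Section 3 Committee Machines for Enhancing Classification Performance
Chapter 3 Data Collection and Evaluation Methods
Section 1 Data Collection and Data Description
Section 2 Model Construction Process
Chapter 4 Empirical Evaluation
Section 1 Building Prediction Models from Histopathological Section Information
Section 2 Building Prediction Models from FNA Cell Image Data
Section 3 Using an Automatic Attribute Selection Mechanism to Assist Prediction Model Construction
Chapter 5 Conclusions and Recommendations
Section 1 Conclusions
Section 2 Research Limitations and Recommendations
References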
