
National Digital Library of Theses and Dissertations in Taiwan

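Author: 黃婷萱
Author (romanized): HUANG, TING-XUAN
Thesis title: 結合資料探勘技術進行糖尿病與乳癌關聯性分析模式
Thesis title (English): A Hybrid Data Mining Model for Analyzing the Association between Diabetes and Breast Cancer
Advisor: 李天行
Advisor (romanized): LEE, TIAN-SHYUG
Oral defense committee: 呂奇傑, 陳麒文
Oral defense committee (romanized): LU, CHI-JIE
Oral defense date: 2016-06-22
Degree: Master's
Institution: Fu Jen Catholic University
Department: Master's Program in Management, Department of Business Administration
Discipline: Business and Management
Field: Business Administration
Thesis type: Academic thesis
Publication year: 2016
Graduation academic year: 104 (ROC calendar)
Language: Chinese
Pages: 36
Keywords (Chinese): 資料探勘 (data mining); 糖尿病 (diabetes); 疾病危險因子 (disease risk factors); 乳癌 (breast cancer)
Keywords (English): data mining; diabetes mellitus; disease risk factor; breast cancer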
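Abstract (translated from the Chinese): Diabetes is a chronic disease that current medicine cannot cure. Deaths caused by its complications rise year after year, its medical costs have become a heavy burden on Taiwan's National Health Insurance, and it also takes a psychological toll on patients. The association between diabetes and cancer has been widely discussed in the academic literature in recent years, and breast cancer has the highest incidence of any cancer among women in Taiwan. This study therefore aims to identify potential risk factors for breast cancer among female diabetic patients and to build a disease risk factor analysis model using data mining techniques. The method combines data mining with a retrospective cohort study: the subjects are patients recorded with diabetes in the National Health Insurance database between 2005 and 2012, and the analysis looks for potential risk factors for developing breast cancer within the following two years. The model first applies under-sampling based on clustering (SBC) to address class imbalance in the large database, then uses classification and regression trees (CART) to identify potentially influential risk factors. The results show that female diabetic patients with "polyneuropathy in diabetes" or "diabetes with peripheral circulatory disorders" have significantly higher breast cancer incidence and odds ratios. The study hopes these results can serve as a medical reference and help reduce the burden on National Health Insurance. The proposed model reduces the effect of class imbalance, effectively uncovers potential disease risk factors, and is expected to be applicable to other diseases.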
Abstract (English): Diabetes is a chronic disease that current medical technology cannot cure. The number of deaths caused by its complications increases year by year, and the resulting medical expenses have become a heavy burden on National Health Insurance. The association between diabetes and cancer has been widely discussed in recent years, and among all cancers, breast cancer has the highest incidence among Taiwanese women. The purpose of this study is therefore to apply data mining techniques in a retrospective cohort study of the association between diabetes and breast cancer. The proposed disease risk factor analysis model combines under-sampling based on clustering (SBC) with classification and regression trees (CART) to construct a disease prediction model. The National Health Insurance database is analyzed to explore the risk factors that lead diabetic patients without breast cancer to develop breast cancer within the next two years. Experimental results show that female patients with "diabetic neuropathy" or "diabetes mellitus with peripheral circulatory disorder" have a significantly higher breast cancer incidence and odds ratio. The model reduces the effect of class imbalance in large datasets and finds potential disease risk factors. It can also be applied to other diseases and may help alleviate the burden on National Health Insurance.
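The two-stage idea in the abstract (balance the classes by cluster-based under-sampling, then let a tree pick the most discriminating factor) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: this record does not give the thesis's exact quota rule, so the sketch draws majority rows from each cluster in proportion to that cluster's majority-to-minority ratio; the function names, the toy data, and the single Gini-based root split standing in for a full CART tree are all hypothetical.

```python
import random
from collections import defaultdict


def kmeans_labels(points, k, iters=20, seed=0):
    """Plain k-means on tuples of numbers; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centre (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each non-empty centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def sbc_undersample(X, y, k=2, ratio=1.0, seed=0):
    """Return sorted row indices kept by cluster-based under-sampling.

    y uses 1 for the minority class and 0 for the majority class.
    """
    rng = random.Random(seed)
    labels = kmeans_labels(X, k, seed=seed)
    majority, minority = defaultdict(list), defaultdict(list)
    for i, (lab, cls) in enumerate(zip(labels, y)):
        (minority if cls == 1 else majority)[lab].append(i)
    n_minority = sum(y)
    # Assumed quota rule: weight each cluster by its majority-to-minority ratio,
    # treating an empty minority count as 1.
    weight = {c: len(majority[c]) / max(len(minority[c]), 1) for c in range(k)}
    total = sum(weight.values())
    keep = [i for i, cls in enumerate(y) if cls == 1]  # keep every minority row
    for c in range(k):
        quota = round(ratio * n_minority * weight[c] / total) if total else 0
        keep.extend(rng.sample(majority[c], min(quota, len(majority[c]))))
    return sorted(keep)


def gini(classes):
    """Gini impurity of a list of 0/1 class labels."""
    if not classes:
        return 0.0
    p = sum(classes) / len(classes)
    return 2 * p * (1 - p)


def best_binary_split(X, y):
    """Index of the 0/1 feature whose split gives the lowest weighted Gini impurity."""
    def weighted_gini(j):
        left = [cls for row, cls in zip(X, y) if row[j] == 0]
        right = [cls for row, cls in zip(X, y) if row[j] == 1]
        return (len(left) * gini(left) + len(right) * gini(right)) / len(y)
    return min(range(len(X[0])), key=weighted_gini)


# Synthetic toy data (not from the thesis): feature 0 is a hypothetical
# complication flag, feature 1 is noise; class 1 is the rare outcome.
X = ([(0, i % 2) for i in range(180)] + [(1, i % 2) for i in range(20)]    # majority rows
     + [(1, i % 2) for i in range(16)] + [(0, i % 2) for i in range(4)])   # minority rows
y = [0] * 200 + [1] * 20

keep = sbc_undersample(X, y, k=2, ratio=1.0)
X_bal = [X[i] for i in keep]
y_bal = [y[i] for i in keep]
print(len(keep), sum(y_bal))            # size of the balanced set, minority rows kept
print(best_binary_split(X_bal, y_bal))  # feature chosen for the root split
```

The `ratio` parameter corresponds to the majority-to-minority class ratio of the sampled set, so values of 1.0, 2.0, and 3.0 would mirror the 1:1, 2:1, and 3:1 ratios named in the thesis's list of figures. In practice a library tree such as scikit-learn's `DecisionTreeClassifier`, which implements an optimized CART variant, would replace the hand-written root split.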
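Table of Contents
Chapter 1 Introduction
Section 1 Research Background and Motivation
Section 2 Research Objectives
Section 3 Research Process
Chapter 2 Literature Review
Section 1 Disease Prediction
Section 2 Data Mining
Section 3 Types of Diabetes
Section 4 Diabetes and Breast Cancer
Chapter 3 Research Methods
Section 1 Predictor Variables
Section 2 Research Framework
Section 3 Under-Sampling Based on Clustering
Section 4 Classification and Regression Trees
Section 5 Disease Risk Factor Analysis Model
Chapter 4 Empirical Study
Section 1 Description of the Empirical Data
Section 2 Results of the Disease Risk Factor Analysis
Chapter 5 Conclusions and Suggestions
Section 1 Conclusions
Section 2 Suggestions for Future Research
References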


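List of Tables
Table 1-1-1 Ten Leading Causes of Death in Taiwan
Table 3-5-1 National Health Insurance Claim Order Codes for Breast Surgery
Table 3-5-2 Definition of the Target Variable
Table 3-5-3 ICD-9-CM Codes for Cardiovascular (Cerebrovascular) Complications
Table 3-5-4 ICD-9-CM Codes for Neurological Complications
Table 3-5-5 ICD-9-CM Codes for Peripheral Vascular Complications
Table 3-5-6 Risk Measurement Matrix
Table 4-2-1 Results of Ward's Method
Table 4-2-2 K-means Clustering Results
Table 4-2-3 Summary of SBC Sample-Set Sizes at Different Class Ratios
Table 4-2-4 Overall Average Accuracy of Sample Sets at Different Class Ratios
Table 4-2-5 Risk Assessment of Risk Factors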


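List of Figures
Figure 1-3-1 Research Flowchart
Figure 2-4-1 Risk Ratios for Cancer Death
Figure 3-2-1 Framework of the Disease Risk Factor Analysis Model
Figure 4-1-1 Pie Chart of Target Group Proportions
Figure 4-2-1 Class Ratio 1:1
Figure 4-2-2 Class Ratio 2:1
Figure 4-2-3 Class Ratio 3:1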


