Thesis title (English): Using Data Mining to Forecast Medical Resource Consumption of Diabetic Nephropathy Patients
Keywords (English): data mining; medical resource consumption; diabetic nephropathy; multivariate adaptive regression splines; support vector regression
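This study builds prediction models with five techniques: multiple regression, stepwise regression, multivariate adaptive regression splines (MARS), support vector regression (SVR), and a two-stage model (T-SVR). The T-SVR model first applies stepwise regression and MARS to select the important variables that are related to diseases associated with diabetic nephropathy or that drive medical resource consumption. The union of these variables then serves as the set of predictor variables for the SVR model.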

Diabetes has become an important public health issue in the twenty-first century, and dialysis treatment has become a large burden on Taiwan's National Health Insurance (NHI) system. Diabetic nephropathy (DN) is the leading factor that determines whether diabetic patients need dialysis treatment. The rate of end-stage renal disease (ESRD) in Taiwan is the highest in the world. Statistics produced by the Ministry of Health and Welfare indicate that chronic kidney failure (uremia) was the second highest cause of visits to primary outpatient clinics in 2015, behind only cancer. According to the National Health Insurance Administration of the Ministry of Health and Welfare, 6% of the health insurance budget is spent on dialysis treatment for ESRD patients. Because few studies have examined the medical resources consumed by diabetic nephropathy patients in Taiwan, this study proposes a forecasting model for DN patients.
In this study, we used five techniques, namely multiple regression, stepwise regression, multivariate adaptive regression splines (MARS), support vector regression (SVR), and a two-stage model (T-SVR), to build models for predicting the medical resource consumption of diabetic nephropathy patients. The input variables of the T-SVR model are the union of the variables identified as important by stepwise regression and by MARS.
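The variable-union step of the two-stage model can be sketched as follows. This is a minimal illustration in plain Python, not the thesis's implementation: the function names `union_of_selected` and `restrict_columns`, the two stage-one selections, and the patient records are all hypothetical, and the second-stage SVR fit itself is left out.

```python
from typing import Dict, List, Sequence


def union_of_selected(*selections: Sequence[str]) -> List[str]:
    """Union of variable names, keeping first-seen order and dropping repeats."""
    seen = set()
    merged: List[str] = []
    for selection in selections:
        for name in selection:
            if name not in seen:
                seen.add(name)
                merged.append(name)
    return merged


def restrict_columns(rows: List[Dict[str, int]], keep: Sequence[str]) -> List[List[int]]:
    """Build the stage-two input matrix from only the retained variables."""
    return [[row[name] for name in keep] for row in rows]


# Hypothetical stage-one outputs (NOT the selections reported in the thesis).
stepwise_selected = ["hypertension", "dyslipidemia", "cardiovascular"]
mars_selected = ["cardiovascular", "cerebrovascular", "hypertension"]

# Stage two receives every variable that either method marked as important.
tsvr_inputs = union_of_selected(stepwise_selected, mars_selected)

# Hypothetical patient records with 0/1 comorbidity indicators.
patients = [
    {"hypertension": 1, "dyslipidemia": 0, "cardiovascular": 1,
     "cerebrovascular": 0, "unselected_variable": 1},
    {"hypertension": 0, "dyslipidemia": 1, "cardiovascular": 0,
     "cerebrovascular": 1, "unselected_variable": 0},
]

# Columns follow the order of tsvr_inputs; unselected variables are dropped.
X_stage_two = restrict_columns(patients, tsvr_inputs)
```

Taking the union rather than the intersection keeps any variable that at least one selection method considers important, so the second-stage model sees a smaller input set than the full variable list without discarding a candidate that only one method detected.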
A comparison of these models shows that SVR achieved the best performance, followed by the T-SVR model. In addition, we found five important variables: hypertension, dyslipidemia, cardiovascular disease, cerebrovascular disease, and kidney disease other than DN.
The results identify the factors that have a significant impact on medical resource consumption, as well as the data mining technique with the best forecasting performance. This study can provide suggestions to medical institutions for allocating medical resources and controlling their consumption, so that resources can be allocated more suitably and effectively.

1.1 Research Background
1.2 Motivation
1.3 Purposes of the study
1.4 Study process
2.1 Medical Resource Consumption
2.1.1 Medical Consumption of Diabetic
2.1.2 Treatment of DN
2.1.3 Medical Consumption of Dialysis
2.2 Data Mining
2.3 MARS
2.4 SVR
3.1 Conceptual Framework
3.2 MARS
3.3 SVR
3.4.1 Structure of SVR and minimize regularized risk function
3.4.2 SVR Parameter Setting
4.1 Descriptive Statistics
4.2 Multiple Regression Results
4.3 Stepwise Regression Results
4.4 MARS Results
4.5 SVR Results
4.6 T-SVR Results
4.7 Models Comparison Results
5.1 Conclusion of the Results
5.2 Recommendations for Future Research

List of Tables
Table 1-1-1 2006-2016 Dialysis Incident Rate in Taiwan
Table 2-1-1 Stages of Diabetic Nephropathy
Table 3-1-1 List Of Prediction Variables in Building Prediction Models
Table 3-1-2 Literature of forecasting variables
Table 4-1-1 Data describe of objects
Table 4-1-2 The Frequency of each Prediction Variables
Table 4-2-1 The model result of the Multiple Regression model
Table 4-3-1 Result of the Stepwise Regression model(1)
Table 4-3-2 Final Result of the Stepwise Regression model
Table 4-4-1 Important Prediction Variables of Using MARS
Table 4-5-1 Results of SVR Model Parameter Adjustment Combination
Table 4-6-1 Results of T-SVR Model Parameter Adjustment Combination
Table 4-7-1 Results of Comparing Models

List of Figures
Figure 1-4-1 Study Process
Figure 3-2-1 Conceptual framework of forecasting model
Figure 3-3-1 Piecewise Linear Basis Function
Figure 3-3-2 MARS model schematic diagram
Figure 3-4-1 Schematic Diagram of SVR


Electronic full text (Internet public release date: 2022-09-24)