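Graduate student: 廖偉欽
Graduate student (English name): Wei-Chin Liao
Thesis title (Chinese): 以機器學習預測城市用水需求之研究
Thesis title (English): Urban Water Demand Forecasting Using Machine Learning
Advisor: 張智星
Advisor (English name): Jyh-Shing Roger Jang
Oral defense committee: 王新民, 廖元甫
Oral defense date: 2019-06-05
Degree: Master's
Institution: National Taiwan University (國立臺灣大學)
Department: Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所)
Discipline: Engineering
Sub-discipline: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2019
Graduation academic year: 107 (ROC calendar)
Language: Chinese
Number of pages: 49
Keywords: machine learning; urban water demand; lasso regression; ridge regression; random forest; XGBoost; neural network; LSTM; feature selection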
DOI:10.6342/NTU201901433
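Abstract (translated from Chinese): Water resources are a key element in a nation's pursuit of sustainable development. Understanding how future water demand will change is an important issue, and forecasting water demand is an effective way to address it. This study forecasts monthly water sales volume, a short-term forecast aimed at decisions in system operation, water supply management, and supply optimization. It uses the machine learning algorithms neural network, LSTM (long short-term memory), lasso regression, ridge regression, random forest, and XGBoost as forecasting methods. Taking the monthly water sales volume of Keelung City as an example, the results show that every implemented algorithm achieves a MAPE (mean absolute percentage error) of 3.04% or lower, indicating reasonably good forecasts of water sales. For each method, the study compares model performance without and with feature selection: XGBoost performs better on the data without feature selection, whereas random forest performs better on the data after feature selection. Overall, for forecasting time-dependent data, machine learning algorithms can generally make full use of the data while limiting overfitting as far as possible, and thereby achieve higher forecast accuracy.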
Abstract (English): Water supply is a key element in a country's pursuit of sustainable development. Analyzing future changes in water demand is essential to optimizing water supply, and algorithmic prediction of water demand is an effective way to achieve this goal. This study forecasts water demand on a short-term (monthly) basis. Such forecasts can support water supply management by informing system operation decisions and enabling more efficient use of resources. The study applies six methods to water demand forecasting: neural network, LSTM (long short-term memory), lasso regression, ridge regression, random forest, and XGBoost. Taking the monthly water demand of Keelung as an example, the results show that all of the selected machine learning algorithms reach a MAPE (mean absolute percentage error) of 3.04% or lower, indicating that they forecast water demand accurately. For each algorithm, the study compares model performance with and without feature selection. XGBoost performs better without feature selection, while random forest performs better with feature selection. Overall, for time-based data prediction, the implemented machine learning algorithms generally make full use of the data while suppressing overfitting, which yields better accuracy.
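The abstract reports forecast quality as MAPE (mean absolute percentage error), which averages |actual - forecast| / |actual| over all periods and expresses the result as a percentage. The following is a minimal sketch of that standard definition, not the thesis's own code; the function name `mape` and the sample values are hypothetical and are not data from the study.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Return the mean absolute percentage error, in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    if any(a == 0 for a in actual):
        # Each error is divided by the actual value, so zeros are not allowed.
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Hypothetical monthly sales volumes, not data from the thesis.
    actual_sales = [100.0, 200.0, 400.0]
    forecast_sales = [110.0, 190.0, 400.0]
    print(f"MAPE = {mape(actual_sales, forecast_sales):.2f}%")
```

Because each error is scaled by the actual value, MAPE is comparable across series of different magnitude, but it cannot be computed when any actual value is zero.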
Acknowledgements
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Topic overview
1.2 Method overview
1.3 Chapter outline
Chapter 2 Research Methods
2.1 Urban water use
2.2 Time series
2.3 Machine learning
Chapter 3 Experimental Setup and Results
3.1 Time series
3.2 Machine learning
3.2.1 Lasso regression
3.2.2 Ridge regression
3.2.3 Random forest
3.2.4 XGBoost
3.2.5 Summary
3.2.6 Neural network
3.2.7 LSTM
3.2.8 Temperature factor
Chapter 4 Conclusions and Future Work
References