Graduate student (English name): Zhe-Huai Yang
Thesis title (English): Analysis Central Processing Unit usage to find anomalies
Keywords (English): data mining; fuzzy theory; fuzzy linguistic summary; decision tree
In recent years, system configuration errors and a growing number of network attacks have caused abnormalities in network equipment and degraded its ability to provide normal service. This study therefore aims to detect equipment abnormalities in the shortest possible time. The most common way for an enterprise to detect abnormalities is to analyze changes in CPU usage. Through big-data analysis, this study identifies the usage habits of different time periods and their corresponding thresholds. It then dynamically fine-tunes the threshold for each time period by predicting that period's usage, which improves the accuracy of anomaly monitoring. In addition, the predicted usage can serve as the basis for an early-warning function, so that users know immediately when abnormal usage occurs, can inspect the system early, and can limit the impact of the anomaly.
Monitoring software has become a basic element of business management, and many companies in Taiwan use the less expensive, open-source network monitoring software Nagios or Cacti. Both can set only a single threshold value when monitoring a single hardware resource (network traffic or CPU usage).
Although a company's overall workload and daily behavior are similar from day to day, usage habits clearly differ across the hours of the day, so a single threshold value has drawbacks. When the threshold is set too high, no alarm is generated, and users cannot detect abnormalities as early as possible. When the threshold is set too low, the system may keep sending alarm notifications, so users are told of abnormalities all the time. We have developed an algorithm that uses historical data to establish multiple thresholds for each time period; each time period has three thresholds. The prediction model built from our training data has been validated. Its predictions accurately detect abnormal conditions and predict the trend of the threshold values for each time period.
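The idea of per-period thresholds can be illustrated with a minimal Python sketch. It is not the algorithm developed in the thesis: the abstract does not say how the three thresholds are derived, so the sketch assumes 3-hour periods starting at 08:00 and thresholds at the period mean plus 1, 2, and 3 standard deviations. The names `period_of`, `build_thresholds`, and `classify` and the sample data are hypothetical, and no outlier exclusion is performed.

```python
from collections import defaultdict
from statistics import mean, pstdev


def period_of(hour, period_hours=3, start_hour=8):
    """Map an hour of the day to a period index (assumed: 3-hour periods from 08:00)."""
    return (hour - start_hour) // period_hours


def build_thresholds(samples, ks=(1.0, 2.0, 3.0)):
    """Build three thresholds per period from (hour, cpu_percent) history.

    Assumption: threshold i is mean + ks[i] * standard deviation, capped at 100%.
    """
    by_period = defaultdict(list)
    for hour, cpu in samples:
        by_period[period_of(hour)].append(cpu)
    thresholds = {}
    for period, values in by_period.items():
        mu, sigma = mean(values), pstdev(values)
        thresholds[period] = tuple(min(100.0, mu + k * sigma) for k in ks)
    return thresholds


def classify(hour, cpu, thresholds):
    """Return how many of the period's three thresholds the reading exceeds (0 to 3)."""
    return sum(cpu > t for t in thresholds[period_of(hour)])


# Made-up history: a quiet morning period and a busier afternoon period.
history = [(8, 10), (9, 20), (10, 30), (14, 50), (15, 60), (16, 70)]
th = build_thresholds(history)

morning_level = classify(9, 40, th)     # 2: exceeds two morning thresholds
afternoon_level = classify(15, 40, th)  # 0: normal for the afternoon period
```

The same 40% reading is flagged in the morning period but treated as normal in the afternoon, which is the behavior a single global threshold cannot provide. A period with no history has no entry in `th`, so `classify` would raise a `KeyError` for it.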
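Chapter 1 Introduction
1-1 Research Motivation and Objectives
1-2 Research Framework
Chapter 2 Literature Review
Chapter 3 Problem Statement and Research Method
3-1 Problem Statement
3-2 Research Method
3-2-1 Outlier Exclusion
3-2-2 Group Construction
3-2-3 Multiple Thresholds
Chapter 4 Experimental Results
4-1 Model Construction
4-2 Model Validation
Chapter 5 Conclusion and Future Work
References
Appendix 1 Experimental Results on the November Test Data
Appendix 2 English Paper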
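Figure 1-1 CPU usage prediction chart
Figure 3-1 Actual usage at Company X, 8:00-11:00, over two working days
Figure 3-2 Actual usage at Company X, 11:00-14:00, over two working days
Figure 3-3 Actual usage at Company X, 14:00-17:00, over two working days
Figure 3-4 Actual usage at Company X, 17:00-19:00, over two working days
Figure 3-5 Flowchart for building the behavior model
Figure 3-6 Usage chart containing anomalous usage
Figure 3-7 Data before group range adjustment
Figure 3-8 Data after group range adjustment
Figure 3-9 30-minute group data before range adjustment
Figure 3-10 30-minute group data after range adjustment
Figure 3-11 Prediction model, November 6, 8:00-11:00
Figure 3-12 Prediction model, November 6, 11:00-14:00
Figure 3-13 Prediction model, November 6, 14:00-18:00
Figure 3-14 Prediction model, November 18, 8:00-11:00
Figure 3-15 Prediction model, November 18, 11:00-14:00
Figure 3-16 Prediction model, November 18, 14:00-18:00
Figure 4-1 Group distribution with outliers
Figure 4-2 Group distribution after outlier exclusion
Figure 4-3 Training model, November 15, 8:00-11:00
Figure 4-4 Training model, November 15, 11:00-14:00
Figure 4-5 Training model, November 15, 14:00-18:00
Figure 4-6 Training model with outliers, November 12, 14:00-16:00
Figure 4-7 Test model, November 23, 8:00-11:00
Figure 4-8 Test model, November 23, 11:00-14:00
Figure 4-9 Test model, November 23, 14:00-18:00
Figure 4-10 Test model with outliers, November 26, 11:00-13:00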
