Thesis title (English): A Study of Applying Clustering and Outlier Detection Methods for Auditing Completed Contract Items
Advisor (English): Tzu-Tsung Wong
Keywords (English): completed contract items audit; density-based clustering; distance-based outlier detection; outlier analysis
To promote domestic economic development, improve people's quality of life, and stimulate private industry, large enterprises allocate substantial construction budgets every year. Audit mechanisms are then used to control the quality and cost of construction. Although many enterprises have developed an "Engineering Management Information System" to monitor these projects, auditing tasks are still generally performed manually. This study uses outlier-detection tools from data mining to assist in auditing construction records so that the quality and cost of construction can be stabilized. First, we review the literature in construction-related fields and examine current construction management and auditing mechanisms. A density-based clustering algorithm is then employed to generate clusters and outliers. The validity of the clusters is assessed by their silhouette coefficients, and domain experts further group the clusters to reflect their practical meaning. The average distance calculated from the outliers serves as a threshold that divides them into abnormal and rare cases. The experimental results show that the detection rate for abnormal cases can reach 97% while the false alarm rate is only about 1%. These results indicate that the proposed framework can assist in auditing construction records and help control construction costs effectively.
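The pipeline described in the abstract (density-based clustering, silhouette-based validation, and an average-distance threshold on the outliers) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. It assumes a textbook DBSCAN as the density-based algorithm, Euclidean distance, and toy 2-D points instead of real audit records. The abstract does not say what the outlier distance is measured to, so `split_outliers` assumes the distance from each outlier to its nearest clustered point, and assumes that outliers beyond the average are the "abnormal" ones. All function names and parameter values here are hypothetical.

```python
import math

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id; -1 marks noise (an outlier)."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbors(j)
            if len(reach) >= min_pts:
                queue.extend(reach)  # j is a core point, so keep expanding
    return labels

def mean_silhouette(points, labels):
    """Average silhouette coefficient over clustered points (noise is ignored)."""
    clusters = {}
    for p, lab in zip(points, labels):
        if lab != -1:
            clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return None  # undefined with fewer than two clusters
    scores = []
    for lab, members in clusters.items():
        for p in members:
            if len(members) == 1:
                scores.append(0.0)  # usual convention for singleton clusters
                continue
            # a: mean distance to the other members of the same cluster.
            a = sum(math.dist(p, q) for q in members) / (len(members) - 1)
            # b: smallest mean distance to the members of any other cluster.
            b = min(sum(math.dist(p, q) for q in other) / len(other)
                    for k, other in clusters.items() if k != lab)
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

def split_outliers(points, labels):
    """Assumed rule: outliers farther than the average distance are 'abnormal'."""
    clustered = [p for p, lab in zip(points, labels) if lab != -1]
    outliers = [i for i, lab in enumerate(labels) if lab == -1]
    if not outliers or not clustered:
        return [], [], None
    # Assumption: distance is measured to the nearest clustered point.
    dist = {i: min(math.dist(points[i], c) for c in clustered) for i in outliers}
    threshold = sum(dist.values()) / len(dist)
    abnormal = [i for i in outliers if dist[i] > threshold]
    rare = [i for i in outliers if dist[i] <= threshold]
    return abnormal, rare, threshold

# Toy data: two tight groups, one nearby outlier, one distant outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (3, 3), (30, 30)]
labels = dbscan(points, eps=1.5, min_pts=3)
abnormal, rare, threshold = split_outliers(points, labels)
print(labels, mean_silhouette(points, labels), abnormal, rare)
```

In this toy run the nearby outlier falls under the average-distance threshold and is treated as rare, while the distant one is flagged as abnormal. Values such as `eps` and `min_pts` would need tuning on real data, and the thesis may define the distance and the threshold differently.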
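Chinese Abstract
English Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation
1.2 Research Objectives
1.3 Research Framework
1.4 Research Scope
Chapter 2 Literature Review
2.1 Construction Quality Management System
2.2 Construction Quality Audit Analysis
2.3 Adoption of Construction Information Systems
2.4 Data Mining
2.4.1 Clustering Procedure
2.4.2 Clustering Methods
2.4.3 Outlier Analysis
Chapter 3 Research Method
3.1 Data Preprocessing
3.2 Clustering Algorithm and Evaluation Measures
3.2.1 Density-Based Clustering
3.2.2 Cluster Evaluation
Chapter 4 Empirical Study
4.1 Data Preprocessing and Description
4.2 Clustering Tests and Evaluation
4.3 Distance Calculation and Detection of Outliers
Chapter 5 Conclusions and Suggestions
5.1 Conclusions
5.2 Future Research Directions and Suggestions
References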