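Author: 吳承學
Author (English): Cheng-Hshueh Wu
Thesis title: 對分散與重疊程度都不同且大小相異之高斯分佈資料叢集的有效分群指標法
Thesis title (English): An Effective Validity Index Method for Gaussian-distributed Clusters of different sizes with various degrees of Dispersion and Overlapping
Advisor: 黃博惠
Committee members: 林芬蘭, 李朱慧
Oral defense date: 2016-07-28
Degree: Master's
Institution: National Chung Hsing University
Department: Department of Computer Science and Engineering
Discipline: Engineering
Field: Electrical and computer engineering
Thesis type: Academic thesis
Publication year: 2016
Graduation academic year: 104
Language: Chinese
Number of pages: 59
Keywords (translated from Chinese): validity index; siibFCM; dispersion; degree of overlap; uneven cluster density; cluster size differences and cluster overlap
Keywords (English): Validity Index; siibFCM; Dispersion measure; Overlapping measure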
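A cluster validity index is used to assess clustering quality and to find the correct number of clusters. In this thesis, we remedy a shortcoming of an existing validity index method: because it relies on FCM clustering, it gives poor results on datasets that contain clusters of unbalanced sizes. The new validity index, VDOsiib, combines siibFCM, a clustering method that tolerates clusters of unbalanced sizes, with measures of cluster dispersion and overlap. Dispersion reflects how compactly a cluster is distributed: the smaller the dispersion, the more compact the cluster. Overlap reflects the separation between clusters: the smaller the overlap, the greater the distance between clusters. Combining these two measures yields a good validity index. We verify the accuracy and stability of the proposed index on several artificial and real datasets. The results show that the proposed index effectively predicts the correct number of clusters for datasets whose clusters have unbalanced dispersion, differ greatly in size, and overlap.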

A cluster validity index serves two purposes: assessing the quality of a clustering and finding the correct number of clusters. In this thesis, we propose a cluster validity index method that addresses a weakness of the existing VDO index, which has little tolerance for datasets comprising clusters of unbalanced population when estimating the correct number of clusters. Our new method uses siibFCM, a clustering method that tolerates datasets with unbalanced cluster populations, together with dispersion and overlap measures to compute the cluster validity index. The dispersion measure estimates the overall data density of the clusters in the dataset: smaller dispersion means that data points are distributed more compactly in all clusters. The overlap measure represents the overall separation between any pair of clusters in the dataset: a low degree of overlap means that the clusters are well separated from each other. By combining these two measures, we obtain a good cluster validity index. We conducted several experiments on artificial datasets and public real datasets to validate the effectiveness of our method. Experimental results show that our validity index method can effectively and reliably estimate the correct (optimal) number of clusters for datasets whose clusters differ widely in size, dispersion, and overlap.
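
The general workflow described in the abstract (cluster the data for each candidate number of clusters, score each result with a validity index, and keep the best-scoring count) can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's method: the record does not give the VDOsiib formulas or the siibFCM update rules, so the sketch substitutes standard fuzzy c-means and the Xie-Beni index (reference [13]), which divides a compactness term by the smallest squared distance between cluster centers. The function names, the maximin seeding, and the synthetic data are all hypothetical choices made for this example.

```python
import random


def sqdist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def maximin_init(points, c):
    """Deterministic seeding (an assumption of this sketch): start at
    points[0], then repeatedly add the point farthest from all chosen centers."""
    centers = [points[0]]
    while len(centers) < c:
        centers.append(max(points, key=lambda p: min(sqdist(p, v) for v in centers)))
    return centers


def fcm(points, c, m=2.0, iters=100, tol=1e-6):
    """Standard fuzzy c-means (not siibFCM). Returns (centers, u), where
    u[k][i] is the membership of point k in cluster i."""
    centers = maximin_init(points, c)
    dim = len(points[0])
    u = []
    for _ in range(iters):
        # Membership update: u_ki is proportional to (d_ki^2)^(-1/(m-1)).
        u = []
        for p in points:
            w = [max(sqdist(p, v), 1e-12) ** (-1.0 / (m - 1.0)) for v in centers]
            s = sum(w)
            u.append([x / s for x in w])
        # Center update: mean of the points weighted by u_ki^m.
        new_centers = []
        for i in range(c):
            weights = [row[i] ** m for row in u]
            total = sum(weights)
            new_centers.append(tuple(
                sum(wk * p[d] for wk, p in zip(weights, points)) / total
                for d in range(dim)))
        shift = max(sqdist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


def xie_beni(points, centers, u, m=2.0):
    """Xie-Beni index: fuzzy compactness divided by (n * minimum squared
    distance between centers). Lower values indicate a better partition."""
    compact = sum(u[k][i] ** m * sqdist(p, v)
                  for k, p in enumerate(points)
                  for i, v in enumerate(centers))
    sep = min(sqdist(a, b)
              for idx, a in enumerate(centers)
              for b in centers[idx + 1:])
    return float("inf") if sep == 0 else compact / (len(points) * sep)


def estimate_num_clusters(points, candidates=range(2, 7)):
    """Score every candidate cluster count and return the one with the lowest index."""
    scores = {c: xie_beni(points, *fcm(points, c)) for c in candidates}
    return min(scores, key=scores.get), scores


# Synthetic data: three well-separated Gaussian blobs of different sizes.
rng = random.Random(0)
blobs = [((0.0, 0.0), 60), ((10.0, 0.0), 40), ((5.0, 9.0), 20)]
points = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
          for (cx, cy), n in blobs for _ in range(n)]

best_c, scores = estimate_num_clusters(points)
print("estimated number of clusters:", best_c)
```

The blobs here are far apart relative to their spread, so plain FCM handles the mild size imbalance. The abstract's point is that FCM-based indices degrade when cluster sizes differ greatly, which is the case siibFCM is meant to handle and which this sketch does not reproduce.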

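Chapter 1 Introduction
1.1  Motivation, Background, and Research Objectives
1.2  Thesis Organization
Chapter 2 Related Work
2.1  siibFCM (Size-Insensitive Integrity-Based Fuzzy C-Means)
2.2  Existing Cluster Validity Indices
2.2.1  Bezdek's validity index
2.2.2  Xie & Beni's validity index
2.2.3  Huang & Ho's validity index
2.2.4  Kim's validity index
2.2.5  Rezaee's validity index
2.2.6  Krista Rizman and Borut's validity index
2.2.7  Bhargavi & Gowda's validity index
2.2.8  The Dispersion & Overlap validity index
2.2.9  Summary
Chapter 3 A New Effective Cluster Validity Index Method
3.1  The Dispersion Measure
3.2  The Overlap Measure
3.3  The Cluster Validity Index
Chapter 4 Experimental Results and Analysis
4.1  Artificial datasets with clusters of equal size
4.2  Artificial datasets with clusters of different sizes
4.2.1  Circular Gaussian distributions
4.2.2  Different distribution shapes
4.3  Real datasets
4.3.1  Abalone dataset
4.3.2  Breast Cancer dataset
4.3.3  Bupa dataset
4.3.4  Ecoli dataset
4.3.5  Glass dataset
4.3.6  Ionosphere dataset
4.3.7  Iris dataset
4.3.8  Seeds dataset
4.3.9  Vehicle dataset
4.3.10  Wine dataset
4.4  Summary
4.4.1  Accuracy of the different validity indices
4.4.2  Accuracy of each clustering method
Chapter 5 Conclusion
References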

[1]A. K. Jain, M. N. Murty, and P. J. Flynn, "Data clustering: a review," ACM Comput. Surv., vol. 31, pp. 264-323, 1999.
[2]J. C. Dunn, "A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated Clusters," Journal of Cybernetics, vol. 3, pp. 32-57, 1973.
[3]J. C. Dunn, "Well-Separated Clusters and Optimal Fuzzy Partitions," Journal of Cybernetics, vol. 4, pp. 95-104, 1974.
[4]J. C. Bezdek, "Numerical taxonomy with fuzzy sets," Journal of Mathematical Biology, vol. 1, pp. 57-71, 1974.
[5]J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms: Kluwer Academic Publishers, 1981.
[6]S. Theodoridis and K. Koutroumbas, "Chapter 14 - Clustering Algorithms III: Schemes Based on Function Optimization," in Pattern Recognition (Fourth Edition), Boston: Academic Press, 2009.
[7]P.-L. Lin, P.-W. Huang, C. H. Kuo, and Y. H. Lai, "A size-insensitive integrity-based fuzzy c-means method for data clustering," Pattern Recognition, vol. 47, pp. 2042-2056, 2014.
[8]F. Höppner, F. Klawonn, R. Kruse, and T. Runkler, Fuzzy Cluster Analysis: Methods for Classification, Data Analysis and Image Recognition: Wiley, 1999.
[9]Y. Zhang, W. Wang, X. Zhang, and Y. Li, "A cluster validity index for fuzzy clustering," Information Sciences, vol. 178, pp. 1205-1218, 2008.

[10]K. Rizman Žalik, "Cluster validity index for estimation of fuzzy clusters of different sizes and densities," Pattern Recognition, vol. 43, pp. 3374-3390, 2010.
[11]J. C. Bezdek, "Cluster Validity with Fuzzy Sets," Journal of Cybernetics, vol. 3, pp. 58-73, 1973.
[12]J. C. Bezdek, "Mathematical models for systematics and taxonomy," Eighth International Conference on Numerical Taxonomy, pp. 143–165, 1975.
[13]X. L. Xie and G. Beni, "A validity measure for fuzzy clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, pp. 841-847, 1991.
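[14]何秉軒, "An effective entropy-based cluster validity index method for resolving cluster overlap" (in Chinese), Master's thesis, Department of Computer Science and Engineering, National Chung Hsing University, 2014.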
[15]Y.-I. Kim, D.-W. Kim, D. Lee, and K. H. Lee, "A cluster validation index for GK cluster analysis based on relative degree of sharing," Information Sciences, vol. 168, pp. 225-242, 2004.
[16]B. Rezaee, "A cluster validity index for fuzzy clustering," Fuzzy Sets Syst., vol. 161, pp. 3014-3025, 2010.
[17]K. R. Žalik and B. Žalik, "Validity index for clusters of different sizes and densities," Pattern Recognition Letters, vol. 32, pp. 221-234, 2011.
[18]M. S. Bhargavi and S. D. Gowda, "A novel validity index with dynamic cut-off for determining true clusters," Pattern Recognition, vol. 48, pp. 3673-3687, 2015.
[19]P. L. Lin, P. W. Huang, and C. Y. Li, "A validity index method for clusters with different degrees of dispersion and overlap," in 2016 Eighth International Conference on Advanced Computational Intelligence (ICACI), 2016, pp. 222-229.
[20]J. C. Noordam, W. H. A. M. van den Broek, and L. M. C. Buydens, "Multivariate image segmentation with cluster size insensitive Fuzzy C-means," Chemometrics and Intelligent Laboratory Systems, vol. 64, pp. 65-78, 2002.
[21]UCI Machine Learning Repository. Available: https://archive.ics.uci.edu/ml/
[22]KEEL. Available: http://sci2s.ugr.es/keel/dataset.php
[23]N. Nguyen and R. Caruana, "Consensus Clusterings," in Seventh IEEE International Conference on Data Mining (ICDM 2007), 2007, pp. 607-612.
[24]C. Zhong, X. Yue, Z. Zhang, and J. Lei, "A clustering ensemble: Two-level-refined co-association matrix with path-based transformation," Pattern Recognition, vol. 48, pp. 2699-2709, 2015.


