跳到主要內容

臺灣博碩士論文加值系統

(44.200.77.92) 您好!臺灣時間:2024/03/01 09:21
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果 :::

詳目顯示

我願授權國圖
: 
twitterline
研究生:戴杰安
研究生(外文):Chiech-an Tai
論文名稱:以差分進化演算法解決自動資料分群問題
論文名稱(外文):An Automatic Data Clustering Algorithm based on Differential Evolution
指導教授:江明朝
指導教授(外文):Ming-Chao Chiang
學位類別:碩士
校院名稱:國立中山大學
系所名稱:資訊工程學系研究所
學門:工程學門
學類:電資工程學類
論文種類:學術論文
論文出版年:2013
畢業學年度:102
語文別:英文
論文頁數:63
中文關鍵詞:資料分群自動分群差分進化演算法直方圖分析高維度資料集
外文關鍵詞:automatic clusteringdata clusteringhigh-dimensional datasethistogram analysisdifferential evolution
相關次數:
  • 被引用被引用:0
  • 點閱點閱:204
  • 評分評分:
  • 下載下載:0
  • 收藏至我的研究室書目清單書目收藏:0
作為一個傳統的優化問題,分群至今仍在理論研究和實務問題上扮演著至關重要的角色。雖然目前已有許多成功的分群演算法已經被提出,但大多數的方法(不是所有)在分群過程開始前,需要事先給予資料的分群數當作參數。本文中提出一個以差分進化演算法為基礎的新型演算法,用來自動決定資料的分群數,以解決不知道資料分群數的情況。此演算法,稱為增強版本之自動資料分群差分進化演算法,其中包含兩種技術:一個是新穎基於直方圖的分析技術用在找尋大致接近的分群數,和一個用來微調自動分群結果的啟發式搜尋演算法。實驗結果顯示,本篇所提之方法和其他現有自動分群演算法比較起來,不僅可以自動確定資料集所適合的分群數,也能快速的提供正確群數,甚至在高維度資料集上也可以得到很好的結果。
As one of the traditional optimization problems, clustering still plays a vital role for the re-searches both theoretically and practically nowadays. Although many successful clustering algorithms have been presented, most (if not all) need to be given the number of clusters before the clustering procedure is invoked. A novel differential evolution based clustering algorithm is presented in this paper to solve the problem of automatically determining the number of clusters. The proposed algorithm, called enhanced differential evolution for automatic cluster-ing (EDEAC), leverages the strengths of two technologies: a novel histogram-based analysis technique for finding the approximate number of clusters and a heuristic search algorithm for
fine-tuning the automatic clustering results. The experimental results show that the proposed algorithm can not only determine the approximate number of clusters automatically, but it can also provide an accurate number of clusters rapidly even for high dimensional datasets com-pared to other existing automatic clustering algorithms.
論文審定書 i
誌謝 iii
摘要 iv
Abstract v
List of Figures ix
List of Tables x
Chapter 1 Introudction 1
1.1 Motivation 2
1.2 Contributions of the Thesis 2
1.3 Organization of the Thesis 3
Chapter 2 Related Works 4
2.1 Data Clustering Problem 4
2.1.1 Partitional Algorithm 5
2.1.2 Hierarchical Algorithm 6
2.2 Evolutionary Computing 9
2.2.1 Evolutionary Algorithm 9
2.2.2 Genetic Algorithm for Clustering 10
2.2.3 Differential Evolution for Clustering 12
2.2.3.1 Mutation 12
2.2.3.2 Recombination 14
2.2.3.3 Selection 14
2.3 Automatic Clustering Algorithm 15
2.3.1 GA-based Automatic Clustering Algorithm 15
2.3.2 DE-based Automatic Clustering Algorithm 16
2.3.3 Other Automatic Clustering Algorithm 17
2.4 Cluster Validity Index 18
2.4.1 Notations 19
2.4.2 Dunn’s Index 19
2.4.3 Davies-Bouldin Index 20
2.4.4 CS Index 20
2.4.5 Discussion 21
2.5 Summary 21
Chapter 3 The Proposed Algorithm 23
3.1 The Concept 23
3.2 The Proposed Algorithm 24
3.2.1 Data Normalization 25
3.2.2 Modified RSM 25
3.2.3 ACDE 28
3.2.3.1 Elimination of Erroneous Chromosomes 29
3.2.3.2 Computation of Fitness 29
3.3 Example 30
Chapter 4 Simulation Results 34
4.1 Experiments Settings 34
4.1.1 Simulation Environment 34
4.1.2 Parameter Settings 34
4.1.3 Datasets 35
4.2 Simulation Results 35
4.2.1 Evaluation by Number of Clusters 36
4.2.2 Evaluation by Convergence Speed 38
4.3 Overlapping Analysis 41
4.3.1 Discussion 42
Chapter 5 Conclusion and Future Works 44
5.1 Conclusion 44
5.2 Future Works 44
Bibliography 46
[1] A. K. Jain, M. N. Murty, and P. J. Flynn, “Data clustering: A review,” ACM Computing Surveys, vol. 31, no. 3, pp. 264–323, 1999.
[2] G. B. Coleman and H. C. Andrews, “Image segmentation by clustering,” Proceedings of the IEEE, vol. 67, no. 5, pp. 773–785, 1979.
[3] K. Fukunaga, Introduction to statistical pattern recognition. Academic Press, 1990.
[4] A. Abraham, S. Das, and A. Konar, “Document clustering using differential evolution,” in Proceedings of IEEE Congress Evolutionary Computation, pp. 1784–1791, 2006.
[5] J. MacQueen, “Some methods for classification and analysis of multivariate observations,” in Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281–297, 1967.
[6] D. E. Goldberg, “Genetic algorithms in search, optimization, and machine learning,” Machine Learning, vol. 3, no. 2, pp. 2–3, 1989.
[7] J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proceedings of IEEE International Conference on Neural Networks, vol. 4, pp. 1942–1948, 1995.
[8] M. Dorigo, V. Maniezzo, and A. Colorni, “Ant system: Optimization by a colony of cooperating agents,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 26, no. 1, pp. 29–41, 1996.
[9] R. Storn and K. Price, “Differential evolution : A simple and efficient heuristic for global optimization over continuous spaces,” Journal of Global Optimization, vol. 11, pp. 341– 359, 1997.
[10] S. Das and P. N. Suganthan, “Differential evolution: A survey of the state-of-the-art,” IEEE Transactions on Evolutionary Computation, vol. 15, no. 1, pp. 4–31, 2011.
[11] M. Omran, A. Salman, and A. Engelbrecht, “Dynamic clustering using particle swarm optimization with application in unsupervised image classification,” in Proceedings of the International Conference on Computational Intelligence, pp. 199–204, 2005.
[12] S. Bandyopadhyay and U. Maulik, “Genetic clustering for automatic evolution of clusters and application to image classification,” Pattern Recognition, vol. 35, no. 6, pp. 1197– 1208, 2002.
[13] S. Das and A. Konar, “Automatic image pixel clustering with an improved differential evolution,” Applied Soft Computing, vol. 9, no. 1, pp. 226–236, 2009.
[14] K. S. Tan, N. A. M. Isa, and W. H. Lim, “Color image segmentation using adaptive unsupervised clustering approach,” Applied Soft Computing, vol. 13, no. 4, 2012.
[15] S. Das, A. Abraham, and A. Konar, “Automatic clustering using an improved differential evolution algorithm,” IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, vol. 38, no. 1, pp. 218–237, 2008.
[16] P. Berkhin, “A survey of clustering data mining techniques,” in Grouping Multidimensional Data, pp. 25–71, Springer, 2006.
[17] J. Han, M. Kamber, and J. Pei, Data mining: Concepts and techniques. Morgan kaufmann, 2006.
[18] A. K. Jain and R. C. Dubes, Algorithms for clustering data. Prentice-Hall Incorporated, 1988.
[19] M. R. Anderberg, Cluster analysis for applications. Academic Press, 1973.
[20] R. Xu and D. Wunsch, “Survey of clustering algorithms,” IEEE Transactions on Neural Networks, vol. 16, no. 3, pp. 645–678, 2005.
[21] L. Kaufman and P. J. Rousseeuw, Finding groups in data: An introduction to cluster analysis. John Wiley &; Sons, 2009.
[22] S. B. Kotsiantis, “Supervised machine learning: A review of classification techniques,” Informatica (Slovenia), vol. 31, no. 3, pp. 249–268, 2007.
[23] L. V. Fausett, Fundamentals of neural networks: Architectures, algorithms, and applications. Prentice-Hall Incorporated, 1994.
[24] C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
[25] S. Kirkpatrick, D. G. Jr., and M. P. Vecchi, “Optimization by simmulated annealing,” Science, vol. 220, no. 4598, pp. 671–680, 1983.
[26] F. Glover and M. Laguna, Tabu search. Springer, 1997.
[27] J. H. Ward Jr, “Hierarchical grouping to optimize an objective function,” Journal of the American Statistical Association, vol. 58, no. 301, pp. 236–244, 1963.
[28] E. C. Grimm, “Coniss: A Fortran 77 program for stratigraphically constrained cluster analysis by the method of incremental sum of squares,” Computers &; Geosciences, vol. 13, no. 1, pp. 13–35, 1987.
[29] C. Ding and X. He, “Cluster merging and splitting in hierarchical clustering algorithms,” in Proceedings of the IEEE International Conference on Data Mining, pp. 139–146, 2002.
[30] R. A. Stegwee, Division for conquest: Decision support for information architecture specification. PhD thesis, University of Groningen, Groningen,The Netherlands, 1992.
[31] S. C. Johnson, “Hierarchical clustering schemes,” Psychometrika, vol. 32, no. 3, pp. 241– 254, 1967.
[32] F. Murtagh, “A survey of recent advances in hierarchical clustering algorithms,” The Computer Journal, vol. 26, no. 4, pp. 354–359, 1983.
[33] T. Sørensen, “A method of establishing groups of equal amplitude in plant sociology based on similarity of species and its application to analyses of the vegetation on danish commons,” Biologiske Skrifter, vol. 5, no. 4, pp. 1–34, 1948.
[34] R. Sokal and C. Michener, A statistical method for evaluating systematic relationships. University of Kansas, 1958.
[35] T. B ̈ack and H.-P. Schwefel, “An overview of evolutionary algorithms for parameter optimization,” Evolutionary Computation, vol. 1, no. 1, pp. 1–23, 1993.
[36] G. B. Fogel and D. W. Corne, Evolutionary computation in bioinformatics. Morgan Kaufmann, 2002.
[37] D. Dasgupta and Z. Michalewicz, Evolutionary algorithms in engineering applications. Springer, 1997.
[38] R. Kicinger, T. Arciszewski, and K. D. Jong, “Evolutionary computation and structural design: A survey of the state-of-the-art,” Computers &; Structures, vol. 83, no. 23, pp. 1943– 1978, 2005.
[39] A. E. Eiben and J. E. Smith, Introduction to evolutionary computing. Springer, 2010.
[40] T. B ̈ack, Evolutionary algorithms in theory and practice: Evolution strategies, evolutionary programming, genetic algorithms. Oxford University Press, 1996.
[41] T. B ̈ack, F. Hoffmeister, and H.-P. Schwefel, “A survey of evolution strategies,” in Proceedings of the International Conference on Genetic Algorithms, pp. 2–9, 1991.
[42] L. Fogel, A. Owens, and M. Walsh, Artificial intelligence through simulated evolution. John Wiley &; Sons, 1966.
[43] J. H. Holland, Adaptation in natural and artificial systems: An introductory analysis with applications to biology, control, and artificial intelligence. University of Michigan Press, 1975.
[44] S. F. Smith, A learning system based on genetic adaptive algorithms. PhD thesis, University of Pittsburgh, Pittsburgh, PA, USA, 1980.
[45] U. Maulik and S. Bandyopadhyay, “Genetic algorithm-based clustering technique,” Pattern Recognition, vol. 33, no. 9, pp. 1455–1465, 2000.
[46] S. Paterlini and T. Krink, “High performance clustering with differential evolution,” in Proceedings of the Congress on Evolutionary Computation, vol. 2, pp. 2004–2011, 2004.
[47] S. Paterlini and T. Krink, “Differential evolution and particle swarm optimisation in partitional clustering,” Computational Statistics &; Data Analysis, vol. 50, pp. 1220–1247, 2006.
[48] S. H. A, G. Hyd, and A. P. India, “Data clustering using almost parameter free differential evolution technique,” International Journal of Computer Applications, vol. 8, no. 13, pp. 1–7, 2010.
[49] V. V. Raghavan and K. Birchard, “A clustering strategy based on a formalism of the reproductive process in natural systems,” in Proceedings of the International Conference on Information Storage and Retrieval: Information Implications into the Eighties, pp. 10–22, 1979.
[50] S. Das, A. Konar, and U. K. Chakraborty, “Two improved differential evolution schemes for faster global search,” in Proceedings of the Conference on Genetic and Evolutionary Computation, pp. 991–998, 2005.
[51] M. Steinbach, G. Karypis, and V. Kumar, “A comparison of document clustering techniques,” in Proceedings of the International Conference on Knowledge Discovery and Data Mining, pp. 109–110, 2000.
[52] D. Pelleg and A. W. Moore, “X-means: Extending k-means with efficient estimation of the number of clusters,” in Proceedings of the International Conference on Machine Learning, pp. 727–734, 2000.
[53] J. C. Dunn, “A fuzzy relative of the isodata process and its use in detecting compact well-separated clusters,” Journal of Cybernetics, vol. 3, pp. 32–57, 1973.
[54] J. C. Dunn, “Well-separated clusters and optimal fuzzy partitions,” Journal of Cybernetics, vol. 4, no. 1, pp. 95–104, 1974.
[55] D. L. Davies and D. W. Bouldin, “A cluster separation measure,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 1, no. 2, pp. 224–227, 1979.
[56] C. Chou, M. Su, and E. Lai, “A new cluster validity measure and its application to image compression,” Pattern Analysis and Applications, vol. 7, pp. 205–220, 2004.
[57] A. K. Jain, R. P. W. Duin, and J. Mao, “Statistical pattern recognition: A review,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 1, pp. 4–37, 2000.
[58] M. K. Pakhira, S. Bandyopadhyay, and U. Maulik, “Validity index for crisp and fuzzy clusters,” Pattern Recognition, vol. 37, no. 3, pp. 487–501, 2004.
[59] K. Bache and M. Lichman, “UCI machine learning repository,” Accessed on 2013.
[60] Speech and Image Processing Unit, School of Computing, University of Eastern Finland, “Clustering datasets,” Accessed on 2013.
QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top