Efficient Computation of Fitness Function for Evolutionary Clustering
Abstract
Evolutionary algorithms (EAs) are random search heuristics which can solve various optimization problems. There are plenty of papers describing different approaches developed to apply evolutionary algorithms to the clustering problem, although none of them addressed the problem of fitness function computation. In clustering, many clustering validity indices exist that are designed to evaluate quality of resulting points partition. It is hard to use them as a fitness function due to their computational complexity. In this paper, we propose an efficient method for iterative computation of clustering validity indices which makes application of the EAs to this problem much more appropriate than it was before.References
Kaufman, L. and Rousseeuw, P. J. 2009. Finding groups in data: an introduction to cluster analysis, vol. 344. John Wiley & Sons, USA.
Jain, A. K. and Dubes, R. C. 1998. Algorithms for clustering data, vol. 6. Prentice Hall, Englewood Cliffs, NJ, USA.
Bigus, J. P. 1996. Data mining with neural networks: solving business problems from application development to decision support. McGraw-Hill, New York, USA.
Mecca, G., Raunich, S., and Pappalardo, A. 2007. A new algorithm for clustering search results. Data & Knowledge Engineering 62, 3, pp. 504-522.
Anderberg, M. R. 1973. Cluster analysis for applications: probability and mathematical statistics: a series of monographs and textbooks. Academic Press, USA.
Bonner, R. E. 1964. On some clustering techniques. IBM journal of research and development 8, 1, pp. 22-32.
Kleinberg, J. M. 2003. An impossibility theorem for clustering. In Advances in Neural Information Processing Systems 15 (NIPS 2002). MIT Press, pp. 463-470.
Arthur, D. and Vassilvitskii, S. 2007. k-means++: The advantages of careful seeding. In Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms.ACM, New Orleans, Louisiana, pp. 1027-1035.
Ester, M. et al. 1996. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining KDD 1996. AAAI Press, Portland, Oregon, pp. 226-231.
Chakrabarti, D., Kumar, R., and Tomkins, A. 2006. Evolutionary clustering. In Proceedings of the 12th ACM international conference on Knowledge discovery and data mining - SIGKDD 2006. ACM, Philadelphia, PA, pp. 554-560.
Arbelaitz, O. et al. 2013. An extensive comparative study of cluster validity indices. Pattern Recognition 46, 1, pp. 243-256.
Ng, A. Y., Jordan, M. I., and Weiss, Y. 2002. On spectral clustering: Analysis and an algorithm. Advances in neural information processing systems 14, 2, pp. 849-856.
Doer, B. and Doer, C. 2015. Optimal Parameter Choices Through Self-Adjustment: Applying the 1/5th. In Proceedings of the Genetic and Evolutionary Computation Conference - GECCO 2015. ACM, pp. 1335-1342.
Buzdalov, M. and Buzdalova, A. 2013. Adaptive Selection of Helper-Objectives for Test Case Generation. In 2013 IEEE Congress on Evolutionary Computation. IEEE, pp. 2245-2250. DOI: 10.1109/CEC.2013.6557836
Hruschka E. R. et al. 2009. A survey of evolutionary algorithms for clustering. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 2, 39, pp. 133-155.
MENDEL open access articles are normally published under a Creative Commons Attribution-NonCommercial-ShareAlike (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/ . Under the CC BY-NC-SA 4.0 license permitted 3rd party reuse is only applicable for non-commercial purposes. Articles posted under the CC BY-NC-SA 4.0 license allow users to share, copy, and redistribute the material in any medium of format, and adapt, remix, transform, and build upon the material for any purpose. Reusing under the CC BY-NC-SA 4.0 license requires that appropriate attribution to the source of the material must be included along with a link to the license, with any changes made to the original material indicated.