Scalable Rough C-Means clustering using Firefly algorithm

Balakrushna Tripathy, Abhilash Namdev

Abstract


Our main interest is in addressing the disadvantages of older clustering algorithms and developing a method whose clusters give better results than previous approaches. First, we analyze the limitations of the most widely used clustering algorithm, taking the k-means algorithm as our example. To obtain good results from the initial stage of the algorithm, we use the firefly algorithm, a bio-inspired algorithm that searches for optimal minimum or maximum values based on certain parameters. To avoid the strict cluster boundaries of k-means, we choose the rough c-means algorithm, which provides some flexibility during the clustering process. The proposed method is efficient in terms of both time and space: we use efficient data structures that avoid wasting memory during computation, and the algorithm makes maximum use of the machine's resources to make execution as fast as possible.
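To illustrate the flexibility at cluster boundaries mentioned above, the following is a minimal sketch of one rough c-means iteration. It assumes the commonly used formulation in which a point that is nearly equidistant from two or more centroids is placed in the boundary region of each, rather than being forced into a single cluster. The names `threshold` and `w_lower`, and the default weight of 0.7, are hypothetical choices for illustration and are not taken from the paper; the firefly-based initialization and the multithreading are not shown.

```python
import math


def rough_assign(points, centroids, threshold):
    """Split point indices into per-cluster lower approximations and boundary regions.

    Assumption: a point is ambiguous when another centroid is at most
    `threshold` farther away than the nearest centroid.
    """
    k = len(centroids)
    lower = [[] for _ in range(k)]
    boundary = [[] for _ in range(k)]
    for idx, p in enumerate(points):
        dists = [math.dist(p, c) for c in centroids]
        nearest = min(range(k), key=dists.__getitem__)
        close = [j for j in range(k)
                 if j != nearest and dists[j] - dists[nearest] <= threshold]
        if close:
            # Ambiguous point: it joins the boundary of every candidate cluster.
            for j in [nearest] + close:
                boundary[j].append(idx)
        else:
            # Unambiguous point: it definitely belongs to the nearest cluster.
            lower[nearest].append(idx)
    return lower, boundary


def mean(points, indices):
    """Coordinate-wise mean of the selected points."""
    dim = len(points[0])
    return [sum(points[i][d] for i in indices) / len(indices) for d in range(dim)]


def update_centroids(points, centroids, lower, boundary, w_lower=0.7):
    """Recompute centroids as a weighted mix of lower and boundary means."""
    new_centroids = []
    for j, old in enumerate(centroids):
        lo, bd = lower[j], boundary[j]
        if lo and bd:
            m_lo, m_bd = mean(points, lo), mean(points, bd)
            new_centroids.append(
                [w_lower * a + (1 - w_lower) * b for a, b in zip(m_lo, m_bd)])
        elif lo:
            new_centroids.append(mean(points, lo))
        elif bd:
            new_centroids.append(mean(points, bd))
        else:
            # Empty cluster: keep the previous centroid unchanged.
            new_centroids.append(list(old))
    return new_centroids


points = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 0.5)]
centroids = [(0, 0.5), (10, 0.5)]
lower, boundary = rough_assign(points, centroids, threshold=1.0)
new_centroids = update_centroids(points, centroids, lower, boundary)
```

In this toy input the point (5, 0.5) is equidistant from both centroids, so it lands in both boundary regions and pulls each centroid toward it with the smaller weight, while the other four points stay in the lower approximations.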


Keywords


Clustering, datasets, firefly, threads, k-means, rough c-means





ISSN: 1694-2507 (Print)

ISSN: 1694-2108 (Online)