Conference Proceedings

Scalable single linkage hierarchical clustering for big data

TC Havens, JC Bezdek, M Palaniswami

Proceedings of the 2013 IEEE 8th International Conference on Intelligent Sensors, Sensor Networks and Information Processing: Sensing the Future (ISSNIP 2013) | Published: 2013

Abstract

Personal computing technologies are everywhere; hence, there is an abundance of staggeringly large data sets: the Library of Congress has stored over 160 terabytes of web data, and it is estimated that Facebook alone logs nearly a petabyte of data per day. Thus, there is a pertinent need for systems by which one can elucidate the similarity and dissimilarity among and between groups in these big data sets. Clustering is one way to find these groups. In this paper, we extend the scalable Visual Assessment of Tendency (sVAT) algorithm to return single-linkage partitions of big data sets. The sVAT algorithm is designed to provide visual evidence of the number of clusters in unloadable (big) da..

