Conference Proceedings

RT-DBSCAN: Real-time parallel clustering of spatio-temporal data using spark-streaming

Y Gong, RO Sinnott, P Rimba

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Springer Nature | Published : 2018

Abstract

© Springer International Publishing AG, part of Springer Nature 2018. Clustering algorithms are essential for many big data applications involving point-based data, e.g. user-generated social media data from platforms such as Twitter. One of the most common approaches to clustering is DBSCAN. However, DBSCAN has numerous limitations. The algorithm itself is based on traversing the whole dataset and identifying the neighbours around each point. This approach is not suitable when data is created and streamed in real time; instead, a more dynamic approach is required. This paper presents a new approach, RT-DBSCAN, that supports real-time clustering of data based on continuous cluster ch..
