Conference Proceedings

Approximate data mining in very large relational data

JC Bezdek, RJ Hathaway, C Leckie, R Kotagiri

Proceedings of the 17th Australasian Database Conference | Published: 2006

Abstract

In this paper we discuss eNERF, an extended version of non-Euclidean relational fuzzy c-means (NERFCM) for approximate clustering in very large (unloadable) relational data. The eNERF procedure consists of four parts: (i) selection of distinguished features by algorithm DF, to be monitored during progressive sampling; (ii) progressive sampling of a square N×N relation matrix RN by algorithm PS until an n×n sample relation Rn passes a goodness-of-fit test; (iii) clustering of Rn using algorithm LNERF; and (iv) extension of the LNERF results to RN-Rn by algorithm xNERF, which uses an iterative procedure based on LNERF to compute fuzzy membership values for all of the objects remaining after LNERF ..
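Step (ii) of the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm PS. The function names, the `start` and `step` parameters, and the `accept` callback are all hypothetical. In particular, `accept` only stands in for the goodness-of-fit test, which the abstract does not specify, and the relation is held as a small in-memory list rather than an unloadable matrix.

```python
import random


def submatrix(relation, idx):
    """Return the square sample relation restricted to the rows/columns in idx."""
    return [[relation[i][j] for j in idx] for i in idx]


def progressive_sample(relation, start, step, accept, seed=0):
    """Grow a random index sample until accept(sample, idx) returns True.

    `accept` is a hypothetical placeholder for a goodness-of-fit test;
    the loop also stops once every object has been sampled.
    """
    rng = random.Random(seed)
    n_total = len(relation)
    order = list(range(n_total))
    rng.shuffle(order)  # fixed random order, so each sample extends the previous one
    n = min(start, n_total)
    while True:
        idx = sorted(order[:n])
        sample = submatrix(relation, idx)
        if accept(sample, idx) or n == n_total:
            return idx, sample
        n = min(n + step, n_total)


# Toy dissimilarity relation on 10 objects: R[i][j] = |i - j|.
R = [[abs(i - j) for j in range(10)] for i in range(10)]

# Placeholder acceptance rule (an assumption): stop at 6 or more sampled objects.
idx, Rn = progressive_sample(R, start=2, step=2, accept=lambda s, i: len(i) >= 6)
```

Because the sample grows along one fixed shuffled order, each candidate Rn contains the previous one, so work done on earlier samples is not discarded. A real acceptance test would replace the size-based placeholder.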
