Clustering Feature

A Clustering Feature (CF) is part of a CF-Tree, which is used in the BIRCH algorithm.

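It is a three-component summary statistic of a cluster:

$CF = (N, \vec{LS}, SS)$

where $N$ is the number of data points in the cluster, $\vec{LS} = \sum_{i=1}^{N} \vec{x}_i$ is the linear sum of the points in the cluster, and $SS = \sum_{i=1}^{N} \lVert \vec{x}_i \rVert^2$ is the square sum of the points in the cluster.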

From this statistic, several useful properties of a cluster can be derived without revisiting the individual points.

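Centroid: $\vec{x}_0 = \frac{\vec{LS}}{N}$

Radius: $R = \sqrt{\frac{SS}{N} - \left\lVert \frac{\vec{LS}}{N} \right\rVert^2}$

Diameter: $D = \sqrt{\frac{2N \cdot SS - 2 \lVert \vec{LS} \rVert^2}{N(N-1)}}$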
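
The three quantities can be computed from the CF alone. The following is a minimal sketch, assuming points are plain tuples of floats and the square sum is kept as a single scalar; the names `CF`, `cf_from_points`, `centroid`, `radius` and `diameter` are hypothetical and chosen only for illustration.

```python
import math
from typing import NamedTuple, Sequence, Tuple


class CF(NamedTuple):
    """Clustering feature: (N, LS, SS). Illustrative type, not a library API."""
    n: int                   # number of points in the cluster
    ls: Tuple[float, ...]    # linear sum of the points (one entry per dimension)
    ss: float                # sum of squared norms of the points (scalar)


def cf_from_points(points: Sequence[Tuple[float, ...]]) -> CF:
    """Build a CF by summing over a non-empty list of points."""
    dim = len(points[0])
    ls = tuple(sum(p[d] for p in points) for d in range(dim))
    ss = sum(x * x for p in points for x in p)
    return CF(len(points), ls, ss)


def centroid(cf: CF) -> Tuple[float, ...]:
    # x0 = LS / N
    return tuple(x / cf.n for x in cf.ls)


def radius(cf: CF) -> float:
    # R = sqrt(SS / N - ||LS / N||^2); clamp at 0 to absorb rounding error
    c_sq = sum(x * x for x in centroid(cf))
    return math.sqrt(max(cf.ss / cf.n - c_sq, 0.0))


def diameter(cf: CF) -> float:
    # D = sqrt((2 N SS - 2 ||LS||^2) / (N (N - 1))); defined here as 0 for N < 2
    if cf.n < 2:
        return 0.0
    ls_sq = sum(x * x for x in cf.ls)
    value = (2 * cf.n * cf.ss - 2 * ls_sq) / (cf.n * (cf.n - 1))
    return math.sqrt(max(value, 0.0))


cf = cf_from_points([(0.0, 0.0), (2.0, 0.0)])
print(centroid(cf), radius(cf), diameter(cf))
```

For the two points `(0, 0)` and `(2, 0)`, the centroid lies midway between them, the radius is the distance from the centroid to either point, and the diameter is the distance between the two points.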

Two clusters can be merged simply by adding their CFs component-wise: $CF_1 + CF_2 = (N_1 + N_2, \vec{LS}_1 + \vec{LS}_2, SS_1 + SS_2)$.

Example
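
One possible illustration of merging by addition, as a minimal sketch: the points are made up, and the names `CF`, `cf_of` and `merge` are hypothetical helpers rather than part of any library. It checks that adding the CFs of two clusters gives the same result as computing the CF of all their points together.

```python
from typing import NamedTuple, Sequence, Tuple


class CF(NamedTuple):
    """Clustering feature: (N, LS, SS). Illustrative type only."""
    n: int
    ls: Tuple[float, ...]
    ss: float


def cf_of(points: Sequence[Tuple[float, ...]]) -> CF:
    """Compute the CF of a non-empty list of points."""
    dim = len(points[0])
    ls = tuple(sum(p[d] for p in points) for d in range(dim))
    ss = sum(x * x for p in points for x in p)
    return CF(len(points), ls, ss)


def merge(a: CF, b: CF) -> CF:
    """Component-wise sum of two CFs.

    Note: `a + b` on a NamedTuple would concatenate the tuples,
    so the addition is spelled out explicitly.
    """
    return CF(a.n + b.n, tuple(x + y for x, y in zip(a.ls, b.ls)), a.ss + b.ss)


cluster_a = [(1.0, 2.0), (3.0, 4.0)]   # made-up sample points
cluster_b = [(5.0, 6.0)]

merged = merge(cf_of(cluster_a), cf_of(cluster_b))
direct = cf_of(cluster_a + cluster_b)

print(merged)
print(merged == direct)
```

Because the merge only needs the two CFs, the individual points of either cluster do not have to be kept around.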