K-means Algorithm for unsupervised classification

Overview of K-Means Clustering

We have implemented k-means clustering in both software and reconfigurable hardware. These pages give an overview of the k-means algorithms and some of the decisions to be made for an implementation.

Results from our C implementation. Click here for pictures !

What is k-means clustering?

"There are several variants of the k-means clustering algorithm, but most variants involve an iterative scheme that operates over a fixed number of clusters, while attempting to satisfy the following properties: [1]

Each class has a center which is the mean position of all the samples in that class.
Each sample is in the class whose center it is closest to."

We will use the following Notation

The basic k-means algorithm consists of the following steps:


  Initialize 

  Loop until  termination condition is met:

     1.  For each pixel in the image, assign that pixel to a class
         such that the  distance  from 
         this pixel to the center of that class is minimized.

     2.  For each class, recalculate the means of the class based on
         the pixels that belong to that class.

     end loop;

Traditionally these steps are done iteratively (although you can do on-the-fly calculations: see variants).

They cannot be pipelined because you must complete step 1 before step 2 and vice versa.

Parameters and options for the k-means algorithm:

Number of classes
Initialization
Distance Measure
Termination
Quality
Parallelism
Distance
What to do with dead classes
Variants : one pass
on-the-fly calculation of means
Our hardware implementation
References on k-means

Results from our C implementation. Click here for pictures !

This page is maintained by Prof. Miriam Leeser
Last updated September 16, 1999.
Email: mel@ece.neu.edu