The K-means algorithm, which partitions observations into clusters, is widely used to analyze very large data sets in data mining. A major drawback of K-means is its long execution time, especially on large data sets. Software parallelization methods such as OpenMP, MPI, CUDA, or MPI-CUDA achieve only limited acceleration over sequential code, because the execution paradigm of the K-means algorithm is better suited to hardware pipelining. In this research, the algorithm is mapped to Field Programmable Gate Array (FPGA) hardware using an RTL implementation that overlaps the assignment and update steps and balances throughput between them. The implementations are developed on two platforms, Gidel and the Amazon Web Services F1 instance, both of which provide hardware infrastructure and software development toolkits. The results of this research show that a hardware implementation can achieve at least two orders of magnitude of speedup and scales to extremely large data sets.
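For reference, the two steps the hardware implementation overlaps, assignment and update, can be sketched in software as the classic Lloyd's-algorithm iteration. This is a minimal illustrative Python sketch, not the thesis's RTL design; the function name and sample data are hypothetical:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Illustrative Lloyd's K-means: alternate assignment and update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize from the data (one common choice)
    labels = []
    for _ in range(iters):
        # Assignment step: label each point with its nearest centroid
        # (squared Euclidean distance).
        labels = []
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            labels.append(dists.index(min(dists)))
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels

# Hypothetical example: two well-separated 1-D clusters.
pts = [(0.0,), (1.0,), (0.5,), (10.0,), (11.0,), (10.5,)]
cents, labs = kmeans(pts, 2)
```

In software, each assignment pass must finish before the update pass that consumes its labels, which is what limits parallel speedup; the FPGA design instead streams points through pipelined assignment and update hardware so the two steps proceed concurrently.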
- Professor Miriam Leeser (Advisor)
- Professor Mieczyslaw M. Kokar
- Professor Ningfang Mi