Series: K-Means Clustering (MapReduce | BSP)

21st May 2011

Hi all,

I've been a bit busy lately, so I haven't had much time to blog.
Several days ago, after the PageRank post, I had the idea to implement k-means clustering with Apache Hama and BSP.
Now I’ve decided to write a MapReduce implementation of it first, since this is very simple: the mapper reads the current centers in its setup method, and in the map phase it calculates the distance from each vector to every center and assigns the vector to the nearest one. In the reduce phase we calculate the new cluster centers by averaging the vectors assigned to each center.
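
To make the map and reduce steps concrete, here is a minimal sketch of a single k-means iteration in plain Python. It is not Hadoop or Hama code: the function names (`nearest_center`, `map_phase`, `reduce_phase`, `kmeans_iteration`) are made up for illustration, vectors are assumed to be tuples of floats, and the distance is assumed to be Euclidean.

```python
import math
from collections import defaultdict


def nearest_center(vector, centers):
    """Return the index of the center closest to `vector` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(vector, centers[i]))


def map_phase(vectors, centers):
    """Emit (center index, vector) pairs, like a mapper that loaded the centers in setup."""
    for vector in vectors:
        yield nearest_center(vector, centers), vector


def reduce_phase(pairs, centers):
    """Group the vectors by center index and average each group into a new center."""
    groups = defaultdict(list)
    for index, vector in pairs:
        groups[index].append(vector)

    new_centers = []
    for i, old_center in enumerate(centers):
        members = groups.get(i)
        if not members:
            # Assumption: a center with no assigned vectors is kept unchanged.
            new_centers.append(old_center)
            continue
        dims = len(members[0])
        new_centers.append(
            tuple(sum(v[d] for v in members) / len(members) for d in range(dims))
        )
    return new_centers


def kmeans_iteration(vectors, centers):
    """Run one map phase followed by one reduce phase."""
    return reduce_phase(map_phase(vectors, centers), centers)
```

A full run would call `kmeans_iteration` repeatedly, feeding the returned centers back in, until the centers stop moving. In a real MapReduce job each of those iterations would be a separate job.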

This is very straightforward, so this will be a series: first a MapReduce implementation, then a better one with BSP.

Till then!

Greetzz


Thomas Jungblut
