Hi all,
I've been a bit busy lately, so I haven't had much time to blog.
Several days ago, after the PageRank post, I had the idea to implement k-means clustering with Apache Hama and BSP.
Now I’ve decided to write a MapReduce implementation first, since it is very simple: read the centers in the mapper’s setup method, then in the map phase calculate the distance from each vector to the centers and assign the vector to the nearest one. In the reduce phase we calculate the new cluster centers from the vectors assigned to each center.
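To make the two phases a bit more concrete, here is a minimal sketch of one k-means iteration in plain Python. It does not use Hadoop at all; it only mimics the map and reduce steps in memory. The function names (`map_phase`, `reduce_phase`, `kmeans_iteration`) are made up for illustration, and Euclidean distance is an assumption, since any distance measure could be plugged in.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_phase(vectors, centers):
    """Emit (index of nearest center, vector) pairs, like a mapper would."""
    for vector in vectors:
        nearest = min(range(len(centers)), key=lambda i: dist(vector, centers[i]))
        yield nearest, vector


def reduce_phase(pairs, centers):
    """Group vectors by center index and average each group into a new center."""
    groups = defaultdict(list)
    for index, vector in pairs:
        groups[index].append(vector)
    # Assumption: a center that received no vectors simply stays where it was.
    new_centers = list(centers)
    for index, members in groups.items():
        new_centers[index] = tuple(sum(col) / len(members) for col in zip(*members))
    return new_centers


def kmeans_iteration(vectors, centers):
    """Run one map step followed by one reduce step."""
    return reduce_phase(map_phase(vectors, centers), centers)
```

In a real job this iteration would be repeated, feeding the new centers back in, until the centers stop moving (or a maximum number of iterations is reached).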
This is very straightforward. So this will be a series: first a MapReduce implementation, then a better one with BSP.
’Til then!
Greetzz