Image Compression with K Means Clustering

Josh Price
3 min readMar 13, 2021

First reaction to K means: “That’s it? That’s so much simpler than I’d expected!”

At the core, it’s two steps.

  • Assign data points to nearest centroid
  • Move the centroid to the middle of the data points.

And repeat. Elegantly simple!

The course materials (Stanford/Coursera Machine Learning) have us apply K Means as a method of image compression. Specifically, to cluster colors into K groups (resulting in an image using only K colors). I chose to do so with an image of Tom Waits.

Image credit unknown, sincere apologies

By grouping the colors represented in our image, we reduce the amount of data necessary to represent the image. This is inherently a “lossy” compression method, so image quality will be lost with file size.

Code will not be provided, so as to remain on the good side of the Coursera Honor Code.

With my algorithm set to run 20 iterations and a K of 16, we wound up with a well grouped compressed image, with no obvious signs of getting stuck at local minima.

Comparison of original image with a K=16 compressed image

Obviously, the lower with go with the K value, the more compressed our image will be, resulting in poorer quality. Let’s try a K value of 64.

Comparison of original image with a K=64 compressed image

It’s clear that there’s far less visible color distortion, however we’re left with a larger file size as a result. There is a noticeable lack of red (look at the cheeks) so it’s possible that in this run, we may have run into a local optima issue. This could be resolved fairly easily with an optimization loop that attempts various runs through the clustering process with random initial centroid points (choosing the one with the lowest final cost), but as it would be extremely time consuming and ultimately unnecessary for the learning process, I’ll be skipping that.

What I Learned

The actual mechanics behind K Means clustering is indeed much simpler than I would have expected. Unsupervised learning was something I had come into almost completely fresh and I have been happy to see that so far it’s all clear and easy to comprehend.

What’s Left to Learn

In this specific scenario, K Means seems an extremely inefficient way to compress an image compared to traditional methods. I’m certainly no image compression expert, but from the little I know, it seems that using such an approach for image compression in particular would require specific reasons to do so. While I can see countless applications for clustering and unsupervised learning in a general sense, I may want to research if this particular example was just an example or if there is an efficient use case for this approach for images.

--

--