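K-Means Clustering
Begin by scaling the data. K-Means assigns points by Euclidean distance, so unscaled features with large ranges would dominate the result: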
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaled_X = scaler.fit_transform(X)
Create the model, fit it, and get a cluster label for each row:
from sklearn.cluster import KMeans
model = KMeans(n_clusters=2)
cluster_labels = model.fit_predict(scaled_X)
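The two steps above can be tried end to end without a dataset at hand. This is a minimal sketch, assuming synthetic data from make_blobs as a stand-in for X; the names X_demo, scaled_demo, demo_model, and demo_labels are illustrative, and random_state and n_init are set only so that repeated runs agree.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Synthetic stand-in for X: 200 points around 2 centers (an assumption, not the original data)
X_demo, _ = make_blobs(n_samples=200, centers=2, n_features=3, random_state=42)

# Scale each feature to zero mean and unit variance
scaled_demo = StandardScaler().fit_transform(X_demo)

# Fixed random_state and n_init so repeated runs give the same labels
demo_model = KMeans(n_clusters=2, n_init=10, random_state=42)
demo_labels = demo_model.fit_predict(scaled_demo)

# Number of points assigned to each cluster
cluster_sizes = np.bincount(demo_labels)
print(cluster_sizes)
```

Checking the cluster sizes is a quick sanity test: a cluster with only a handful of points often signals outliers or a poor choice of n_clusters.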
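The right number of clusters is not always clear. The elbow method (also called the knee method) helps find a good value: fit the model for a range of k and record the sum of squared distances (SSD) of the points to their cluster centers, which scikit-learn exposes as inertia_: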
ssd = []
for k in range(2,10):
    model = KMeans(n_clusters=k)
    model.fit(scaled_X)
    # inertia_ is the sum of squared distances to the closest cluster center
    ssd.append(model.inertia_)
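Once the SSD values are collected, the elbow is the k after which adding clusters stops helping much. One way to read it off without a plot is to look at how much the SSD drops at each step. This is a rough sketch on synthetic data (make_blobs with 4 centers is an assumption, as are the names k_values, ssd_demo, and drops), not a definitive selection rule.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Synthetic stand-in for scaled_X: 300 points around 4 centers (assumption)
X_demo, _ = make_blobs(n_samples=300, centers=4, random_state=0)
scaled_demo = StandardScaler().fit_transform(X_demo)

k_values = list(range(2, 10))
ssd_demo = []
for k in k_values:
    km = KMeans(n_clusters=k, n_init=10, random_state=0)
    km.fit(scaled_demo)
    ssd_demo.append(km.inertia_)

# Drop in SSD when moving from one k to the next;
# a large drop means the extra cluster captured real structure
drops = -np.diff(ssd_demo)
for k, drop in zip(k_values[1:], drops):
    print(f"k={k}: SSD decreased by {drop:.2f}")
```

The usual next step is to plot ssd against k and look for the bend. The printed drops carry the same information: the elbow sits roughly where the drops go from large to small.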
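Color quantization example. Cluster the pixels of an image by RGB value, then replace each pixel with its cluster center so the image uses only 6 colors:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg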
image = mpimg.imread('palm_trees.jpg')
plt.imshow(image)
(h,w,c) = image.shape
image_2d = image.reshape(h*w,c)
model = KMeans(n_clusters=6)
labels = model.fit_predict(image_2d)
rgb_codes = model.cluster_centers_.round(0).astype(int)
new_image = rgb_codes[labels]
new_image = np.reshape(new_image,(h,w,c))
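The same steps can be checked without the image file. The sketch below assumes a small random RGB array in place of palm_trees.jpg (the names demo_image, pixels, palette, and quantized are illustrative) and confirms that the result keeps the original shape while using at most 6 distinct colors.

```python
import numpy as np
from sklearn.cluster import KMeans

# Random 20x30 RGB image as a stand-in for the JPEG (assumption)
rng = np.random.default_rng(0)
demo_image = rng.integers(0, 256, size=(20, 30, 3), dtype=np.uint8)

# Flatten to one row per pixel, one column per color channel
h, w, c = demo_image.shape
pixels = demo_image.reshape(h * w, c)

km = KMeans(n_clusters=6, n_init=10, random_state=0)
pixel_labels = km.fit_predict(pixels)

# Each cluster center becomes one palette color
palette = km.cluster_centers_.round(0).astype(np.uint8)

# Look up every pixel's palette color, then restore the image shape
quantized = palette[pixel_labels].reshape(h, w, c)

# Count the distinct colors left in the quantized image
n_colors = len(np.unique(quantized.reshape(-1, c), axis=0))
print(quantized.shape, n_colors)
```

Passing the quantized array to plt.imshow displays the reduced-color image next to the original for comparison.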
