Color problem while trying to plot KMeans clusters

Time:04-14

I am trying to plot my data with colours that correspond to the clusters, as shown below:

enter image description here

However, when I run this code, every point is drawn in the same colour:

model = KMeans(n_clusters = 2)
model.fit(projected_data)
labels = model.predict(projected_data)
plt.scatter(projected_data[0],projected_data[1],c='red')
plt.show()

enter image description here

Looking online, I found that changing c='red' to c=labels should fix the problem, but whenever I change the code to plt.scatter(projected_data[0],projected_data[1],c=labels) it gives me this error:

'c' argument has 2 elements, which is inconsistent with 'x' and 'y' with size 6.

How can I make the colours change dynamically (without typing out an array of six strings like c=['red','blue'...]) so that each cluster gets its own colour?

In case you want to test it yourself, the projected_data variable equals:

[[ 4  4 -6  3  1 -5]
 [ 0 -3  2 -1  5 -4]]

CodePudding user response:

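The c argument expects a sequence with one entry per plotted point. KMeans treats each row of its input as one sample, so your 2 x 6 array produces only 2 labels for 6 points. In the example below, X has one row per point and kmeans.labels_ is used as the colour parameter. You could change the colour map or map the cluster labels to the colour strings you want.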

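from sklearn.cluster import KMeans
import numpy as np
import matplotlib.pyplot as plt  # needed for plt.scatter below

# Six samples, one per row, two features each
X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]])
kmeans = KMeans(n_clusters=2, random_state=0).fit(X)

# Split the (6, 2) array into x and y coordinates
x, y = X.T

print(kmeans.labels_)
# One label per row of X, so c has the same length as x and y
plt.scatter(x, y, c=kmeans.labels_)
plt.show()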

Output

[1 1 1 0 0 0]

enter image description here
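
Applied to the data from the question, the same idea might look like the sketch below. It assumes that row 0 of projected_data holds the x values and row 1 the y values, which is how the question's plt.scatter call uses them, so the array is transposed before fitting. The names points, palette, point_colours and plot_clusters, and the n_init and random_state settings, are my own choices for illustration and not part of the original code.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Data from the question: row 0 = x values, row 1 = y values (assumed).
projected_data = np.array([[4, 4, -6, 3, 1, -5],
                           [0, -3, 2, -1, 5, -4]])

# KMeans treats each row as one sample, so transpose to shape (6, 2).
points = projected_data.T

# n_init and random_state are set explicitly only for repeatable runs.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(points)  # one integer label per point

# Optional: map each cluster label to a named colour by indexing.
palette = np.array(["red", "blue"])
point_colours = palette[labels]


def plot_clusters():
    # c=labels would also work and would use the default colour map.
    plt.scatter(points[:, 0], points[:, 1], c=point_colours)
    plt.show()


if __name__ == "__main__":
    plot_clusters()
```

Because labels now has six entries, it matches the six x and y values and the size mismatch error should no longer appear. Which cluster ends up red and which blue depends on how KMeans numbers the clusters.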
