Can anyone know how to see the data in a cluster after doing k-means clustering?-CodePudding

Is there any code to see or view the data in a cluster after doing k-means clustering in python, so that i can know which type of data clustered into which cluster and why.

help me with this ?

The cluster file is in .File extension, so I am unable to open it.

CodePudding user response：

It depends on how you are doing Kmeans... however... the attribute which shows the categorical cluster assignments (or "labels") are:

KMeans().fit().labels_

Code: (

Kmean.labels_

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

If you were to put the X, X1 and labels_ into a dataframe, it would look like this:

           X        X1  Labels
0  -1.918458 -1.918458       0
1  -1.378906 -1.378906       0
2  -0.888738 -0.888738       0
3  -1.924301 -1.924301       0
4  -0.619357 -0.619357       0
..       ...       ...     ...
95  1.893219  1.893219       1
96  2.820921  2.820921       1
97  2.454180  2.454180       1
98  1.599229  1.599229       1
99  2.270729  2.270729       1

[100 rows x 3 columns]

CodePudding user response：

Any predicted values or any values has features to make color maps in general can do this all of what you need is to make color equals to your color theme list and your labeler like this (the relabeler is just for make ground truth data colors like the predicted ones):

MyColorTheme = np.array(["darkgrey", "lightsalmon", "powderblue"])
MyRelabeler = np.choose(MyCluster.labels_, [2, 0, 1]).astype(np.int64)

plt.subplot(1, 2, 1)
plt.title("My Ground Truth Classification Module")
plt.scatter(x = MyDataFrame[["Petal Length"]], y = MyDataFrame[["Petal 
Width"]], c = MyColorTheme[MyData.target], s = 50)

plt.subplot(1, 2, 2)
plt.title("K clustring Classification Module")
plt.scatter(x = MyDataFrame[["Petal Length"]], y = MyDataFrame[["Petal 
Width"]], c = MyColorTheme[MyRelabeler], s = 50)

And the output will be like this

this is a iris SpectralClustering method from sklearn