I am trying to use a SOM to cluster my data. First I want to find the best K, but I need a line or some other marker on the plot to identify it. I tried KElbowVisualizer(), but it always displays an error:
YellowbrickTypeError: The supplied model is not a clustering estimator; try a classifier or regression score visualizer instead!
Here is my code:
from sklearn_som.som import SOM
from yellowbrick.cluster import KElbowVisualizer
som = SOM(m = 1, n = i, dim = data.shape[1])
visualizer = KElbowVisualizer(som, k = (1,11))
visualizer.fit(data)
visualizer.show()
I also tried the ordinary plot() from matplotlib, but I cannot see the best K in it. My code:
import matplotlib.pyplot as plt
inertia = []
# fit one SOM per candidate cluster count and record its inertia
for i in range(1, 31):
    som = SOM(m = 1, n = i, dim = data.shape[1])
    som.fit_predict(data)
    inertia.append(som.inertia_)
plt.plot(range(1, 31), inertia)
plt.title('elbow method for SOM')
plt.xlabel('number of clusters')
plt.ylabel('WCSS')
plt.show()
That's the plot I got from plot().
So how can I mark the best K, either on the plot or in code?
CodePudding user response:
I have just found a solution to my question and decided to post it here in case someone else needs it.
The solution is to use the kneed library together with matplotlib:
from kneed import KneeLocator
The implementation looks like this:
inertia = []
# fit one SOM per candidate cluster count and record its inertia
for i in range(1, 31):
    som = SOM(m = 1, n = i, dim = x_lda_train.shape[1])
    som.fit_predict(x_lda_train)
    inertia.append(som.inertia_)
# identify the knee with KneeLocator; x and y must cover the same 30 values
kneeloc1 = KneeLocator(range(1, 31), inertia, curve='convex', direction='decreasing')
plt.plot(range(1, 31), inertia)
plt.title('elbow method for SOM')
plt.xlabel('number of clusters')
plt.ylabel('WCSS')
# mark the knee on the plot with a dashed vertical line
plt.vlines(kneeloc1.knee, plt.ylim()[0], plt.ylim()[1], linestyles='dashed')
plt.show()
# the knee is also available as a plain number
print(kneeloc1.knee)
For more information, see the documentation: https://kneed.readthedocs.io/en/stable/parameters.html
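If installing kneed is not an option, a rough stand-in can be sketched in plain Python. Note that this is not the algorithm kneed implements. It is a simpler heuristic that rescales the curve to the unit square and picks the point farthest from the straight line joining the first and last points. The function name find_elbow and the inertia values below are made up for illustration, and the sketch assumes the x values are in ascending order.

```python
def find_elbow(xs, ys):
    """Return the x value whose point lies farthest from the chord
    joining the first and last points of the curve (hypothetical helper)."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least three (x, y) pairs of equal length")

    # Rescale both axes to [0, 1] so the result does not depend on units.
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    if x_max == x_min or y_max == y_min:
        raise ValueError("curve is flat; no elbow to find")
    nx = [(x - x_min) / (x_max - x_min) for x in xs]
    ny = [(y - y_min) / (y_max - y_min) for y in ys]

    # Chord from the first to the last point of the rescaled curve.
    dx, dy = nx[-1] - nx[0], ny[-1] - ny[0]
    length = (dx * dx + dy * dy) ** 0.5

    # Perpendicular distance of every point to that chord.
    distances = [
        abs(dy * (px - nx[0]) - dx * (py - ny[0])) / length
        for px, py in zip(nx, ny)
    ]
    best = max(range(len(distances)), key=distances.__getitem__)
    return xs[best]


# Made-up, inertia-like values that drop quickly and then flatten out.
ks = list(range(1, 11))
fake_inertia = [1000, 520, 260, 140, 110, 95, 85, 78, 73, 70]
print(find_elbow(ks, fake_inertia))
```

With real data you would pass the same range and inertia list used for the plot, then draw the returned value with plt.vlines as above. The heuristic can disagree with kneed on noisy curves, so treat it as a sanity check rather than a replacement.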