Context: I'm plotting accuracy values for several models and I'd like to show each accuracy value (from the list results) on the corresponding model's bar. How can I do that, taking into consideration that I'm sorting from highest accuracy to lowest? Thank you!
results = [lr_cv[1],svm_cv[1], decision_tree[1],score_log_reg_pca,nb_cv[1]] # list of the scores of the fitted models
names = ["Logistic Regression","SVM", "Decision Tree","Logistic Regression with PCA","Naive Bayes"] # model names for the chart
df_new = pd.DataFrame(list(zip(names, results)), columns= ["Model","Accuracy"])
df_sorted = df_new.sort_values("Accuracy")
df_sorted.index=df_sorted.Model
plt.figure(figsize=(12,7))
ax = df_sorted.plot(kind="barh", facecolor="#AA0000",figsize=(15,10), fontsize=12)
ax.spines["bottom"].set_color("#CCCCCC")
ax.set_xlabel("Accuracy", fontsize=12)
ax.set_ylabel("Model",fontsize=12)
plt.title("Comparison of Classification Models")
Answer:
If I understood correctly, you want to add text at the end of each bar showing the accuracy achieved by that model. You can do this with the text() method of the Axes object (ax). I have modified your code as follows (I tried it with some dummy values for the accuracy):
import pandas as pd
import matplotlib.pyplot as plt
results = [lr_cv[1],svm_cv[1], decision_tree[1],score_log_reg_pca,nb_cv[1]] # list of the scores of the fitted models
names = ["Logistic Regression","SVM", "Decision Tree","Logistic Regression with PCA","Naive Bayes"] # model names for the chart
df_new = pd.DataFrame(list(zip(names, results)), columns= ["Model","Accuracy"])
df_sorted = df_new.sort_values("Accuracy")
df_sorted.index=df_sorted.Model
plt.figure(figsize=(12,7))
ax = df_sorted.plot(kind="barh", facecolor="#AA0000",figsize=(15,10), fontsize=12)
gap = 0.015 # Space between the end of the bar and the text
# Call ax.text() once per bar. The bars are already sorted,
# so the enumerate index i is also the bar's y position.
for i, v in enumerate(df_sorted.Accuracy):
    ax.text(v + gap, i, str(v), color='blue') # Place the text at x = v + gap, y = i
ax.spines["bottom"].set_color("#CCCCCC")
ax.set_xlabel("Accuracy", fontsize=12)
ax.set_ylabel("Model",fontsize=12)
plt.title("Comparison of Classification Models")
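For anyone who wants to try the ax.text() approach without the original model objects, here is a minimal self-contained sketch. The accuracy numbers are made up purely for illustration (they are not the asker's results), and I use color= instead of facecolor= and set_index() instead of assigning the index, which are my own choices rather than part of the code above.

```python
import matplotlib.pyplot as plt
import pandas as pd

names = ["Logistic Regression", "SVM", "Decision Tree",
         "Logistic Regression with PCA", "Naive Bayes"]
# Hypothetical accuracy values, for illustration only
results = [0.91, 0.88, 0.80, 0.86, 0.75]

# Ascending sort: with a horizontal bar chart the last row is drawn at the top,
# so the most accurate model ends up first when reading top to bottom.
df_sorted = (
    pd.DataFrame({"Model": names, "Accuracy": results})
    .sort_values("Accuracy")
    .set_index("Model")
)

ax = df_sorted.plot(kind="barh", color="#AA0000", figsize=(15, 10),
                    fontsize=12, legend=False)

gap = 0.015  # horizontal space between the end of the bar and the text
for i, v in enumerate(df_sorted["Accuracy"]):
    # Bar i is centred at y = i; va="center" aligns the text with the bar
    ax.text(v + gap, i, f"{v:.2f}", color="blue", va="center")

ax.set_xlabel("Accuracy", fontsize=12)
ax.set_ylabel("Model", fontsize=12)
# plt.show()  # uncomment in an interactive session
```

Formatting with f"{v:.2f}" instead of str(v) keeps long floats such as cross-validation means from cluttering the chart.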
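Another option is to adapt the code so that it does not use the plot() method of pandas and instead plots with Matplotlib directly and labels the bars with Axes.bar_label() (available in Matplotlib 3.4 and later), as follows: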
import pandas as pd
import matplotlib.pyplot as plt
results = [lr_cv[1],svm_cv[1], decision_tree[1],score_log_reg_pca,nb_cv[1]] # list of the scores of the fitted models
names = ["Logistic Regression","SVM", "Decision Tree","Logistic Regression with PCA","Naive Bayes"] # model names for the chart
df_new = pd.DataFrame(list(zip(names, results)), columns= ["Model","Accuracy"])
df_sorted = df_new.sort_values("Accuracy")
df_sorted.index=df_sorted.Model
fig, ax = plt.subplots(figsize=(12,7)) # create the figure and get the Axes handle
bars = ax.barh(df_sorted.Model, df_sorted.Accuracy) # plot the horizontal bars
ax.bar_label(bars) # label each bar with its value
ax.spines["bottom"].set_color("#CCCCCC")
ax.set_xlabel("Accuracy", fontsize=12)
ax.set_ylabel("Model",fontsize=12)
plt.title("Comparison of Classification Models")
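A runnable sketch of the bar_label() variant, again with made-up accuracy values used only for illustration. The fmt and padding arguments and the set_xlim() call are optional extras I added to tidy up the labels; they are not required for the approach to work.

```python
import matplotlib.pyplot as plt
import pandas as pd

names = ["Logistic Regression", "SVM", "Decision Tree",
         "Logistic Regression with PCA", "Naive Bayes"]
# Hypothetical accuracy values, for illustration only
results = [0.91, 0.88, 0.80, 0.86, 0.75]

df_sorted = (
    pd.DataFrame({"Model": names, "Accuracy": results})
    .sort_values("Accuracy")
)

fig, ax = plt.subplots(figsize=(12, 7))
bars = ax.barh(df_sorted["Model"], df_sorted["Accuracy"], color="#AA0000")

# bar_label() needs Matplotlib 3.4 or later.
# fmt controls the number format, padding is the gap in points.
labels = ax.bar_label(bars, fmt="%.2f", padding=3)

ax.set_xlim(0, 1.0)  # leave room on the right so the labels are not clipped
ax.set_xlabel("Accuracy", fontsize=12)
ax.set_ylabel("Model", fontsize=12)
# plt.show()  # uncomment in an interactive session
```

Compared with the manual ax.text() loop, bar_label() takes the positions and values from the bar container itself, so there is no index bookkeeping to get wrong if the sort order changes.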