I have created a bar plot with percentages. However, since there is a possibility of attrition, I would like to include N, the number of observations or sample size, in brackets as part of the bar labels. In other words, N should be the count of baseline values and the count of endline values, respectively.
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter
import seaborn as sns
import pandas as pd
import numpy as np
data = {
    'id': [1, 1, 2, 3, 3, 4, 4, 5, 6, 6, 7, 7, 8, 8, 9, 10, 10, 11, 11, 12, 12, 13, 13, 14, 14, 15, 15],
    'survey': ['baseline', 'endline', 'baseline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', ],
    'growth': [1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0]
}
df = pd.DataFrame(data)
sns.set_style('white')
ax = sns.barplot(data = df,
                 x = 'survey', y = 'growth',
                 estimator = lambda x: np.sum(x) / np.size(x) * 100, ci = None,
                 color = 'cornflowerblue')
ax.bar_label(ax.containers[0], fmt = '%.1f %%', fontsize = 20)
sns.despine(ax = ax, left = True)
ax.grid(True, axis = 'y')
ax.yaxis.set_major_formatter(PercentFormatter(100))
ax.set_xlabel('')
ax.set_ylabel('')
plt.tight_layout()
plt.show()
I would appreciate guidance on how to achieve this. Thanks in advance!
CodePudding user response:
One approach could be as follows.
- First, use
On the italic type, see this SO post.
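For illustration, here is a minimal sketch of one way this could be done. It is not necessarily the method intended above. It assumes the per-group percentage and count are computed with a pandas `groupby` and that custom strings are passed to `ax.bar_label` through its `labels` argument. The toy data, the helper name `bar_labels_with_n`, and the use of plain `ax.bar` instead of `sns.barplot` are assumptions made to keep the example small and self-contained. Writing `$N$` relies on matplotlib's mathtext, which sets single letters in italics.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import pandas as pd


def bar_labels_with_n(df, group="survey", value="growth"):
    """Hypothetical helper: per-group percentage, count, and label strings."""
    # mean of a 0/1 column is the share of ones; size is the group's N
    stats = df.groupby(group)[value].agg(pct="mean", n="size")
    stats["pct"] *= 100
    labels = [f"{p:.1f} % ($N$ = {n})" for p, n in zip(stats["pct"], stats["n"])]
    return stats, labels


# Toy data for illustration only (not the data from the question)
toy = pd.DataFrame({
    "survey": ["baseline"] * 4 + ["endline"] * 3,
    "growth": [1, 1, 0, 1, 1, 0, 0],
})

stats, labels = bar_labels_with_n(toy)

fig, ax = plt.subplots()
bars = ax.bar(stats.index, stats["pct"], color="cornflowerblue")
# labels= overrides the default numeric labels, one string per bar
texts = ax.bar_label(bars, labels=labels, fontsize=20)
plt.tight_layout()
```

With the seaborn plot from the question, the same `labels` list could presumably be passed as `ax.bar_label(ax.containers[0], labels=labels, fontsize=20)`, provided the bar order matches the group order returned by `groupby` (alphabetical, so `baseline` then `endline` here).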