I have a series of data with some values missing in the middle, like below:
As you can see, 2 is missing from the series.
I want to draw a box plot or a violin plot with a placeholder at position 2, to show that no data is present for it.
Right now I can get a plot by inserting 2 with NaN values, which gives a plot like the one below:
Is there a better way to do this without manipulating the data, either by using text on the x-axis or by just having a placeholder?
CodePudding user response:
Assuming X is a categorical column whose categories include the missing value, you can annotate the missing categories:
import numpy as np
import seaborn as sns

# df['X'] is assumed to be categorical, with 2 among its categories
ax = sns.boxplot(data=df, x='X', y='Y')
# positions of the categories on the x-axis
cats = {c: i for i, c in enumerate(df['X'].cat.categories)}
# categories that have no rows in the data
missing = set(df['X'].cat.categories) - set(df['X'])
# {2}
# midpoint of the y-axis
y_pos = np.mean(ax.get_ylim())
for x in missing:
    ax.annotate('N/A', (cats[x], y_pos), ha='center')
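Output: a box plot with 'N/A' drawn at the x position of each missing category (here, 2), at the vertical midpoint of the axis.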
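The snippet above relies on df['X'] already being a categorical column that lists 2 among its categories. Here is a minimal end-to-end sketch of that setup. The sample data, the category list [1, 2, 3, 4, 5] and the random seed are my own assumptions for illustration, not the data from the question:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical sample data: X takes the values 1, 3, 4 and 5, so 2 is absent.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "X": np.repeat([1, 3, 4, 5], 20),
    "Y": rng.normal(size=80),
})

# Declare every expected category, including the one with no rows.
# This only changes the dtype; no NaN rows are added to the data.
df["X"] = pd.Categorical(df["X"], categories=[1, 2, 3, 4, 5], ordered=True)

ax = sns.boxplot(data=df, x="X", y="Y")

# Map each category to its position on the x-axis.
cats = {c: i for i, c in enumerate(df["X"].cat.categories)}
# Categories that are declared but never occur in the data.
missing = set(df["X"].cat.categories) - set(df["X"])

# Write 'N/A' at the vertical midpoint of the axis for each empty category.
y_pos = np.mean(ax.get_ylim())
for x in missing:
    ax.annotate("N/A", (cats[x], y_pos), ha="center")
```

The same idea should carry over to sns.violinplot, since it also takes its x-axis order from the categorical dtype. Call plt.show() or ax.figure.savefig(...) to view the result.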