I have the following data:
    ID      Date  Chemical  Group
0    6  06-12-21        21      1
1   12  07-03-22        26      2
2   11  08-03-22        28      1
3   13  08-03-22        29      2
4   15  08-03-22        28      1
5   14  09-03-22        26      2
6    7  16-12-21        23      2
7    8  16-12-21        24      2
8    9  16-12-21        23      2
9   10  16-12-21        25      1
10   1  20-11-21        21      1
11   2  26-11-21        19      2
12   3  26-11-21        31      2
13   5  26-11-21        32      1
14   4  27-11-21        31      2
I set the format for the Date column, sort by it, and print the result:
maindf['Date'] = pd.to_datetime(maindf['Date'], format='%d-%m-%y')
maindf.sort_values('Date', inplace=True)
print(maindf)
The dates are properly recognised:
    ID        Date  Chemical  Group
10   1  2021-11-20        21      1
11   2  2021-11-26        19      2
12   3  2021-11-26        31      2
13   5  2021-11-26        32      1
14   4  2021-11-27        31      2
0    6  2021-12-06        21      1
6    7  2021-12-16        23      2
7    8  2021-12-16        24      2
8    9  2021-12-16        23      2
9   10  2021-12-16        25      1
1   12  2022-03-07        26      2
2   11  2022-03-08        28      1
3   13  2022-03-08        29      2
4   15  2022-03-08        28      1
5   14  2022-03-09        26      2
I draw boxplots:
import matplotlib.pyplot as plt
import seaborn as sns
sns.boxplot(data=maindf, x='Date', y='Chemical', hue='Group')
plt.xticks(rotation=20)
plt.show()
I get the following graph:
Although the dates are correct here, the x-axis labels also show the time (hours, minutes, seconds), all set to 0. I want to remove the time part, so I set a date format on the x-axis and display the graph again:
sns.boxplot(data=maindf, x='Date', y='Chemical', hue='Group')
from matplotlib.dates import DateFormatter
dtFmt = DateFormatter('%d-%m-%y') # define the formatting
plt.gca().xaxis.set_major_formatter(dtFmt)
plt.xticks(rotation=20)
plt.show()
Now I get the following graph:
The dates on the x-axis are now all wrong: they start from 1-1-1970! Where is the problem, and how can it be corrected? Thanks for your help.
Answer:
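The 1970 dates come from how seaborn draws the boxplot: it treats each date as a category and places the boxes at positions 0, 1, 2, ... on the x-axis. DateFormatter then interprets those positions as days since 01-01-1970, not as your actual dates.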
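To see where the 1970 labels come from, here is a minimal sketch that applies the same DateFormatter to the integer positions a categorical axis uses. It assumes matplotlib 3.3 or later with the default date epoch (1970-01-01); the `positions` list is only an illustration of where the boxes sit, not something seaborn exposes under that name.

```python
from matplotlib.dates import DateFormatter

# Illustration only: a categorical axis places the boxes at 0, 1, 2, ...
positions = [0, 1, 2]

fmt = DateFormatter('%d-%m-%y')

# DateFormatter reads each tick position as "days since the matplotlib epoch",
# so position 0 becomes the epoch itself rather than the first date in the data.
labels = [fmt(p) for p in positions]
print(labels)
```

With the default epoch the labels count up from 01-01-70, one day per box, which matches the wrong axis in the question.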
One way to fix it is to use strftime() AFTER you have sorted the data, to convert the datetime column back to strings, like this:
maindf['Date'] = pd.to_datetime(maindf['Date'], format='%d-%m-%y')  # day first, as in the question
maindf.sort_values('Date', inplace=True)
maindf['Date'] = maindf['Date'].dt.strftime('%d-%m-%Y') ## Change to format you need
sns.boxplot(data=maindf, x='Date', y='Chemical', hue='Group')
plt.xticks(rotation=40)
plt.show()
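The order of the two steps matters: once the column holds strings, sorting is alphabetical, not chronological. A small sketch with three of the dates from the question (the frame `df` and the two list names are made up for this illustration):

```python
import pandas as pd

df = pd.DataFrame({'Date': ['06-12-21', '07-03-22', '20-11-21']})
df['Date'] = pd.to_datetime(df['Date'], format='%d-%m-%y')

# Sort while the column is still datetime, then convert to text
df = df.sort_values('Date')
sorted_then_formatted = df['Date'].dt.strftime('%d-%m-%Y').tolist()

# Sorting the strings instead orders them by their leading day number
formatted_then_sorted = sorted(sorted_then_formatted)

print(sorted_then_formatted)   # chronological order
print(formatted_then_sorted)   # alphabetical order
```

The first list starts with 20-11-2021 and ends with 07-03-2022; the second starts with 06-12-2021, which would put December 2021 before November 2021 on the axis.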
The other option is to use get_xticklabels() and set_xticklabels() to strip everything from the T onwards in each label, so that only the date part is shown:
maindf['Date'] = pd.to_datetime(maindf['Date'], format='%d-%m-%y')  # day first, as in the question
maindf.sort_values('Date', inplace=True)
sns.boxplot(data=maindf, x='Date', y='Chemical', hue='Group')
plt.gca().set_xticklabels([date_text.get_text().split("T")[0] for date_text in plt.gca().get_xticklabels()])
plt.xticks(rotation=40)
plt.show()
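If you would rather not depend on the exact text seaborn puts in the tick labels, a variation is to build the labels from the data itself. This is only a sketch: `date_tick_labels` is a hypothetical helper, not part of pandas or seaborn, and it assumes the boxes are drawn in the order the dates first appear, which sorting the frame by Date makes chronological.

```python
import pandas as pd

def date_tick_labels(dates, fmt='%d-%m-%y'):
    """Return one formatted label per distinct date, in chronological order."""
    unique_sorted = pd.Series(dates).drop_duplicates().sort_values()
    return unique_sorted.dt.strftime(fmt).tolist()

dates = pd.to_datetime(['20-11-21', '26-11-21', '26-11-21', '06-12-21'],
                       format='%d-%m-%y')
labels = date_tick_labels(dates)
print(labels)

# Possible use with the plot above (assumes maindf is already sorted by 'Date'):
# ax = sns.boxplot(data=maindf, x='Date', y='Chemical', hue='Group')
# ax.set_xticklabels(date_tick_labels(maindf['Date']))
```

The number of labels equals the number of distinct dates, so there is one label per box position.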
Both options give tick labels that show only the date, without the time part.