I have a dataset with a column for Date that looks like this:
| Date | Another column |
| -------- | -------------- |
| 1.2019 | row1 |
| 2.2019 | row2 |
| 11.2018 | row3 |
| 8.2021 | row4 |
| 6.2021 | row5 |
The `Date` column is interpreted as a float dtype, but in reality `1.2019` means month 1 (that is, January) of the year 2019. I changed it to string type and that seemed to work. However, I want to plot this data against the total count of something, which is the second column of the dataset, and when I plot it the x-axis is not ordered. Well, why would it be? There is no ordering relationship between the strings `1.2019` and `2.2019`: there is no way to know that the first is January 2019 and the second is February 2019. I thought of using regex, or even mapping `1.2019` to `jan-2019`, but the problem persists: the strings carry no date ordering. I know `datetime` exists, but I don't know if it would help me.
How can I proceed? It is probably very easy, but I am stuck here!
CodePudding user response:
Convert to datetime with `pandas.to_datetime`:
df['Date'] = pd.to_datetime(df['Date'].astype(str), format='%m.%Y')
or, if your pandas version refuses to convert when the day is missing, prepend a day to each value first:
df['Date'] = pd.to_datetime('1.' + df['Date'].astype(str), format='%d.%m.%Y')
output:
Date Another column
0 2019-01-01 row1
1 2019-02-01 row2
2 2018-11-01 row3
3 2021-08-01 row4
4 2021-06-01 row5
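
One caveat if the column really is stored as float: a year ending in zero loses that digit, because `6.2020` is the same float as `6.202` and `str()` gives `'6.202'`. The sketch below is one possible way around that, not the only one: it formats each value with exactly four decimals before parsing, then sorts by date so the rows are in chronological order. The sample frame and the `Count` column name are assumptions made for illustration, and a 2020 value is included to show the trailing-zero case.

```python
import pandas as pd

# Hypothetical sample data modelled on the question; "Count" is an assumed column name.
df = pd.DataFrame({
    "Date": [1.2019, 2.2019, 11.2018, 8.2021, 6.2020],
    "Count": [5, 3, 8, 2, 7],
})

# Format with exactly four decimals so the year keeps its trailing zero
# (plain str(6.2020) would give '6.202').
as_text = df["Date"].map("{:.4f}".format)

# Prefix a day so every value is a complete date before parsing.
df["Date"] = pd.to_datetime("1." + as_text, format="%d.%m.%Y")

# Sort chronologically so a plot's x-axis runs in date order.
df = df.sort_values("Date").reset_index(drop=True)
print(df)
```

Once the column is a real datetime and sorted, something like `df.plot(x="Date", y="Count")` should draw the x-axis in date order. If the data comes from a CSV, reading the column as text in the first place (for example with `dtype={"Date": str}` in `pd.read_csv`) would avoid the float round trip entirely.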