I hope someone can help me with the following: I'm trying to convert my data to daily averages using:
df['timestamp'] = pd.to_datetime(df['Datum WSM-09'])
df_daily_avg = df.groupby(pd.Grouper(freq='D', key='timestamp')).mean()
df['Datum WSM-09'] looks like this:
0 6-3-2020 12:30
1 6-3-2020 12:40
2 6-3-2020 12:50
3 6-3-2020 13:00
4 6-3-2020 13:10
...
106785 18-3-2022 02:00
106786 18-3-2022 02:10
106787 18-3-2022 02:20
106788 18-3-2022 02:30
106789 18-3-2022 02:40
Name: Datum WSM-09, Length: 106790, dtype: object
However, when executing the first line the data under "timestamp" is inconsistent. The last rows displayed in the picture are correct. For the first ones, it should be 2020-03-06 12:30. The month and the day are switched around.
Many thanks
CodePudding user response:
Your source column is apparently object / text. The March 18th timestamps are unambiguous, as there are fewer than 18 months in the year. The March 6th timestamps are ambiguous: 6-3-2020 can be read as either March 6th or June 3rd, and the parser read it month-first.
The source column is ambiguously formatted, yet you did not specify a timestamp format.
Please see the documentation for pd.to_datetime:
format : str, default None
The strftime to parse time, e.g. "%d/%m/%Y". Note that "%f" will parse all the way up to nanoseconds. See strftime documentation for more information on choices.
You tried offering a value of None, which is not a good match to your business needs.
I don't know what all of your input data looks like, but perhaps %d-%m-%Y %H:%M would better match your needs.
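As a minimal sketch of that suggestion, assuming every row follows the day-month-year pattern shown in the question: the sample rows and the "value" column below are invented for illustration, since the real measurement columns are not shown.

```python
import pandas as pd

# Hypothetical sample mimicking the 'Datum WSM-09' column from the question.
# The 'value' column is an assumption; the real data columns are not shown.
df = pd.DataFrame({
    "Datum WSM-09": [
        "6-3-2020 12:30",
        "6-3-2020 12:40",
        "18-3-2022 02:00",
        "18-3-2022 02:10",
    ],
    "value": [1.0, 3.0, 10.0, 20.0],
})

# An explicit format removes the day/month ambiguity.
df["timestamp"] = pd.to_datetime(df["Datum WSM-09"], format="%d-%m-%Y %H:%M")

# Daily mean of the numeric column; days without any rows are dropped.
df_daily_avg = (
    df.groupby(pd.Grouper(freq="D", key="timestamp"))["value"]
    .mean()
    .dropna()
)
print(df_daily_avg)
```

With an explicit format, a row that does not match the pattern should raise an error instead of being silently misread, which makes bad input easier to spot.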
CodePudding user response:
Try using the "dayfirst" option:
df['timestamp'] = pd.to_datetime(df['Datum WSM-09'], dayfirst=True)
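To see what the option changes, here is a small sketch that parses one ambiguous string from the question both ways. The variable names are made up for illustration.

```python
import pandas as pd

# One of the ambiguous strings from the question.
ambiguous = "6-3-2020 12:30"

# Default behaviour: the first number is taken as the month.
month_first = pd.to_datetime(ambiguous)

# With dayfirst=True the first number is taken as the day.
day_first = pd.to_datetime(ambiguous, dayfirst=True)

print(month_first)
print(day_first)
```

Note that the pandas documentation describes dayfirst=True as a preference rather than a strict rule, so an explicit format is the safer choice when the layout of the column is known.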