I have a regular dataframe with a time series. How can I select rows starting from a specific date? I tried to apply pd.to_datetime, but some dates come out with the month and day swapped. For example, 07-02-2022 becomes 2022-07-02, but it should be 2022-02-07.
Here is a snippet of what I have:
date infected_in_day
0 07-02-2022 15442.0
1 06-02-2022 18856.0
2 05-02-2022 22444.0
...
214 02-07-2021 6893.0
229 16-06-2021 5782.0
235 11-12-2020 40.0
236 09-12-2020 42.0
237 08-12-2020 41.0
I need to filter the data by the date 16-06-2021, that is, not display anything from before it. Like this:
date infected_in_day
0 07-02-2022 15442.0
1 06-02-2022 18856.0
2 05-02-2022 22444.0
...
214 02-07-2021 6893.0
229 16-06-2021 5782.0
Is there any way to do this without using pd.to_datetime? Or how do I do it right?
CodePudding user response:
It seems that you have a problem with the date column. If my assumption is right, I would parse the dates explicitly, in the format I expect to work with, something like this:
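import pandas as pd
from datetime import datetime
mydateparser = lambda x: datetime.strptime(x, "%d-%m-%Y")  # day-month-year, e.g. 07-02-2022
df = pd.read_csv("file.csv", sep='\t', parse_dates=['date'], date_parser=mydateparser)  # date_parser is deprecated since pandas 2.0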
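If the data is already loaded into a dataframe, a rough sketch of the same idea without re-reading the file could look like the following. It assumes the date column holds strings in day-month-year order, as in the snippet from the question. The small frame built here is only a stand-in with a few of those rows, and the names cutoff and filtered are made up for illustration.

```python
import pandas as pd

# Stand-in for the frame in the question (a few rows copied from the snippet).
df = pd.DataFrame({
    "date": ["07-02-2022", "06-02-2022", "02-07-2021", "16-06-2021", "11-12-2020"],
    "infected_in_day": [15442.0, 18856.0, 6893.0, 5782.0, 40.0],
})

# Passing an explicit format removes the day/month ambiguity,
# so 07-02-2022 is read as 7 February 2022.
df["date"] = pd.to_datetime(df["date"], format="%d-%m-%Y")

# Keep only the rows on or after the cutoff date.
cutoff = pd.Timestamp(2021, 6, 16)
filtered = df[df["date"] >= cutoff]
print(filtered)
```

Comparing real datetime values is what makes the filter reliable. Comparing the original strings would sort by day first and give wrong results.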