Restrict the output of posts from a specific date (pandas)

Time:02-11

I have a regular dataframe with a time series. How can I show only the rows from a specific date onward? I tried pd.to_datetime, but some dates come out with the month and day swapped. For example, 07-02-2022 becomes 2022-07-02, but it should be 2022-02-07.

Here is a snippet of what I have:

      date          infected_in_day
0     07-02-2022    15442.0
1     06-02-2022    18856.0
2     05-02-2022    22444.0
...
214   02-07-2021    6893.0
229   16-06-2021    5782.0
235   11-12-2020    40.0
236   09-12-2020    42.0
237   08-12-2020    41.0

I need to filter the data starting from the date 16-06-2021, that is, drop every row dated before it. Like this:

      date          infected_in_day
0     07-02-2022    15442.0
1     06-02-2022    18856.0
2     05-02-2022    22444.0
...
214   02-07-2021    6893.0
229   16-06-2021    5782.0

Is there any way to do this without using pd.to_datetime? Or what is the right way to do it?

CodePudding user response:

It seems that you have a problem with the date column. If my assumption is right, I would parse the dates explicitly in the format I expect to work with, something like this:

import pandas as pd
# The dates are day-month-year (e.g. 07-02-2022), so state the format explicitly.
df = pd.read_csv("file.csv", sep='\t')
df['date'] = pd.to_datetime(df['date'], format="%d-%m-%Y")
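
To cover the filtering part of the question, here is a minimal sketch. It assumes the dates are always day-month-year strings like the ones in the snippet. The sample rows are copied from the question, and the names parsed, cutoff and filtered are only illustrative. The swap described in the question is consistent with pandas guessing month-first when no format is given, so an explicit format (or dayfirst=True) should avoid it.

```python
import pandas as pd

# Sample rows copied from the question; the real data would come from the asker's file.
df = pd.DataFrame({
    "date": ["07-02-2022", "06-02-2022", "05-02-2022", "02-07-2021",
             "16-06-2021", "11-12-2020", "09-12-2020", "08-12-2020"],
    "infected_in_day": [15442.0, 18856.0, 22444.0, 6893.0, 5782.0, 40.0, 42.0, 41.0],
})

# Parse with an explicit day-month-year format so 07-02-2022 becomes 2022-02-07.
parsed = pd.to_datetime(df["date"], format="%d-%m-%Y")

# Keep only the rows dated on or after the cutoff (16 June 2021).
cutoff = pd.Timestamp(2021, 6, 16)
filtered = df[parsed >= cutoff]

print(filtered)
```

The parsed dates are kept in a separate Series here, so the date column still prints in its original dd-mm-yyyy form. Assigning parsed back to df["date"] would work just as well if real datetime values are wanted in the frame.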