Home > Blockchain >  How to take just one max date from a list of dates in one column in python
How to take just one max date from a list of dates in one column in python

Time:11-18

I have a data frame that has one column which might have more than one date values. I only want to keep max value among the list. FYI: row 5 has 2 dates in it and I only want one date(no time)

Data:

date    | calls | pres | pat
2021-08-17 13:20:23 | 1 | 1 | 1
2021-08-13 13:47:16 | 0 | 1 | 1 
2021-08-16 11:14:08 | 1 | 1 | 1
2021-08-10 20:35:33 | 1 | 1 | 1
2021-08-31 14:32:13 | 2021-08-31 14:30:40 | 0 | 2 | 1

I am trying this:

y = df_pvt['date']
for i in range(len(y)):
    try: 
        pick_date_i = y[i] = pd.to_datetime(y[i]) 
        pick_date_i = pick_date_i.max()
    except KeyError as e: 
        pass
    pick_date_i = pick_date_i.max()

But its not working, how can I correct it? I am expecting only one max date in one row.

CodePudding user response:

Idea is split values by |, remove possible spaces and convert to datetetimes, last use max:

print (df_pvt)
                                        date  calls  pres  pat
0                        2021-08-17 13:20:23      1     1    1
1                        2021-08-13 13:47:16      0     1    1
2                        2021-08-16 11:14:08      1     1    1
3                        2021-08-10 20:35:33      1     1    1
4  2021-08-31 14:32:13 | 2021-08-31 14:30:40      0     2    1

f = lambda x: pd.to_datetime([y.strip() for y in x.split('|')]).max()
df_pvt['date'] = df_pvt['date'].apply(f)

print (df_pvt)
                 date  calls  pres  pat
0 2021-08-17 13:20:23      1     1    1
1 2021-08-13 13:47:16      0     1    1
2 2021-08-16 11:14:08      1     1    1
3 2021-08-10 20:35:33      1     1    1
4 2021-08-31 14:32:13      0     2    1
  • Related