Calculate age between two dates in year where one column has single date and other column has list o

I have two columns: one holds a single date and the other holds a list of dates, which may also be an empty list. I want to calculate the age difference in years between the date in the first column and each date in the second column.

column1           column2                       result
11-01-2014        [1975-12-16, 1980-07-24]      [39,34]
20-11-2014        [1985-08-05, 1983-03-16]      [29,31]
26-12-2016        [1966-05-22, 1958-04-13]      [50,58]
20-05-2016        [1981-04-21, 1983-12-25]      [35,33]
01-01-2016        [1993-10-29, 1966-06-27]      [23,50]

column1 and column2 are the input, and I expect the output to look like the result column.

CodePudding user response：

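The expected output is the difference between calendar years (it does not check whether the birthday has already passed), so use DataFrame.explode to expand the lists in column2 into rows, subtract the years via Series.dt.year, and finally aggregate back into lists: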

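import pandas as pd

# column1 is written day-first (DD-MM-YYYY)
df['column1'] = pd.to_datetime(df['column1'], dayfirst=True)
# one row per date in column2; the original index value is repeated
df1 = df.explode('column2')
df1['column2'] = pd.to_datetime(df1['column2'])

# difference of calendar years
df1['result'] = df1['column1'].dt.year.sub(df1['column2'].dt.year)

# collect the exploded rows back into lists, one per original row
df = df1.groupby([df1.index, 'column1']).agg(list).reset_index(level=1)
print(df)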
     column1                                     column2    result
0 2014-01-11  [1975-12-16 00:00:00, 1980-07-24 00:00:00]  [39, 34]
1 2014-11-20  [1985-08-05 00:00:00, 1983-03-16 00:00:00]  [29, 31]
2 2016-12-26  [1966-05-22 00:00:00, 1958-04-13 00:00:00]  [50, 58]
3 2016-05-20  [1981-04-21 00:00:00, 1983-12-25 00:00:00]  [35, 33]
4 2016-01-01  [1993-10-29 00:00:00, 1966-06-27 00:00:00]  [23, 50]
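
The question also mentions that column2 can hold an empty list. DataFrame.explode turns an empty list into a missing value, so the year difference for that row becomes NaN. A minimal sketch of one way to get an empty list back instead, assuming a small made-up sample frame (the third row and the variable names exploded, years and result are only for illustration):

```python
import pandas as pd

# Made-up sample: the last row has an empty list in column2.
df = pd.DataFrame({
    "column1": ["11-01-2014", "20-11-2014", "26-12-2016"],
    "column2": [
        ["1975-12-16", "1980-07-24"],
        ["1985-08-05", "1983-03-16"],
        [],
    ],
})

df["column1"] = pd.to_datetime(df["column1"], dayfirst=True)

# One row per date; the empty list becomes a single NaN row.
exploded = df.explode("column2")
exploded["column2"] = pd.to_datetime(exploded["column2"])

# Year difference is NaN where column2 is missing (NaT).
years = exploded["column1"].dt.year - exploded["column2"].dt.year

# Drop the missing rows, restore integers, regroup by the original index.
result = years.dropna().astype(int).groupby(level=0).agg(list)

# Rows that had an empty list are absent from result; give them [].
df["result"] = result.reindex(df.index)
df["result"] = df["result"].apply(lambda v: v if isinstance(v, list) else [])

print(df)
```

Here column2 is left untouched and only the result column is added, which avoids the 00:00:00 timestamps shown in the output above.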

Or use a lambda function that converts each list to datetimes:

df['column1'] = pd.to_datetime(df['column1'], dayfirst=True)

# per row: year of column1 minus the year of every date in column2
f = lambda x: [x['column1'].year - y.year for y in pd.to_datetime(x['column2'])]
df['result'] = df.apply(f, axis=1)

print(df)
     column1                   column2    result
0 2014-01-11  [1975-12-16, 1980-07-24]  [39, 34]
1 2014-11-20  [1985-08-05, 1983-03-16]  [29, 31]
2 2016-12-26  [1966-05-22, 1958-04-13]  [50, 58]
3 2016-05-20  [1981-04-21, 1983-12-25]  [35, 33]
4 2016-01-01  [1993-10-29, 1966-06-27]  [23, 50]
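
Both solutions subtract calendar years, which is what the expected result in the question shows. If the age in completed years is wanted instead, the birthday has to be compared against the later date. This is a different calculation from the one above, so the numbers differ: a rough sketch, where age_in_years is a hypothetical helper and the sample rows are taken from the question plus one made-up row with an empty list:

```python
import pandas as pd


def age_in_years(ref, born):
    """Completed years between born and ref (hypothetical helper)."""
    # Subtract one year if the birthday has not yet occurred in ref's year.
    not_yet = (ref.month, ref.day) < (born.month, born.day)
    return ref.year - born.year - int(not_yet)


df = pd.DataFrame({
    "column1": ["11-01-2014", "20-05-2016", "01-01-2016"],
    "column2": [
        ["1975-12-16", "1980-07-24"],
        ["1981-04-21", "1983-12-25"],
        [],  # made-up row: an empty list gives an empty result
    ],
})

df["column1"] = pd.to_datetime(df["column1"], dayfirst=True)

# Same row-wise pattern as the lambda above, with the helper swapped in.
df["age"] = df.apply(
    lambda row: [
        age_in_years(row["column1"], born)
        for born in pd.to_datetime(row["column2"])
    ],
    axis=1,
)

print(df)
```

For the first row this gives [38, 33] rather than [39, 34], because neither birthday has been reached by 11 January 2014.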