Home > other >  I'm trying to create a new colm with the MAX Value from a Range of Months in the dataframe but
I'm trying to create a new colm with the MAX Value from a Range of Months in the dataframe but

Time:10-27

enter image description here

This is the data format date column shows the first entry date and i am trying to get the max of due missed in the first 3 months and i have data of 30 months and i need max of dues missed in the first 3 months for each id

CodePudding user response:

Considering your dataframe, as shown in your picture:

import pandas as pd
df = pd.DataFrame({
    'ID':[100,100,100,100,100,
          101,101,101,101, 101],
    'Entry Date':['2020-04-10','2020-05-10','2020-06-10','2020-07-10','2020-08-10',
                  '2020-07-25','2020-08-25','2020-09-25','2020-10-25','2020-11-25'],
    'Due missed':[0,0,7,0,5,
                  9,1,5,7,10]
})
df['Entry Date'] = df['Entry Date'].apply(pd.to_datetime)

What you will want to do is sort the dataframe so the first 3 months are at the top per ID:

df.sort_values(['ID', 'Entry Date'], inplace=True)

Then you can group by ID and select the top 3 rows (using head(3)), and select the maximum value of those three rows (using ['Due missed'].max()):

df.groupby('ID').apply(lambda x: x.head(3)['Due missed'].max())
  • Related