Home > Net >  Finding the min and max date from a timeseries range in pandas
Finding the min and max date from a timeseries range in pandas

Time:12-14

I have a dataframe laid out like the following with site names and a range of dates for each site.

Site       Date
Site_1     02/09/2011
Site_1     03/09/2011
Site_1     04/09/2011
Site_1     05/09/2011
Site_2     14/01/2010
Site_2     15/01/2010
Site_2     16/01/2010
Site_2     17/01/2010

I'd like to find the start and end dates for each site and lay the dataframe out like this:

Site       Start_Date     End_Date
Site_1     02/09/2011     04/09/2011
Site_2     14/01/2010     17/01/2010

I know I can find the min and max of the full range like this:

df['Date'].max()
df['Date'].min()

Just not sure what the best method is to apply this separately to each site.

CodePudding user response:

I would advise using a groupby on the "site" column and aggregating each group into a min and max date.

df.groupby("Site").agg({'date': ['min', 'max']})

This will return the min and max date for each site.

I haven't tried out the code, but it should do what you want.

CodePudding user response:

Using the advanced syntax of groupby agg you can directly get your custom format:

df.groupby('Site')['Date'].agg([('Start_Date', 'min'), ('End_Date', 'max')])

output:

        Start_Date    End_Date
Site                          
Site_1  02/09/2011  05/09/2011
Site_2  14/01/2010  17/01/2010
  • Related