Home > Enterprise >  Get the average of another column, given conditions in Pandas
Get the average of another column, given conditions in Pandas

Time:05-08

I have the following "source" dataframe

            Value
Date             
2022-01-01     26
2022-01-02     87
2022-01-03      4
2022-01-04     65
2022-01-05     89
2022-01-06     88
2022-01-07     82
2022-01-08     60
2022-01-09      8
2022-01-10     22

In another "output" dataframe, I have 4 columns defining 2 ranges of dates. As you can see, range1 is always defined, whilst range2 is defined only in certain rows:

  range1_lower range1_upper range2_lower range2_upper
0   2022-01-02   2022-01-03   2022-01-06   2022-01-07
1   2022-01-03   2022-01-06          NaN          NaN

In the output df I need to add an "average" column where, for each row, I get the average of the "Value" column of the source df, if the Date falls in either of the two ranges. This is something similar to the AVERAGEIFS function in Excel.

So, for example, for row 0 I need to get the average of all Values falling within range1 (2022-01-02 to 2022-01-03) or within range2 (2022-01-06 to 2022-01-07).

For row 1, since range2 is not defined, I only need to get the average of all Values falling within range1 (2022-01-03 to 2022-01-06).

This is the expected output:

  range1_lower range1_upper range2_lower range2_upper  average
0   2022-01-02   2022-01-03   2022-01-06   2022-01-07    65.25
1   2022-01-03   2022-01-06          NaN          NaN    61.50

Is there a way to do this? Thanks

CodePudding user response:

You can cross merge the two dataframes then filter out the dates that do not fall between the range limits then group by the range columns and aggregate Value with mean:

df = output.merge(source.reset_index(), how='cross')
df = df.query("range1_lower <= Date <= range1_upper or range2_lower <= Date <= range2_upper")
df = df.groupby([*output.columns], dropna=False)['Value'].mean().reset_index(name='avg')

print(df)

  range1_lower range1_upper range2_lower range2_upper    avg
0   2022-01-02   2022-01-03   2022-01-06   2022-01-07  65.25
1   2022-01-03   2022-01-06          NaT          NaT  61.50
  • Related