Pandas average row of dataframe based on range of column values-CodePudding

I have a dataframe that looks like the following

id       value
11         0
11         3
11         1
11         2
4          3
4          1
4          8
4          2
5          0
5          3
5          1
5          2
4          2
4          0
4          1
4          3
12         2
12         1
12         0
12         3

I'm hoping to get the mean of groups with ids in a range... i.e. mean of the first value of the first 0->5 group, and the first value of the 2nd 0->5 group, etc.

id       value
0-5       1.66
0-5       1.33
0-5       3.33
0-5       2.33
....

e.g. for 0 < id <= 5:

[3 0 2, 1 3 0, 8 1 1, 2 2 3]/3 = [1.66, 1.33, 3.33, 2.33]

CodePudding user response：

You could cut the ids into specific bins before groupby:

df["category"] = pd.cut(df["id"], [0,5,10,15], labels=["0-5","5-10","10-15"]).astype(str)
df["subgroup"] = df.groupby(df["id"].ne(df["id"].shift()).cumsum()).cumcount()
output = df.groupby(["category", "subgroup"],as_index=False)["value"].mean()

>>> output
  category  subgroup     value
0      0-5         0  1.666667
1      0-5         1  1.333333
2      0-5         2  3.333333
3      0-5         3  2.333333
4    10-15         0  1.000000
5    10-15         1  2.000000
6    10-15         2  0.500000
7    10-15         3  2.500000

CodePudding user response：

N = 5
df2 = pd.DataFrame(df.groupby("id").agg({'value': list}).apply(lambda x: [x.name//(N   1), x.value], axis=1).values.tolist()).groupby(0).agg({1:list})
df2["mini"] = df2.apply(lambda x: min([len(y) for y in x[1]]), axis=1)
df2 = df2.apply(lambda x: np.array([item for sublist in x[1] for item in sublist]).reshape(-1,x["mini"]).T.mean(axis=1), axis=1).explode()
df2 = pd.DataFrame(df2.values, index=df2.index)
df2.index = df2.apply(lambda x: str(x.name * N)   "-"   str((x.name   1) * N), axis=1)
df2.rename(columns={0: 'value'}, inplace=True)
df2

output:

    value
0-5 1.666667
0-5 1.333333
0-5 3.333333
0-5 2.333333
5-10    0.0
5-10    3.0
5-10    1.0
5-10    2.0
10-15   2.0
10-15   1.0
10-15   0.0
10-15   3.0