Group values for groupby().mean()-CodePudding

Have a Dataframe:

Looking to create 2 groups from Column_A, and find the average of those groups in Column_B:

So first group might be 1, 2 and 3, second group 4 and 5.

I get the basics behind groupby()

df.groupby("Column_A")["Column_B"].mean()

and calling certain values in columns

df[df["Column_A"] == 1].groupby()[].mean()

But is there a way to include the group of (1, 2 and 3) and (4, 5) from Column_A? Somehow doing:

[["Column_A"] == 1, 2, 3].groupby(Column_B).mean()

And:

[["Column_A"] == 4, 5].groupby(Column_B).mean()

Thanks in advance

CodePudding user response：

You can combine cut to bin the first column, then groupby.mean:

(df.groupby(pd.cut(df['Column_A'], [0,3,5], labels=['1-3', '4-5']))
   ['Column_B'].mean()
 )

Output:

Column_A
1-3    53.285714
4-5    60.500000
Name: Column_B, dtype: float64

CodePudding user response：

df[df["Column_A"] <= 3].groupby("Column_A")["Column_B"].mean()
df[df["Column_A"] > 3].groupby("Column_A")["Column_B"].mean()

if Column_A is not numeric, use isin

CodePudding user response：

Using isin if you only want a specific values group (1,2,3 and 4,5)

df[df["Column_A"].isin([1,2,3])].groupby("Column_A")["Column_B"].mean()
df[df["Column_A"].isin([4,5])].groupby("Column_A")["Column_B"].mean()

CodePudding user response：

Looking to create 2 groups from Column_A, and find the average of those groups in Column_B

You can use a condition in groupby for the items in [1, 2, 3] vs. not in the list i.e. [4, 5].

df.groupby(df['Column_A'].isin([1, 2, 3]))['Column_B'].mean()

Output:
Column_A
False    60.500000
True     53.285714
Name: Column_B, dtype: float64