Have a Dataframe:
Column_A | Column_B |
---|---|
1 | 20 |
2 | 25 |
1 | 52 |
2 | 22 |
4 | 67 |
1 | 34 |
3 | 112 |
5 | 55 |
4 | 33 |
5 | 87 |
1 | 108 |
Looking to create 2 groups from Column_A, and find the average of those groups in Column_B:
So first group might be 1, 2 and 3, second group 4 and 5.
I get the basics behind groupby()
df.groupby("Column_A")["Column_B"].mean()
and calling certain values in columns
df[df["Column_A"] == 1].groupby()[].mean()
But is there a way to include the group of (1, 2 and 3) and (4, 5) from Column_A? Somehow doing:
[["Column_A"] == 1, 2, 3].groupby(Column_B).mean()
And:
[["Column_A"] == 4, 5].groupby(Column_B).mean()
Thanks in advance
CodePudding user response:
You can combine cut
to bin the first column, then groupby.mean
:
(df.groupby(pd.cut(df['Column_A'], [0,3,5], labels=['1-3', '4-5']))
['Column_B'].mean()
)
Output:
Column_A
1-3 53.285714
4-5 60.500000
Name: Column_B, dtype: float64
CodePudding user response:
df[df["Column_A"] <= 3].groupby("Column_A")["Column_B"].mean()
df[df["Column_A"] > 3].groupby("Column_A")["Column_B"].mean()
if Column_A
is not numeric, use isin
CodePudding user response:
Using isin
if you only want a specific values group (1,2,3 and 4,5)
df[df["Column_A"].isin([1,2,3])].groupby("Column_A")["Column_B"].mean()
df[df["Column_A"].isin([4,5])].groupby("Column_A")["Column_B"].mean()
CodePudding user response:
Looking to create 2 groups from Column_A, and find the average of those groups in Column_B
You can use a condition in groupby for the items in [1, 2, 3] vs. not in the list i.e. [4, 5].
df.groupby(df['Column_A'].isin([1, 2, 3]))['Column_B'].mean()
Output:
Column_A
False 60.500000
True 53.285714
Name: Column_B, dtype: float64