Home > Back-end >  How do I return a value based on another column per group in pandas
How do I return a value based on another column per group in pandas

Time:05-04

I have a pandas dataframe such as:

group month value
1 1 2
1 2 2
1 3 3
2 1 7
2 2 8
2 3 8
3 1 9
3 2 0
3 3 1

And I want to calculate a new column ('want' in the below) equal to the value where month == 2, per group, as shown below:

group month value want
1 1 2 2
1 2 2 2
1 3 3 2
2 1 7 8
2 2 8 8
2 3 8 8
3 1 9 0
3 2 0 0
3 3 1 0

Anyone able to help?

CodePudding user response:

You can use map:

df['want'] = df['group'].map(df[df['month'] == 2].set_index('group')['value'])
print(df)

# Output
   group  month  value  want
0      1      1      2     2
1      1      2      2     2
2      1      3      3     2
3      2      1      7     8
4      2      2      8     8
5      2      3      8     8
6      3      1      9     0
7      3      2      0     0
8      3      3      1     0

CodePudding user response:

Guess I could just create a groupby df (groupby group where mth == 2) then merge back to it.

Will just go with that instead of attempting to do via a groupby.apply route!

CodePudding user response:

You can try this,

grouped_df = df.groupby("group")
groups = []
wants = []

for group in df["group"].unique():
  group_df = grouped_df.get_group(group)
  want = group_df.loc[group_df["month"] == 2, "value"].values[0]
  [wants.append(want) for i in range(group_df.shape[0])]

df["wants"] = wants

Output -

group month value wants
0 1 1 2 2
1 1 2 2 2
2 1 3 3 2
3 2 1 7 8
4 2 2 8 8
5 2 3 8 8
6 3 1 9 0
7 3 2 0 0
8 3 3 1 0

CodePudding user response:

You can use:

df['want'] = df['value'].where(df['month']==2).groupby(df['group']).transform('mean').astype(int)

Result

print(df)

     group  month   value   want
 0      1       1       2      2
 1      1       2       2      2
 2      1       3       3      2
 3      2       1       7      8
 4      2       2       8      8
 5      2       3       8      8
 6      3       1       9      0
 7      3       2       0      0
 8      3       3       1      0

In a case whereby for a certain group we have more than one observation for month==2, it will fill want with the mean value of those observations where month==2 for that group.

For example: Given this new dataframe(df_new)

group month value
0 1 1 2
1 1 2 2
2 1 2 4
3 1 3 3
4 2 1 7
5 2 2 8
6 2 3 8
7 3 1 9
8 3 2 0
9 3 3 1

Running

df_new['want'] = df_new['value'].where(df_mean['month']==2).groupby(df_new['group']).transform('mean').astype(int)

Result

print(df_new)

    group   month   value   want
0       1       1       2      3
1       1       2       2      3
2       1       2       4      3
3       1       3       3      3
4       2       1       7      8
5       2       2       8      8
6       2       3       8      8
7       3       1       9      0
8       3       2       0      0
9       3       3       1      0
  • Related