I have a data frame data
data_ = {'ID': [777, 777, 777,777,777,777],'Month':[1,1,1,2,2,2], 'Salary': [130,170,50,140,180,60], 'O': ["AC","BR","BR","AC","BR","BR"], 'D':["LF","AC","LF","LF","AC","LF"], 'B':[True,True,False,True,True,False]}
data = pd.DataFrame(data=data_)
for each subgroup of this data frame:
Subgroup=data.groupby(["ID","Month"])
I would like to add a new column NEW_Salary
filled with the values of Salary
where B is false
in each subgroup as show in the picture below. I don't know exactly how I can do that
CodePudding user response:
Filter by B False before groupby then join with your initial dataframe
Subgroup = data[~data['B']] \
.groupby(["ID","Month"]) \
.agg(NEW_SALARY = ('Salary', 'mean')) \
.reset_index()
final_df = data.merge(Subgroup, on=['ID', 'Month'])
final_df
Output:
CodePudding user response:
Let us try transform
with mask
df['New_Salary'] = df['Salary'].mask(df['B']).groupby([df['ID'],df['Month']]).transform('max')
df
Out[255]:
ID Month Salary O D B New_Salary
0 777 1 130 AC LF True 50.0
1 777 1 170 BR AC True 50.0
2 777 1 50 BR LF False 50.0
3 777 2 140 AC LF True 60.0
4 777 2 180 BR AC True 60.0
5 777 2 60 BR LF False 60.0
CodePudding user response:
If possible multiple False
values per groups and need aggregate them, e.g. by sum
create helper column new
filled by 0
if no match and then use GroupBy.transform
for new column:
data['New_Salary'] = (data.assign(new=data['Salary'].mask(data['B'], 0))
.groupby(["ID","Month"])['new']
.transform('sum'))
print (data)
ID Month Salary O D B New_Salary
0 777 1 130 AC LF True 50
1 777 1 170 BR AC True 50
2 777 1 50 BR LF False 50
3 777 2 140 AC LF True 60
4 777 2 180 BR AC True 60
5 777 2 60 BR LF False 60
If there is only one False
value per groups is possible use DataFrame.join
with MultiIndex Series filtered only False
rows:
s = data.loc[~data['B']].set_index(["ID","Month"])['Salary'].rename('New_Salary')
data = (data.join(s, on=['ID','Month']))
print (data)
ID Month Salary O D B New_Salary
0 777 1 130 AC LF True 50
1 777 1 170 BR AC True 50
2 777 1 50 BR LF False 50
3 777 2 140 AC LF True 60
4 777 2 180 BR AC True 60
5 777 2 60 BR LF False 60