I have a data frame like this:
Group Col A Col B
1 A1 B1
1 A2 B2
2 A3 B3
2 A4 B4
I want to create new columns Per A and Per B that respect the groups, so the outcome should be
Group Col A Col B Per A Per B
1 A1 B1 100*A1/(A1+A2) 100*B1/(B1+B2)
1 A2 B2 100*A2/(A1+A2) 100*B2/(B1+B2)
2 A3 B3 100*A3/(A3+A4) 100*B3/(B3+B4)
2 A4 B4 100*A4/(A3+A4) 100*B4/(B3+B4)
I need a general solution that works for several groups, each with a different group size.
I tried using a for loop, and while I am able to compute the columns, I cannot assign them to the data frame. I don't understand what exactly prevents it.
For example, this is a result I'd be looking for (note how I changed the Group column to reflect "different group sizes"):
Group Col A Col B Per A Per B
1 1 2 100.0 100.0
2 1 2 16.67 25.00
2 2 2 33.33 25.00
2 3 4 50.00 50.00
CodePudding user response:
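import pandas as pd

df = pd.DataFrame({
    'Group': [1, 1, 2, 2, 3, 3, 3],
    'ColA': [1, 2, 3, 4, 5, 6, 7],
    'ColB': [10, 22, 30, 40, 50, 60, 70],
})
# Per-group sums; selecting a list of columns needs double brackets in current pandas.
sums = df.groupby('Group')[['ColA', 'ColB']].sum().reset_index()
# After the merge, the original columns get the suffix _x and the group sums get _y.
df = df.merge(sums, on='Group')
df['PerA'] = df['ColA_x'] * 100 / df['ColA_y']
df['PerB'] = df['ColB_x'] * 100 / df['ColB_y']
# Restore the original column names and drop the helper sum columns.
df = df.rename(
    columns={'ColA_x': 'ColA', 'ColB_x': 'ColB'}).drop(
    columns=['ColA_y', 'ColB_y'])
print(df)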
output:
Group ColA ColB PerA PerB
0 1 1 10 33.333333 31.250000
1 1 2 22 66.666667 68.750000
2 2 3 30 42.857143 42.857143
3 2 4 40 57.142857 57.142857
4 3 5 50 27.777778 27.777778
5 3 6 60 33.333333 33.333333
6 3 7 70 38.888889 38.888889
CodePudding user response:
Group by Group and then sum. That gives you the column sum per group.
Set Group as the index and then divide by the result above. Because both frames are indexed by Group, each row is divided only by the sums of its own group. Code below:
df.set_index('Group').div(df.groupby('Group').sum())*100
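For reference, here is a minimal runnable sketch of the index-aligned division described above. It assumes the sample frame from the first answer and the capitalised column name Group from the question; the names group_sums and percent are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Group': [1, 1, 2, 2, 3, 3, 3],
    'ColA': [1, 2, 3, 4, 5, 6, 7],
    'ColB': [10, 22, 30, 40, 50, 60, 70],
})

# Column sums per group, indexed by Group (one row per group).
group_sums = df.groupby('Group').sum()

# With Group as the index on both sides, div() matches each row
# with the sums of its own group before dividing.
percent = df.set_index('Group').div(group_sums) * 100

print(percent.round(2))
```

The result is indexed by Group rather than by the original row labels, so it still has to be attached back to the original frame if the percentages are wanted as extra columns.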
CodePudding user response:
Try groupby transform with update:
df.update(df.div(df.groupby('Group').transform('sum'))*100)
df
Out[478]:
Group ColA ColB
0 1 33.333333 31.250000
1 1 66.666667 68.750000
2 2 42.857143 42.857143
3 2 57.142857 57.142857
4 3 27.777778 27.777778
5 3 33.333333 33.333333
6 3 38.888889 38.888889
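Note that update overwrites ColA and ColB, while the question asks for separate Per A and Per B columns. A minimal sketch of the same transform idea that keeps the originals, using the numeric example from the question (the names value_cols, totals and percent are only illustrative):

```python
import pandas as pd

# Numeric example from the question: group 1 has one row, group 2 has three.
df = pd.DataFrame({
    'Group': [1, 2, 2, 2],
    'ColA': [1, 1, 2, 3],
    'ColB': [2, 2, 2, 4],
})

value_cols = ['ColA', 'ColB']

# transform('sum') repeats each group's total on every row of that group,
# so the result shares df's index and can be divided row by row.
totals = df.groupby('Group')[value_cols].transform('sum')
percent = df[value_cols].div(totals).mul(100)

# Store the percentages in new columns instead of overwriting the originals.
df['PerA'] = percent['ColA']
df['PerB'] = percent['ColB']

print(df.round(2))
```

Because transform keeps the original index, this should work for any number of groups and any group sizes without sorting or merging.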