I can't find a similar example that shows how to do this in Python. I have a dataset that looks like this:
ID Capacity
A 50
A 50
A 50
B 30
B 30
B 30
C 100
C 100
C 100
I need each row's "Capacity" as a fraction of the total capacity, where each ID is counted only once (50 + 30 + 100 = 180). So, the answer looks like this:
ID Capacity Percent_Capacity
A 50 0.2777
A 50 0.2777
A 50 0.2777
B 30 0.1666
B 30 0.1666
B 30 0.1666
C 100 0.5555
C 100 0.5555
C 100 0.5555
Thank you. I'm still learning Python.
CodePudding user response:
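# Take one Capacity value per ID, then sum them: 50 + 30 + 100 = 180
total = df.groupby('ID')['Capacity'].first().sum()
# Divide every row's Capacity by that total
df['percent_capacity'] = df['Capacity'] / total
df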
ID Capacity percent_capacity
0 A 50 0.277778
1 A 50 0.277778
2 A 50 0.277778
3 B 30 0.166667
4 B 30 0.166667
5 B 30 0.166667
6 C 100 0.555556
7 C 100 0.555556
8 C 100 0.555556
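For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame construction is my own reconstruction of the sample data in the question, not something the asker posted, and the approach assumes every row of a given ID carries the same Capacity (which is what makes taking the first value per group safe).

```python
import pandas as pd

# Sample data rebuilt from the question (assumed layout: 3 rows per ID)
df = pd.DataFrame({
    "ID": ["A"] * 3 + ["B"] * 3 + ["C"] * 3,
    "Capacity": [50] * 3 + [30] * 3 + [100] * 3,
})

# One Capacity per ID (the first row of each group), summed once per ID
total = df.groupby("ID")["Capacity"].first().sum()

# Each row's share of that per-ID total
df["percent_capacity"] = df["Capacity"] / total
print(df)
```

If Capacity could differ between rows of the same ID, first() would silently pick whichever row comes first, so it is worth checking that assumption on real data.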
CodePudding user response:
Using drop_duplicates:
df['percent_capacity'] = df['Capacity']/df.drop_duplicates(subset='ID')['Capacity'].sum()
Output:
ID Capacity percent_capacity
0 A 50 0.277778
1 A 50 0.277778
2 A 50 0.277778
3 B 30 0.166667
4 B 30 0.166667
5 B 30 0.166667
6 C 100 0.555556
7 C 100 0.555556
8 C 100 0.555556
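Keeping one row per ID only gives a meaningful total if Capacity is constant within each ID. The sketch below adds a quick check for that before computing the shares. The sample DataFrame and the variable names (distinct_per_id, share_sum) are my own, made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed layout: 3 rows per ID)
df = pd.DataFrame({
    "ID": ["A"] * 3 + ["B"] * 3 + ["C"] * 3,
    "Capacity": [50] * 3 + [30] * 3 + [100] * 3,
})

# Count distinct Capacity values per ID; anything above 1 breaks the assumption
distinct_per_id = df.groupby("ID")["Capacity"].nunique()
if not (distinct_per_id == 1).all():
    raise ValueError("Capacity varies within at least one ID")

# Keep the first row of each ID and sum its Capacity
total = df.drop_duplicates(subset="ID")["Capacity"].sum()
df["percent_capacity"] = df["Capacity"] / total

# Sanity check: taking one share per ID, the shares should add up to about 1
share_sum = df.drop_duplicates(subset="ID")["percent_capacity"].sum()
print(total, share_sum)
```

Raising an error is just one way to handle a violation. Depending on the data, aggregating with max() or mean() per ID might be more appropriate.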