Calculate Percent of Groupby Variable to Sum Column

I can't find a similar example that shows how to do this in Python. I have a dataset that looks like this:

ID    Capacity
A     50
A     50
A     50
B     30
B     30
B     30
C    100
C    100
C    100
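To reproduce the sample data, something like the following should work. It is a minimal sketch that assumes pandas is installed and that the frame is named `df`, which is the name the answers use; the question itself does not show how the data was loaded.

```python
import pandas as pd

# Sample data copied from the question; the variable name `df` is an assumption
df = pd.DataFrame({
    "ID": ["A"] * 3 + ["B"] * 3 + ["C"] * 3,
    "Capacity": [50] * 3 + [30] * 3 + [100] * 3,
})
print(df)
```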

I need each ID's share of the total "Capacity", where each ID's capacity is counted only once in the total (50 + 30 + 100 = 180, not the sum of all nine rows). So, the answer looks like this:

ID    Capacity   Percent_Capacity
A     50         0.2777
A     50         0.2777
A     50         0.2777
B     30         0.1666
B     30         0.1666
B     30         0.1666
C    100         0.5555
C    100         0.5555
C    100         0.5555
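The arithmetic behind the expected column can be checked without pandas. This is only an illustration; the dictionary name `capacity_per_id` is made up for the example, and it assumes each ID has a single capacity value, as in the sample data.

```python
# One capacity value per ID, as given in the question
capacity_per_id = {"A": 50, "B": 30, "C": 100}

# The total counts each ID once: 50 + 30 + 100 = 180
total = sum(capacity_per_id.values())

# Each ID's share of that total
shares = {id_: cap / total for id_, cap in capacity_per_id.items()}
for id_, share in shares.items():
    print(id_, round(share, 4))
```

Rounded to four places these come out as 0.2778, 0.1667 and 0.5556; the figures in the table above appear to be truncated rather than rounded.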

Thank you. I'm still learning Python.

CodePudding user response:

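# Take one Capacity value per ID, then sum them so each ID counts once
total = df.groupby('ID')['Capacity'].first().sum()
# Divide every row's Capacity by that total
df['percent_capacity'] = df['Capacity'] / total
df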

    ID  Capacity    percent_capacity
0   A         50    0.277778
1   A         50    0.277778
2   A         50    0.277778
3   B         30    0.166667
4   B         30    0.166667
5   B         30    0.166667
6   C        100    0.555556
7   C        100    0.555556
8   C        100    0.555556
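For a version that runs end to end, the same idea could be wrapped in a small helper. This is a sketch, not the answerer's code: the function name `add_percent_capacity` is hypothetical, and it assumes every row of a given ID carries the same Capacity, as in the sample data.

```python
import pandas as pd


def add_percent_capacity(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a percent_capacity column.

    Assumes every row of a given ID carries the same Capacity.
    """
    # First Capacity per ID, summed, so each ID is counted once
    total = df.groupby("ID")["Capacity"].first().sum()
    out = df.copy()  # leave the caller's frame untouched
    out["percent_capacity"] = out["Capacity"] / total
    return out


# Sample data from the question
df = pd.DataFrame({
    "ID": list("AAABBBCCC"),
    "Capacity": [50, 50, 50, 30, 30, 30, 100, 100, 100],
})
result = add_percent_capacity(df)
print(result)
```

Returning a copy is a design choice for the sketch; assigning the column on `df` directly, as the answer does, works just as well.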

CodePudding user response:

Using drop_duplicates:

# Keep one row per ID, sum those capacities, and divide each row by that total
df['percent_capacity'] = df['Capacity'] / df.drop_duplicates(subset='ID')['Capacity'].sum()

Output:

  ID  Capacity  percent_capacity
0  A        50          0.277778
1  A        50          0.277778
2  A        50          0.277778
3  B        30          0.166667
4  B        30          0.166667
5  B        30          0.166667
6  C       100          0.555556
7  C       100          0.555556
8  C       100          0.555556
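As a quick check that the `drop_duplicates` total matches the `groupby(...).first()` total from the earlier answer, here is a self-contained sketch on the question's sample data. The names `total_dedup` and `total_first` are made up for the comparison.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "ID": list("AAABBBCCC"),
    "Capacity": [50, 50, 50, 30, 30, 30, 100, 100, 100],
})

# Total from the first row kept for each ID
total_dedup = df.drop_duplicates(subset="ID")["Capacity"].sum()
# Total from groupby, as in the earlier answer
total_first = df.groupby("ID")["Capacity"].first().sum()

df["percent_capacity"] = df["Capacity"] / total_dedup
print(total_dedup, total_first)
print(df)
```

On this data both totals are 180. They can differ when Capacity has missing values: `groupby(...).first()` takes the first non-null value in each group, while `drop_duplicates` keeps the first row as it is.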