This is a basic question and easy to do in Excel, but I have no idea how to do it in Python, and every example online uses groupby with multiple names in the name column. All I need is a column of weights, one per row, computed from a single column of values. Suppose I have data that looks like this:
name value
0 A 45
1 B 76
2 C 320
3 D 210
The answer should look like this:
name value weights
0 A 45 0.069124
1 B 76 0.116743
2 C 320 0.491551
3 D 210 0.322581
Thank you.
CodePudding user response:
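Use GroupBy.transform to repeat each group's sum on every row of that group, so the original column can be divided by it. In the output below every row has the name A, so the group sum equals the column total: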
print (df.groupby('name')['value'].transform('sum'))
0 651
1 651
2 651
3 651
Name: value, dtype: int64
df['weights'] = df['value'].div(df.groupby('name')['value'].transform('sum'))
print (df)
name value weights
0 A 45 0.069124
1 A 76 0.116743
2 A 320 0.491551
3 A 210 0.322581
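EDIT: For the updated data, where every name is different, divide by the sum of the whole column instead: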
df['weights'] = df['value'].div(df['value'].sum())
print (df)
name value weights
0 A 45 0.069124
1 B 76 0.116743
2 C 320 0.491551
3 D 210 0.322581
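For anyone who wants to run the whole-column version end to end, here is a minimal self-contained sketch. It assumes the DataFrame is built by hand from the four rows in the question; the variable name df is only a placeholder.

```python
import pandas as pd

# Rebuild the sample data from the question (assumed to be the full data set).
df = pd.DataFrame({"name": ["A", "B", "C", "D"],
                   "value": [45, 76, 320, 210]})

# Each weight is the row's value divided by the total of the column.
df["weights"] = df["value"] / df["value"].sum()

print(df)
```

The weights add up to 1, which is a quick sanity check that the division used the right total.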
CodePudding user response:
You can also groupby 'name' and then apply a function that divides each value by its group sum:
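# group_keys=False keeps the original row index so the result aligns with df (needed in pandas 2.0 and later)
df['weights'] = df.groupby('name', group_keys=False)['value'].apply(lambda x: x / x.sum())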
Output:
name value weights
0 A 45 0.069124
1 A 76 0.116743
2 A 320 0.491551
3 A 210 0.322581
For the new data, where each name is unique, no grouping is needed:
df['weights'] = df['value'] / df['value'].sum()
name value weights
0 A 45 0.069124
1 B 76 0.116743
2 C 320 0.491551
3 D 210 0.322581
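To see when the grouped version actually differs from the plain column sum, here is a small sketch with made-up data in which two names each appear twice. The rows and the variable name grouped_df are hypothetical and not from the question.

```python
import pandas as pd

# Hypothetical data: the values from the question, but with repeated names.
grouped_df = pd.DataFrame({"name": ["A", "A", "B", "B"],
                           "value": [45, 76, 320, 210]})

# transform('sum') returns one value per row: the sum of that row's group.
group_totals = grouped_df.groupby("name")["value"].transform("sum")

# Dividing by the per-row group total gives weights within each group.
grouped_df["weights"] = grouped_df["value"] / group_totals

print(grouped_df)
```

Here the weights sum to 1 within each name rather than over the whole column, which is why the groupby answers only match the expected output when all rows share one name.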