I have an investment portfolio where each security is given a percentage allocation. I want to use pandas to group the securities into Equity and Fixed Income.
df = pd.DataFrame()
invest_equity = 'Invest - Equity'
invest_fixedincome = 'Invest - Fixed Income'
df['Name'] = ['Invest - Equity ','Invest - Equity', 'Invest - Fixed Income', 'Invest - Fixed Income']
df['Allocation %'] = ['50','10','10','30']
So far I have been able to sum the groups and store the variable:
x = df[df['Name']==invest_equity]['Allocation %'].sum()
y = df[df['Name']==invest_fixedincome]['Allocation %'].sum()
From here I do not know how to get the desired result. I have tried the following which doesn't give an error but also does not update the excel sheet.
df[df['Name']==invest_fixedincome]['Allocation %'].div(x).mul(100)
The end result should be:
df['Allocation %'] = ['83.33','16.67','25','75']
CodePudding user response:
from a quick look, it looks like you're storing "Allocation %" as a string instead of an int or a float. So that may be the reason why computations aren't working.
Try
df['Allocation %'] = [50, 10, 10, 30]
without the ' '
CodePudding user response:
You can try
df['Allocation %'] = (df.assign(**{'Allocation %': df['Allocation %'].astype(int),
'Name': df['Name'].str.strip()})
.groupby('Name')['Allocation %'].transform(lambda col: col / col.sum())
.mul(100).round(2))
print(df)
Name Allocation %
0 Invest - Equity 83.33
1 Invest - Equity 16.67
2 Invest - Fixed Income 25.00
3 Invest - Fixed Income 75.00