Home > Software engineering >  How to multiply each row in dataframe by a different column of different dataframe and get sum of al
How to multiply each row in dataframe by a different column of different dataframe and get sum of al

Time:10-10

I am trying to multiply each row of a dataframe by values of a column in another dataframe ,

For example if I have the following dataframe:

df = pd.DataFrame({
     'FR': [4.0405, 4.0963, 4.3149],
     'GR': [1.7246, 1.7482, 1.8519],
     'IT': [804.74, 810.01, 860.13],
     'F':  [8.4, 10.01, 60.3]},
     index=['1980-01-01', '1980-02-01', '1980-03-01'])
df_ret = df.pct_change()

df2 = pd.DataFrame({'symbol':['FR','GR','IT','F'],
                    'weight' : [0.2,0.3,0.1,0.4]})

I want to multiply each element of each row in df_ret to df2[‘weight’]

Then find the sum of each row and populate in a new column as df[‘port’]

I have tried:

df_ret.mul(df2.weight)

but I got a larger dataframe with all values as NaN

any help is appreciated, thanks.

CodePudding user response:

We can take advantage of index alignment by setting the index (set_index) of df2 to symbol and multiplying df_ret by the newly indexed df2['weight'] aligning Series index with df_ret columns. Then we can sum across the rows (axis=1) with DataFrame.sum:

df_ret = df_ret.mul(df2.set_index('symbol')['weight'])
df_ret['Port'] = df_ret.sum(axis=1)

df_ret:

                  FR        GR        IT         F      Port
1980-01-01       NaN       NaN       NaN       NaN  0.000000
1980-02-01  0.002762  0.004105  0.000655  0.076667  0.084189
1980-03-01  0.010673  0.017795  0.006188  2.009590  2.044246

We can also set skipna=False if we want NaN in the first row instead of 0:

df_ret['Port'] = df_ret.sum(axis=1, skipna=False)
                  FR        GR        IT         F      Port
1980-01-01       NaN       NaN       NaN       NaN       NaN
1980-02-01  0.002762  0.004105  0.000655  0.076667  0.084189
1980-03-01  0.010673  0.017795  0.006188  2.009590  2.044246
  • Related