I have a dataframe:
ID 2000-01 2000-02 2000-03 2000-04 2000-05 val
1 2847 2861 2875 2890 2904 94717
2 1338 1343 1348 1353 1358 70105
3 3301 3311 3321 3331 3341 60307
4 1425 1422 1419 1416 1413 79888
I want to add a new row to the table with the sumproduct formula (excel) =sumproduct(array $val, array 2000-xx). The first value in the new row is computed as 2847x94717 1338x70105 3301x60307 1425x79888 = 676373596 (in Excel terms, B2xG2 B3xG3 B4xG4 B5xG5)
Output:
ID 2000-01 2000-02 2000-03 2000-04 2000-05 val
1 2847 2861 2875 2890 2904 94717
2 1338 1343 1348 1353 1358 70105
3 3301 3311 3321 3331 3341 60307
4 1425 1422 1419 1416 1413 79888
5 676373596 678413565 680453534 682588220 684628189
How do I go about this?
CodePudding user response:
You could do the dot product @
and merge
back to the original dataframe:
df.merge(pd.DataFrame(df.iloc[:,1:-1].T @ df['val']).T, how='outer')
ID 2000-01 2000-02 2000-03 2000-04 2000-05 val
0 1.0 2847 2861 2875 2890 2904 94717.0
1 2.0 1338 1343 1348 1353 1358 70105.0
2 3.0 3301 3311 3321 3331 3341 60307.0
3 4.0 1425 1422 1419 1416 1413 79888.0
4 NaN 676373596 678413565 680453534 682588220 684628189 NaN
CodePudding user response:
You can do this, assuming ID is not in the index:
df.loc[5, :] = df.iloc[:,1:-1].mul(df['val'], axis=0).sum()
Output:
ID 2000-01 2000-02 2000-03 2000-04 2000-05 val
0 1.0 2847.0 2861.0 2875.0 2890.0 2904.0 94717.0
1 2.0 1338.0 1343.0 1348.0 1353.0 1358.0 70105.0
2 3.0 3301.0 3311.0 3321.0 3331.0 3341.0 60307.0
3 4.0 1425.0 1422.0 1419.0 1416.0 1413.0 79888.0
5 NaN 676373596.0 678413565.0 680453534.0 682588220.0 684628189.0 NaN
Use pandas.DataFrame.mul
with axis=0 then sum and let pandas intrinsic data alignment put the values in the correct column based on indexing.
CodePudding user response:
You can do:
row = [sum([v[0]*v[1] for v in list(zip(df[col],df['val']))]) for col in df.columns.drop(['ID','val'])]
row.insert(0, len(df) 1)
row.insert(len(row), 0)
df.loc[len(df) 1] = row
df.loc[len(df),'val'] = ''
CodePudding user response:
Other options for the same result
columns_to_multiply = df.columns.drop(['ID', 'val'])
df1 = df.copy()
for x in columns_to_multiply:
df1[x] *= df1['val']
prod_sum_list = [len(df)] df1[columns_to_multiply].sum().tolist() [np.nan]
df.loc[len(df.index)] = prod_sum_list
df