Say I have 2 dataframes:
df1
id guid name item1 item2 item3 item4 item5 item6 item7 item8 item9
0 3031958124 85558-261955282 Alonso 85558-57439 85558-54608 85558-91361 85558-40647 85558-41305 85558-79979 85558-33076 85558-89956 85558-12554
1 3031958127 85558-261955282 Jeff 85558-57439 85558-39280 85558-91361 85558-55987 85558-83083 85558-79979 85558-33076 85558-41872 85558-12554
2 3031958129 85558-261955282 Mike 85558-57439 85558-39280 85558-91361 85558-55987 85558-40647 85558-79979 85558-33076 85558-88297 85558-12534
...
df2 where item_lookup
is the index
item_type cost value target
item_lookup
85558-57439 item1 9500 25.1 1.9
85558-54608 item2 8000 18.7 0.0
85558-91361 item3 7000 16.5 0.9
...
I want to add the sum of cost
, value
, and target
for each item1 through item9 using item_lookup (df2
) and store that as a column on df1.
So the result should look like: df1
id guid name item1 item2 item3 item4 item5 item6 item7 item8 item9 cost value target
0 3031958124 85558-261955282 Alonso 85558-57439 85558-54608 85558-91361 85558-40647 85558-41305 85558-79979 85558-33076 85558-89956 85558-12554 58000 192.5 38.3
1 3031958127 85558-261955282 Jeff 85558-57439 85558-39280 85558-91361 85558-55987 85558-83083 85558-79979 85558-33076 85558-41872 85558-12554 59400 183.2 87.7
2 3031958129 85558-261955282 Mike 85558-57439 85558-39280 85558-91361 85558-55987 85558-40647 85558-79979 85558-33076 85558-88297 85558-12534 58000 101.5 18.1
...
I've tried following similar solutions online that use .map
, however these examples are only for single columns whereas I am trying to sum values for 9 columns.
CodePudding user response:
You can do this by using df.apply
, basically looping through the rows and then looping through the items in the row and calculating the sum
Since i couldn't use your dfs because they are incomplete, i made mine.
given df1
:
item1 item2 item3
0 b e j
1 d a d
2 j b a
3 c j f
4 e f c
5 a d b
6 f c e
and df2
cost value target
item_lookup
a 19 20 12
b 16 14 14
c 20 18 18
d 17 12 14
e 20 15 17
f 19 20 12
j 11 17 12
you can use the following function to get what you need
def add_items(row):
row["cost"] = row["target"] = row["value"] = 0
# get the columns that have item in the name
cols = [col for col in df1.columns if "item" in col]
# get each of the columns look it up in df2 and add it to our new cols
for col in cols:
item_lookup = row[col]
lookup_result = df2.loc[item_lookup]
row["cost"] = lookup_result["cost"]
row["target"] = lookup_result["target"]
row["value"] = lookup_result["value"]
return row
and then apply it
>>> df1.apply(add_items, axis=1)
item1 item2 item3 cost target value
0 b e j 47 43 46
1 d a d 53 40 44
2 j b a 46 38 51
3 c j f 50 42 55
4 e f c 59 47 53
5 a d b 52 40 46
6 f c e 59 47 53
CodePudding user response:
I got a simpler solution here. First, save the item_lookup
values that correspond with cost, target and value to a dictionary. and then use .map()
and .sum()
to create columns:
df2.reset_index(drop=False, inplace=True)
map_cost = dict(zip(df2['item_lookup'], df2['cost']))
map_value = dict(zip(df2['item_lookup'], df2['value']))
map_target = dict(zip(df2['item_lookup'], df2['target']))
df1['cost'] = df1.apply(lambda x: x.map(map_cost)).sum(axis=1)
df1['value'] = df1.apply(lambda x: x.map(map_value)).sum(axis=1)
df1['target'] = df1.apply(lambda x: x.map(map_target)).sum(axis=1)
df1
Output:
id guid name item1 item2 item3 cost value target
0 3031958124 85558-261955282 Alonso 85558-57439 85558-54608 85558-91361 24500.0 60.3 2.8
1 3031958127 85558-261955282 Jeff 85558-57439 85558-39280 85558-91361 16500.0 41.6 2.8
2 3031958129 85558-261955282 Mike 85558-57439 85558-39280 85558-91361 16500.0 41.6 2.8