I'm trying some mapping and calculation between dataframes.
Is there any examples or anyone can help how to use python code to do this?
I've 2 dataframes:
products components
c1 c2 c3 c4
p1 1 0 1 0
p2 0 1 1 0
p3 1 0 0 1
p4 0 1 0 1
items cost
components i1 i2 i3 i4
c1 0 10 30 0
c2 20 10 0 0
c3 0 0 10 15
c4 20 0 0 30
The end results should be a dictionary contains the sum of the cost for each components and find the maximum:
{p1: [c1,c3] } -> {p1: [i2 i3,i3 i4] } -> {p1: [40,25] } -> {p1: 40 }
{p2: [c2,c3] } -> {p2: [i1 i2,i3 i4] } -> {p2: [30,25] } -> {p2: 30 }
{p3: [c1,c4] } -> {p3: [i2 i3,i1 i4] } -> {p3: [40,50] } -> {p3: 50 }
{p4: [c2,c4] } -> {p4: [i1 i2,i1 i4] } -> {p4: [30,50] } -> {p4: 50 }
CodePudding user response:
Try (df1
is your first DataFrame, df2
the second):
print(df1.apply(lambda x: df2.loc[x[x.eq(1)].index].sum(axis=1).max(), axis=1))
Prints:
p1 40
p2 30
p3 50
p4 50
dtype: int64
To store the result to dictionary:
out = dict(
df1.apply(lambda x: df2.loc[x[x.eq(1)].index].sum(axis=1).max(), axis=1)
)
print(out)
Prints:
{'p1': 40, 'p2': 30, 'p3': 50, 'p4': 50}
CodePudding user response:
You can use itertools.compress() on products data to find components required:
prod_df["comps"] = prod_df.apply(lambda x: list(itertools.compress(x.index[1:], x.values[1:])), axis=1)
[Out]:
product c1 c2 c3 c4 comps
0 p1 1 0 1 0 [c1, c3, comps]
1 p2 0 1 1 0 [c2, c3, comps]
2 p3 1 0 0 1 [c1, c4, comps]
3 p4 0 1 0 1 [c2, c4, comps]
Then select respective component rows from components data and sum each row and filter max row:
prod_df["max_cost"] = prod_df.apply(lambda x: comp_df[comp_df["component"].isin(x["comps"])].iloc[:,1:].sum(axis=1).max(), axis=1)
[Out]:
product max_cost
0 p1 40
1 p2 30
2 p3 50
3 p4 50
Datasets used:
prod_data = [
("p1",1,0,1,0),
("p2",0,1,1,0),
("p3",1,0,0,1),
("p4",0,1,0,1),
]
prod_columns = ["product","c1","c2","c3","c4"]
prod_df = pd.DataFrame(data=prod_data, columns=prod_columns)
comp_data = [
("c1",0,10,30,0),
("c2",20,10,0,0),
("c3",0,0,10,15),
("c4",20,0,0,30),
]
comp_columns = ["component","i1","i2","i3","i4"]
comp_df = pd.DataFrame(data=comp_data, columns=comp_columns)