I have two dataframes in Python:
product_id product_name order_name qty
01 ABC A1 1
01 ABC A2 2
01 ABC A3 3
product_name ship_date ship_qty
ABC 01/01/2022 1
ABC 01/02/2022 0
ABC 01/03/2022 2
ABC 01/04/2022 1
ABC 01/05/2022 3
I want to create a new column that tests whether the ship_qty is equal to or greater than qty, essentially telling me what ship date has enough shipped quantity to fulfill an order.
I want the results to look something like this:
product_id product_name order_name qty ship_date
01 ABC A1 1 01/01/2022
01 ABC A2 2 01/03/2022
01 ABC A3 3 01/05/2022
I'm assuming I would need a for loop for this paired with an if statement, but I'm mostly lost on how to return the ship_date into a new column based on the ship_qty parameter.
CodePudding user response:
Use merge_asof
with direction='forward'
parameter and remove column ship_qty
:
df = (pd.merge_asof(df1.sort_values('qty'),
df2.sort_values('ship_qty'),
by='product_name',
left_on='qty',
right_on='ship_qty',
direction='forward').drop('ship_qty', axis=1))
print (df)
product_id product_name order_name qty ship_date
0 1 ABC A1 1 01/01/2022
1 1 ABC A2 2 01/03/2022
2 1 ABC A3 3 01/05/2022
CodePudding user response:
Here is a solution with pd.merge
:
res = df1.merge(
df2[["ship_date", "ship_qty"]]
.sort_values(by="ship_date")
.drop_duplicates(subset="ship_qty"),
left_on="qty",
right_on="ship_qty",
).drop("ship_qty", axis=1)
print(res)
product_id product_name order_name qty ship_date
0 1 ABC A1 1 01/01/2022
1 1 ABC A2 2 01/03/2022
2 1 ABC A3 3 01/05/2022