Home > Blockchain >  How do I return a value into a column within a dataframe based on parameters in another dataframe?
How do I return a value into a column within a dataframe based on parameters in another dataframe?

Time:06-09

I have two dataframes in Python:

product_id   product_name   order_name   qty
01           ABC            A1           1
01           ABC            A2           2
01           ABC            A3           3
product_name   ship_date   ship_qty
ABC            01/01/2022   1
ABC            01/02/2022   0
ABC            01/03/2022   2
ABC            01/04/2022   1
ABC            01/05/2022   3

I want to create a new column that tests whether the ship_qty is equal to or greater than qty, essentially telling me what ship date has enough shipped quantity to fulfill an order.

I want the results to look something like this:

product_id   product_name   order_name   qty   ship_date
01           ABC            A1           1     01/01/2022
01           ABC            A2           2     01/03/2022
01           ABC            A3           3     01/05/2022

I'm assuming I would need a for loop for this paired with an if statement, but I'm mostly lost on how to return the ship_date into a new column based on the ship_qty parameter.

CodePudding user response:

Use merge_asof with direction='forward' parameter and remove column ship_qty:

df = (pd.merge_asof(df1.sort_values('qty'), 
                    df2.sort_values('ship_qty'),
                    by='product_name', 
                    left_on='qty', 
                    right_on='ship_qty', 
                    direction='forward').drop('ship_qty', axis=1))
print (df)
   product_id product_name order_name  qty   ship_date
0           1          ABC         A1    1  01/01/2022
1           1          ABC         A2    2  01/03/2022
2           1          ABC         A3    3  01/05/2022

CodePudding user response:

Here is a solution with pd.merge:

res = df1.merge(
    df2[["ship_date", "ship_qty"]]
    .sort_values(by="ship_date")
    .drop_duplicates(subset="ship_qty"),
    left_on="qty",
    right_on="ship_qty",
).drop("ship_qty", axis=1)

print(res)

   product_id product_name order_name  qty   ship_date
0           1          ABC         A1    1  01/01/2022
1           1          ABC         A2    2  01/03/2022
2           1          ABC         A3    3  01/05/2022
  • Related