How to use .format in np.select?

Time:04-25

I have the df below, and I am trying to build a status column whose text includes a specific date from each row.

But I get information about the whole column object (the Series) instead of the value for each row.

Code:


import pandas as pd
import numpy as np

data = {'local':  ['client', 'hub'],
    'delivery_date':  [pd.to_datetime('2022-04-24'), 'null'],
    'estimated_date':  [pd.to_datetime('2022-04-24'), pd.to_datetime('2022-04-26')],
    'max_date': ['delivery_date', 'estimated_date']
    }

df = pd.DataFrame(data)
cond = [(df["max_date"] == "delivery_date")  & (df["local"] == "client"),
        (df["max_date"] == "estimated_date") & (df["local"] == "hub")]

choices = ["It was delivered to the customer on the date {}".format(df["delivery_date"]),
           "delivery forecast for {}".format(df["estimated_date"])]

df["status"] = np.select(cond, choices, default = np.nan)

Expected result: each row's status contains only that row's date.

CodePudding user response:

You could use string concatenation:

choices = ["It was delivered to the customer on the date " + df["delivery_date"].astype(str),
           "delivery forecast for " + df["estimated_date"].astype(str)]
df["status"] = np.select(cond, choices, default = np.nan)

As @Parfait notes, with pandas>=1.0.0 you can use astype("string") instead, which yields the dedicated StringDtype introduced in that release:

choices = ["It was delivered to the customer on the date " + df["delivery_date"].astype('string'),
           "delivery forecast for " + df["estimated_date"].astype('string')]

Output:

    local        delivery_date estimated_date        max_date                                             status  
0  client  2022-04-24 00:00:00     2022-04-24   delivery_date  It was delivered to the customer on the date 2...   
1     hub                 null     2022-04-26  estimated_date                   delivery forecast for 2022-04-26  
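
For a copy-paste check, here is a minimal end-to-end sketch of the concatenation approach. It makes two assumptions that are not in the question: pd.NaT replaces the 'null' placeholder so that delivery_date is a real datetime column, and dt.strftime("%Y-%m-%d") is used instead of astype(str) so that only the date part ends up in the message.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "local": ["client", "hub"],
    # Assumption: a missing date (NaT) stands in for the question's 'null' string.
    "delivery_date": pd.to_datetime(["2022-04-24", None]),
    "estimated_date": pd.to_datetime(["2022-04-24", "2022-04-26"]),
    "max_date": ["delivery_date", "estimated_date"],
})

cond = [
    (df["max_date"] == "delivery_date") & (df["local"] == "client"),
    (df["max_date"] == "estimated_date") & (df["local"] == "hub"),
]

# Each choice is a Series of per-row messages, not a single formatted string.
choices = [
    "It was delivered to the customer on the date "
    + df["delivery_date"].dt.strftime("%Y-%m-%d"),
    "delivery forecast for " + df["estimated_date"].dt.strftime("%Y-%m-%d"),
]

df["status"] = np.select(cond, choices, default=np.nan)
print(df["status"].tolist())
```

Rows that match neither condition keep the default, so they can still be found afterwards with df["status"].isna().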

CodePudding user response:

Since a Series holds many values and str.format expects a single value, consider element-wise string concatenation with Series.str.cat.

...

# ADD HELPER COLUMNS
df["delivery_note"] = "It was delivered to the customer on the date "
df["forecast_note"] = "delivery forecast for " 

choices = [
    # str.cat only accepts strings, so convert the date columns first
    df["delivery_note"].str.cat(df["delivery_date"].astype(str)),
    df["forecast_note"].str.cat(df["estimated_date"].astype(str))
]

df["status"] = np.select(cond, choices, default = np.nan)

# REMOVE HELPER COLUMNS
df.drop(["delivery_note", "forecast_note"], axis="columns", inplace=True)
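
To see why the str.format call in the question produced a printout of the whole column, the small sketch below compares it with element-wise concatenation. The variable names (dates, whole, per_row) are made up for illustration, and dt.strftime is my choice for turning the dates into text.

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(["2022-04-24", "2022-04-26"]))

# str.format converts the entire Series to text, so the result is ONE Python
# string holding the Series' printed form (index, values and dtype line).
whole = "delivery forecast for {}".format(dates)

# Adding a plain string to a Series of strings works element-wise instead,
# giving one message per row.
per_row = "delivery forecast for " + dates.dt.strftime("%Y-%m-%d")

print(type(whole).__name__)
print(per_row.tolist())
```

Passing that single string to np.select as a choice means every matching row gets the same text, which is the behaviour described in the question.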