Home > Software engineering >  How to select dates for conditions from two columns using np.where in python
How to select dates for conditions from two columns using np.where in python

Time:06-23

I am trying to create a column from two columns based on a condition.

I have two columns "date_1" and "date_2", I am interested in the column "date_2" but it contains NaT values, for this reason I would like to create a new column "date_3" with the values ​​of column 2, and replace the NaT values ​​with the dates from the "date_1" column.

I am trying with np.where method, but the output is not the correct. Below is the code I am using.

date_1 = ('2021-05-25 00:00:00','2021-05-28 00:00:00', '2021-05-29 00:00:00', '2021-05-29 00:00:00')
date_2 = ('2021-05-21 00:00:00',pd.NaT, '2021-05-26 00:00:00', pd.NaT)
df= pd.DataFrame(list(zip(date_1, date_2)), columns = ['date_1','date_2'])
df['date_1'] = pd.to_datetime(df['date_1'])
df['date_2'] = pd.to_datetime(df['date_2'])

df["date_3"] = np.where(df['date_1'] == pd.NaT, df['date_2'], df['date_1'])

CodePudding user response:

I think there is no need to use np.where. This is my solution to your problem:

df["date_3"] = df["date_2"]
df["date_3"] = df["date_3"].fillna(value=df["date_1"])
print(df)

Output:

      date_1     date_2     date_3
0 2021-05-25 2021-05-21 2021-05-21
1 2021-05-28        NaT 2021-05-28
2 2021-05-29 2021-05-26 2021-05-26
3 2021-05-29        NaT 2021-05-29
  • Related