Let's say I have two dates in ISO format:
startDate = '2005-01-20'
endDate = '2010-08-20'
I also have a column in my dataframe called eventDate
, which is a string (but in ISO format).
How can I create a new column called validDate
that returns either 'Yes' or 'No' if the eventDate
is in between those two dates?
CodePudding user response:
startDate_dt = datetime.datetime.strptime(startDate, "%Y-%m-%d")
endDate_dt = datetime.datetime.strptime(endDate, "%Y-%m-%d")
df["validDate"] = df["eventDate"].apply(lambda x: \
(datetime.datetime.strptime(x, "%Y-%m-%d") <= endDate_dt) & \
(datetime.datetime.strptime(x, "%Y-%m-%d") >= startDate_dt))
CodePudding user response:
Let's say the eventDate column contains all days for September:
import pandas as pd
df = pd.DataFrame({
"eventDate": pd.date_range(start="2022-09-01", end="2022-09-30").strftime('%Y-%m-%d'),
})
And you want the dates to be "valid" only for the first week of September. You could do it this way:
from datetime import datetime
d1 = datetime(year=2022, month=9, day=1) # first day valid
d2 = datetime(year=2022, month=9, day=7) # last day valid
def is_date_valid(str_date: str):
dt = datetime.fromisoformat(str_date)
return ((dt >= d1) and (dt <= d2))
df["validDate"] = df["eventDate"].apply(is_date_valid)
The result you get:
df.head(10)
eventDate validDate
0 2022-09-01 True
1 2022-09-02 True
2 2022-09-03 True
3 2022-09-04 True
4 2022-09-05 True
5 2022-09-06 True
6 2022-09-07 True
7 2022-09-08 False
8 2022-09-09 False
9 2022-09-10 False
CodePudding user response:
First change eventDate
to datetime format, then add a simple condition.
df.eventDate = pd.to_datetime(df.eventDate)
df['validDate'] = df.eventData.apply(lambda x: "Yes" if pd.to_datetime(startDate) <= x <= pd.to_datetime(endDate) else "No")
Note pandas to_datetime
will automatically parse the date in the required format.