I want to add a new column to a DataFrame whose values depend on a condition on its datetime index.
I have already set the date values as the index, so I am working with a DatetimeIndex. I used the following code:
import pandas as pd

new_col = []
start_date = pd.to_datetime('2020-03-01 00:00:00')
end_date = pd.to_datetime('2020-03-07 00:00:00')
# Label each row 1 if its index falls inside the period, otherwise 2
for idx in range(len(df)):
    if df.index[idx] >= start_date and df.index[idx] <= end_date:
        new_col.append(1)
    else:
        new_col.append(2)
df["newC"] = new_col
I still get an error saying that the length of df and the length of the new column are not equal. The message indicated that the new column is longer. I also tried the numpy where method, but it did not work either.
Is there a better way to fill a new column based on whether the index falls within a certain period, in this case from '2020-03-01 00:00:00' until '2020-03-07 00:00:00'?
CodePudding user response:
This should work:
df["newC"] = pd.Series(df.index, index=df.index).apply(lambda dt: 1 if start_date <= dt <= end_date else 2)
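If the apply call feels slow on a large frame, a vectorized comparison on the index is another option. The following is a minimal sketch, not a drop-in for your data: the sample frame, its "value" column, and the daily date range are made up for illustration, and it assumes the index is a DatetimeIndex with both endpoints included, as in the question. Because it never builds a Python list, the result always has the same length as df.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one row per day around the period of interest
idx = pd.date_range("2020-02-28", "2020-03-09", freq="D")
df = pd.DataFrame({"value": range(len(idx))}, index=idx)

start_date = pd.Timestamp("2020-03-01 00:00:00")
end_date = pd.Timestamp("2020-03-07 00:00:00")

# Boolean mask over the whole index at once (both endpoints inclusive)
in_range = (df.index >= start_date) & (df.index <= end_date)

# 1 where the timestamp is inside the period, 2 everywhere else
df["newC"] = np.where(in_range, 1, 2)
```

The mask is computed for the whole index in one step, so no explicit loop is needed, and the same mask can be reused to filter rows with df[in_range].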