Is it possible to replace those zeroes and negative days in a csv file by using pandas library, I just need to replace any negative day with zeroes or thirty days
Here my sample:
import datetime
from dateutil.relativedelta import relativedelta
import pandas as pd
import numpy as np
df = pd.read_csv("csv_data.csv", encoding="UTF-8",
parse_dates = [
"firstdate",
"lastdate",
], dayfirst=True
)
diff = pd.firstdate - pd.lastdate
#When i skip the following loop
"""for d in diff:
if d < 0:
df['thedif'] = 0
else:
df['thedif'] = diff"""
#and add this line below
df['thedif'] = diff
index firstdate lastdate thedif
0 2021-03-02 2021-04-02 -31days
1 2021-04-02 2021-03-02 31days
2 2021-03-03 2021-03-03 0days
But if we use a where keyword with numpy we get
Invalid comparison between dtype=timedelta64[ns] and int
CodePudding user response:
Tou can convert days to integers by Series.dt.days
:
diff = (df.firstdate - df.lastdate).dt.days
df['thedif'] = np.where(diff < 0, 0, diff)
Or compare by pd.Timedelta(0)
:
diff = df.firstdate - df.lastdate
df['thedif'] = np.where(diff < pd.Timedelta(0), pd.Timedelta(0), diff)