Home > Blockchain >  I'm getting zeroes and negative days python and pandas csv file
I'm getting zeroes and negative days python and pandas csv file

Time:10-22

Is it possible to replace those zeroes and negative days in a csv file by using pandas library, I just need to replace any negative day with zeroes or thirty days

Here my sample:

import datetime
from dateutil.relativedelta import relativedelta
import pandas as pd
import numpy as np

df = pd.read_csv("csv_data.csv", encoding="UTF-8",
    parse_dates = [
        "firstdate",
        "lastdate",
    ], dayfirst=True
)
diff = pd.firstdate - pd.lastdate
#When i skip the following loop
"""for d in diff:
    if d < 0:
        df['thedif'] = 0
    else:
        df['thedif'] = diff"""
#and add this line below

  df['thedif'] = diff
index firstdate   lastdate   thedif
0     2021-03-02  2021-04-02 -31days
1     2021-04-02  2021-03-02 31days
2     2021-03-03  2021-03-03 0days

But if we use a where keyword with numpy we get

Invalid comparison between dtype=timedelta64[ns] and int

CodePudding user response:

Tou can convert days to integers by Series.dt.days:

diff = (df.firstdate - df.lastdate).dt.days

df['thedif'] = np.where(diff < 0, 0, diff)

Or compare by pd.Timedelta(0):

diff = df.firstdate - df.lastdate

df['thedif'] = np.where(diff < pd.Timedelta(0), pd.Timedelta(0), diff)
  • Related