Home > Net >  Python function to get or count the number of days between two years in a pandas data frame
Python function to get or count the number of days between two years in a pandas data frame

Time:11-08

I need a function to count the total number of days in the 'days' column between a start date of 1st Jan 1995 and an end date of 31st Dec 2019 in a dataframe taking Leapyears into account as well

For example:

  • 1st Jan 1995 - Day 1
  • 1st Feb 1995 - Day 32
  • 2nd Feb 1995 - Day 33...

And so on all the way to 31st Dec 2019.

Screenshot of the 1st ten rows of the dataset

This is the function I created initially but it doesn't work. prices is the name of the data frame and 'days' is the column where the number of days is to reflect.

def date_difference(self):

    for i in range(prices.shape[0] - 1):
        prices['days'][i 1] = (prices['days'][i 1] - prices['days'][i])

CodePudding user response:

Convert types

First of all, make sure that the days column is the proper type. Use df.days.dtype and it should be datetime64. If you get object type that means you have a string containing a date and you need to convert the type using

df.days = pd.to_datetime(df.days)

Calculate difference

df['days_diff'] = (df.days - pd.Timestamp('1995-01-01')).dt.days

Also, I would recommend changing the name of the column to date before it contains dates. Later you can assign the days to a column called so. It's just for clarity of your code and future maintaining it.

CodePudding user response:

I finally got it to work by doing this:

def date_difference(last_day):
    last_day = pd.to_datetime(last_day, dayfirst = True)
    first_day = pd.to_datetime("01/01/1995", dayfirst = True)
    diff = last_day - first_day

prices['days'] = prices['days'].apply(date_difference)
  • Related