I have to create a function the brings me the count of days between two dates and it must take out weekends and holidays that are inside of a dataframe.
my holidays df looks like this:
Data
0 2001-01-01
1 2001-02-26
2 2001-02-27
3 2001-04-13
4 2001-04-21
df.info()
class 'pandas.core.frame.DataFrame'
RangeIndex: 936 entries, 0 to 935
Data columns (total 1 columns):
Column Non-Null Count Dtype
0 Data 936 non-null datetime64[ns]
dtypes: datetime64[ns](1)
memory usage: 7.4 KB
So it should look like:
def delta_days (date_initial, date_end, holidays)
.....
What would be the best way?
CodePudding user response:
Okay, I'm assuming our date_initial and date_end are python dates, and all the dates in your holidaysdf are datetime.
def delta_days (date_initial, date_end, holidaysdf):
total_number_of_days=(date_end-date_initial).days
holidaysnum=len(holidaysdf[(holidaysdf['Data']>=date_initial ) & (holidaysdf['Data'] <= date_end)].index)
return total_number_of_days-holidaysnum
CodePudding user response:
Here you go:
import pandas as pd
import datetime
from datetime import datetime, timedelta
def delta_days(date_initial, date_end, holidays):
date_initial = datetime.strptime(date_initial,'%Y-%m-%d')
date_end = datetime.strptime(date_end,'%Y-%m-%d')
work_days = pd.bdate_range(start=date_initial, end=date_end, holidays=holidays, freq='C')
return(len(work_days))
Testing the code:
holidays = ['2021-01-01','2021-04-04','2021-04-21','2021-05-01','2021-09-07','2021-10-12','2021-11-02','2021-11-15','2021-12-25']
delta_days('2021-01-01','2021-12-31',holidays=holidays)
Output:
255
Now, you can go one step further and automate the construction of the holidays list:
from workalendar.america import Brazil
cal = Brazil()
datetime_feriados = pd.to_datetime([d[0] for d in cal.holidays(2021)])
lista_feriados = [lista_feriados.strftime('%Y-%m-%d') for lista_feriados in datetime_feriados]
Output:
lista_feriados
['2021-01-01','2021-04-04','2021-04-21','2021-05-01','2021-09-07','2021-10-12','2021-11-02','2021-11-15','2021-12-25']