I have a dataframe that has columns like these:
Date temp_data holiday day
01.01.2000 10000 0 1
02.01.2000 0 1 2
03.01.2000 2000 0 3
..
..
..
30.01.2000 200 0 30
31.01.2000 0 1 31
01.02.2000 0 1 1
02.02.2000 2500 0 2
holiday = 0 when there is data present - indicates a working day
holiday = 1 when there is no data present - indicated a non-working day
I am trying to extract two new columns first_working_day_of_month and last_working_day_of_month
the dataframe should look like this
Date temp_data holiday day first_wd_of_month last_wd_of_month
01.01.2000 10000 0 1 1 0
02.01.2000 0 1 2 0 0
03.01.2000 2000 0 3 0 0
..
..
..
30.01.2000 200 0 30 0 1
31.01.2000 0 1 31 0 0
01.02.2000 0 1 1 0 0
02.02.2000 2500 0 2 1 0
Can anyone help me with this?
CodePudding user response:
Solution
# Convert the column to datetime
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
# mask the dates when there is holdiday
w = df['Date'].mask(df['holiday'] == 1)
# group the working dates by monthly frequency
g = w.groupby(df['Date'].dt.to_period('M'))
# transform each group to get the first and last working day per month
# then compare with current date to identify whether the current
# day is the first or last working day
df['first_wd_of_month'] = df['Date'] == g.transform('first')
df['last_wd_of_month' ] = (df['Date'] == g.transform('last')) & ~df['first_wd_of_month']
Result
Date temp_data holiday day first_wd_of_month last_wd_of_month
0 2000-01-01 10000 0 1 True False
1 2000-01-02 0 1 2 False False
2 2000-01-03 2000 0 3 False False
3 2000-01-30 200 0 30 False True
...
4 2000-01-31 0 1 31 False False
5 2000-02-01 0 1 1 False False
6 2000-02-02 2500 0 2 True False
CodePudding user response:
try:
import datetime
df
Date
0 01.01.2000
1 02.01.2000
2 03.01.2000
3 30.01.2000
4 31.01.2000
5 01.02.2000
6 02.02.2000
df['Date'] = pd.to_datetime(df['Date'])
def f(x):
next_day = x datetime.timedelta(days=1)
if x.month != next_day.month:
return 1
return 0
def g(x):
prev_day = x - datetime.timedelta(days=1)
if x.month != prev_day.month:
return 1
return 0
df['is_last_day_of_month'] = df['Date'].map(f)
df['is_first_day_of_month'] = df['Date'].map(g)
Date is_last_day_of_month is_first_day_of_month
0 2000-01-01 0 1
1 2000-02-01 0 1
2 2000-03-01 0 1
3 2000-01-30 0 0
4 2000-01-31 1 0
5 2000-01-02 0 0
6 2000-02-02 0 0