Home > Software design >  pandas: find the first friday following the first monday of the month
pandas: find the first friday following the first monday of the month

Time:11-09

currently trying to figure this out with code like this..but not quite yet able to get it:

df[(df.index.day_of_week==0) & (df.index.day<15) & (df.shift(-4).index.day_of_week==4)]

this is what the data looks like. (i've added the day_of_week column for convenience). basically, i am trying to find the first day_of_week=0 (monday) in the month, then filter for the first day_of_week=4 after that (friday)

                  close  day_of_week
date                                
2022-07-01  3825.330078            4
2022-07-05  3831.389893            1
2022-07-06  3845.080078            2
2022-07-07  3902.620117            3
2022-07-08  3899.379883            4
2022-07-11  3854.429932            0
2022-07-12  3818.800049            1
2022-07-13  3801.780029            2
2022-07-14  3790.379883            3
2022-07-15  3863.159912            4
...
2022-08-01  4118.629883            0
2022-08-02  4091.189941            1
2022-08-03  4155.169922            2
2022-08-04  4151.939941            3
2022-08-05  4145.189941            4
2022-08-08  4140.060059            0
2022-08-09  4122.470215            1
2022-08-10  4210.240234            2
2022-08-11  4207.270020            3
2022-08-12  4280.149902            4
...
2022-09-01  3966.850098            3
2022-09-02  3924.260010            4
2022-09-06  3908.189941            1
2022-09-07  3979.870117            2
2022-09-08  4006.179932            3
2022-09-09  4067.360107            4
2022-09-12  4110.410156            0
2022-09-13  3932.689941            1
2022-09-14  3946.010010            2
2022-09-15  3901.350098            3
2022-09-16  3873.330078            4
...
2022-10-03  3678.429932            0
2022-10-04  3790.929932            1
2022-10-05  3783.280029            2
2022-10-06  3744.520020            3
2022-10-07  3639.659912            4
2022-10-10  3612.389893            0
2022-10-11  3588.840088            1
2022-10-12  3577.030029            2
...
2022-11-01  3856.100098            1
2022-11-02  3759.689941            2
2022-11-03  3719.889893            3
2022-11-04  3770.550049            4
2022-11-07  3806.800049            0
2022-11-08  3828.110107            1

This should return:

2022-07-15  3863.159912            4
2022-08-05  4145.189941            4
2022-09-16  3873.330078            4
2022-10-07  3639.659912            4

EDIT: while i dont expect this to work, curious as to why this returns no results? is shifting not supported when filtering in this manner

df[(df.index.day_of_week==0) & (df.index.day<15) & (df.shift(-4).index.day_of_week==4)]

CodePudding user response:

As @wjandrea suggested, it would help to have output. Something that may work is to consider that the minimum value of this Friday would be the 5th day of the month, (given that Monday is at least day 1, the following Friday must be at least 5). Thus, something like an if day_of_month >=5 check would work.

If that's not possible, I can edit this post once I can see some output.

CodePudding user response:

You can group by [year, month, day_of_week] and do a cumcount to assign to each row the number of times its day_of_week has appeared in this month.

Then, grab the rows corresponding to the first monday of the month using the filter day_of_week == 0 & cumcount == 0 and shift their index by 4 days to get the following Fridays. Finally, we do an intersection to filter out shifted indexes that do not exist in the original frame.

wanted_indices = df.index[df.groupby([df.index.year, df.index.month, df['dow']]).cumcount().eq(0) & df['dow'].eq(0)].shift(4, 'D')
df.loc[wanted_indices.intersection(df.index)]

Result on your example dataframe

            close       dow
date        
2022-07-15  3863.159912 4
2022-08-05  4145.189941 4
2022-09-16  3873.330078 4
2022-10-07  3639.659912 4
  • Related