currently trying to figure this out with code like this..but not quite yet able to get it:
df[(df.index.day_of_week==0) & (df.index.day<15) & (df.shift(-4).index.day_of_week==4)]
this is what the data looks like. (i've added the day_of_week column for convenience). basically, i am trying to find the first day_of_week=0 (monday) in the month, then filter for the first day_of_week=4 after that (friday)
close day_of_week
date
2022-07-01 3825.330078 4
2022-07-05 3831.389893 1
2022-07-06 3845.080078 2
2022-07-07 3902.620117 3
2022-07-08 3899.379883 4
2022-07-11 3854.429932 0
2022-07-12 3818.800049 1
2022-07-13 3801.780029 2
2022-07-14 3790.379883 3
2022-07-15 3863.159912 4
...
2022-08-01 4118.629883 0
2022-08-02 4091.189941 1
2022-08-03 4155.169922 2
2022-08-04 4151.939941 3
2022-08-05 4145.189941 4
2022-08-08 4140.060059 0
2022-08-09 4122.470215 1
2022-08-10 4210.240234 2
2022-08-11 4207.270020 3
2022-08-12 4280.149902 4
...
2022-09-01 3966.850098 3
2022-09-02 3924.260010 4
2022-09-06 3908.189941 1
2022-09-07 3979.870117 2
2022-09-08 4006.179932 3
2022-09-09 4067.360107 4
2022-09-12 4110.410156 0
2022-09-13 3932.689941 1
2022-09-14 3946.010010 2
2022-09-15 3901.350098 3
2022-09-16 3873.330078 4
...
2022-10-03 3678.429932 0
2022-10-04 3790.929932 1
2022-10-05 3783.280029 2
2022-10-06 3744.520020 3
2022-10-07 3639.659912 4
2022-10-10 3612.389893 0
2022-10-11 3588.840088 1
2022-10-12 3577.030029 2
...
2022-11-01 3856.100098 1
2022-11-02 3759.689941 2
2022-11-03 3719.889893 3
2022-11-04 3770.550049 4
2022-11-07 3806.800049 0
2022-11-08 3828.110107 1
This should return:
2022-07-15 3863.159912 4
2022-08-05 4145.189941 4
2022-09-16 3873.330078 4
2022-10-07 3639.659912 4
EDIT: while i dont expect this to work, curious as to why this returns no results? is shifting not supported when filtering in this manner
df[(df.index.day_of_week==0) & (df.index.day<15) & (df.shift(-4).index.day_of_week==4)]
CodePudding user response:
As @wjandrea suggested, it would help to have output. Something that may work is to consider that the minimum value of this Friday would be the 5th day of the month, (given that Monday is at least day 1, the following Friday must be at least 5). Thus, something like an if day_of_month >=5
check would work.
If that's not possible, I can edit this post once I can see some output.
CodePudding user response:
You can group by [year, month, day_of_week]
and do a cumcount
to assign to each row the number of times its day_of_week
has appeared in this month.
Then, grab the rows corresponding to the first monday of the month using the filter day_of_week == 0 & cumcount == 0
and shift their index by 4 days to get the following Fridays. Finally, we do an intersection to filter out shifted indexes that do not exist in the original frame.
wanted_indices = df.index[df.groupby([df.index.year, df.index.month, df['dow']]).cumcount().eq(0) & df['dow'].eq(0)].shift(4, 'D')
df.loc[wanted_indices.intersection(df.index)]
Result on your example dataframe
close dow
date
2022-07-15 3863.159912 4
2022-08-05 4145.189941 4
2022-09-16 3873.330078 4
2022-10-07 3639.659912 4