Home > front end >  Separate columns of a DataFrame by days of the week
Separate columns of a DataFrame by days of the week

Time:07-07

Let it be the following Python Panda Dataframe (the original could include dates for several months):

Hours 2022-06-06 2022-06-07 2022-06-08 2022-06-09 2022-06-10 2022-06-11 2022-06-12 2022-06-13 2022-06-14 2022-06-15 2022-06-16 2022-06-17 2022-06-18 2022-06-19
00:00 3 0 0 3 23 43 1 2 3 3 7 3 1 0
05:00 5 4 0 3 32 31 3 9 3 3 5 3 0 3
10:00 0 3 3 34 45 63 43 6 3 0 3 3 0 1
15:00 10 31 10 3 53 0 3 3 3 3 5 3 12 3
20:00 20 33 33 3 86 3 0 3 3 21 3 3 0 3
23:00 31 34 45 63 43 64 23 12 1 0 2 5 2 3

I want to create 2 dataframes by following the steps below:

  1. Remove the columns corresponding to Saturdays and Sundays.
Hours 2022-06-06 2022-06-07 2022-06-08 2022-06-09 2022-06-10 2022-06-13 2022-06-14 2022-06-15 2022-06-16 2022-06-17
00:00 3 0 0 3 23 2 3 3 7 3
05:00 5 4 0 3 32 9 3 3 5 3
10:00 0 3 3 34 45 6 3 0 3 3
15:00 10 31 10 3 53 3 3 3 5 3
20:00 20 33 33 3 86 3 3 21 3 3
23:00 31 34 45 63 43 12 1 0 2 5
  1. Separate the resulting Dataframe into 2, one with only Fridays and the other with the rest of the days (Monday, Tuesday, Wednesday and Thursday).
Hours 2022-06-10 2022-06-17
00:00 23 3
05:00 32 3
10:00 45 3
15:00 53 3
20:00 86 3
23:00 43 5
Hours 2022-06-06 2022-06-07 2022-06-08 2022-06-09 2022-06-13 2022-06-14 2022-06-15 2022-06-16
00:00 3 0 0 3 2 3 3 7
05:00 5 4 0 3 9 3 3 5
10:00 0 3 3 34 6 3 0 3
15:00 10 31 10 3 3 3 3 5
20:00 20 33 33 3 3 3 21 3
23:00 31 34 45 63 12 1 0 2

CodePudding user response:

You can use pandas.to_datetime and weekday to create a boolean indexer, then use boolean indexing:

# set index aside
df = df.set_index('Hours')

# get weekday (4 is Fri, 5 and 6 are Sat and Sun)

# compare to 5 (<5) to get True on weekdays
weekdays = pd.to_datetime(df.columns).weekday < 5
# or
# monthu = pd.to_datetime(df.columns).weekday < 4

# compare to 4 to get Fridays
fri = pd.to_datetime(df.columns).weekday == 4

# slice
mon_thu_df = df.loc[:, weekdays & ~fri].reset_index()
# or
# mon_thu_df = df.loc[:, monthu].reset_index()

fri_df = df.loc[:, fri].reset_index()

Mon-Thu:

   Hours  2022-06-06  2022-06-07  2022-06-08  2022-06-09  2022-06-13  2022-06-14  2022-06-15  2022-06-16
0  00:00           3           0           0           3           2           3           3           7
1  05:00           5           4           0           3           9           3           3           5
2  10:00           0           3           3          34           6           3           0           3
3  15:00          10          31          10           3           3           3           3           5
4  20:00          20          33          33           3           3           3          21           3
5  23:00          31          34          45          63          12           1           0           2

Fri:

   Hours  2022-06-10  2022-06-17
0  00:00          23           3
1  05:00          32           3
2  10:00          45           3
3  15:00          53           3
4  20:00          86           3
5  23:00          43           5

CodePudding user response:

For the first Step you can make something like this :

def allsundays(year):
        d = date(year, 1, 1)                    # January 1st
        d  = timedelta(days = 6 - d.weekday())  # First Sunday
        while d.year == year:
            yield d
            d  = timedelta(days = 7)
    
for d in allsundays(2022):
    print(d)

This little code can help you to get all Sunday

After getting the sunday list you can drop the columns :

df.drop(columns=sunday_list)
  • Related