How to save into .csv files based on weeks

Time:05-03

I have a data set from a .csv file with the headers created_at, text and label, as below:

created_at,text,label
2021-07-24,Newzeland Wins the worldcup,Sport
2021-07-25,ABC Wins the worldcup,Sport
2021-07-26,Hello the worldcup,Sport
2021-07-27,Cricket worldcup,Sport
2021-07-28,Rugby worldcup,Sport
2021-07-29,LLL Wins,Sport
2021-07-30,MMM Wins the worldcup,Sport
2021-07-31,RRR Wins the worldcup,Sport
2021-08-01,OOO Wins the worldcup,Sport
2021-08-02,JJJ Wins the worldcup,Sport
2021-08-03,YYY Wins the worldcup,Sport
2021-08-04,KKK Wins the worldcup,Sport
2021-08-05,YYY Wins the worldcup,Sport
2021-08-06,GGG Wins the worldcup,Sport
2021-08-07,FFF Wins the worldcup,Sport
2021-08-08,SSS Wins the worldcup,Sport
2021-08-09,XYZ Wins the worldcup,Sport
2021-08-10,PQR Wins the worldcup,Sport

How can I save these into .csv files based on weeks? For example, I want week1.csv to contain only the first 7 days of the data set above (2021-07-24 to 2021-07-30), week2.csv the next 7 days (2021-07-31 to 2021-08-06), and so on.

week1.csv

created_at,text,label
2021-07-24,Newzeland Wins the worldcup,Sport
2021-07-25,ABC Wins the worldcup,Sport
2021-07-26,Hello the worldcup,Sport
2021-07-27,Cricket worldcup,Sport
2021-07-28,Rugby worldcup,Sport
2021-07-29,LLL Wins,Sport
2021-07-30,MMM Wins the worldcup,Sport

CodePudding user response:

IIUC you can compute a week period and use groupby:

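import pandas as pd

df = pd.read_csv('data.csv')  # input file name is an assumption
# weekly periods ending on Friday, so each period runs Saturday to Friday
group = pd.to_datetime(df['created_at']).dt.to_period('W-FRI')

for i, (g, d) in enumerate(df.groupby(group), start=1):
    print(f'saving week {i}: {g}')
    d.to_csv(f'week{i}.csv', index=False)  # index=False writes only the original columns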

NB. this uses weekly periods ending on Friday (W-FRI), because the first date in the data, 2021-07-24, is a Saturday.
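
To see how the W-FRI anchor lines up with the data, here is a small sketch that prints the period each date falls into. The four sample dates are picked from the question; the `summary` frame and its column names are only for illustration.

```python
import pandas as pd

# A few dates taken from the question's data set.
dates = pd.to_datetime(
    pd.Series(["2021-07-24", "2021-07-30", "2021-07-31", "2021-08-10"])
)
# Weekly periods that end on Friday, i.e. each one spans Saturday to Friday.
periods = dates.dt.to_period("W-FRI")

summary = pd.DataFrame({
    "date": dates.dt.strftime("%Y-%m-%d"),
    "weekday": dates.dt.day_name(),
    "week_start": periods.dt.start_time.dt.strftime("%Y-%m-%d"),
    "week_end": periods.dt.end_time.dt.strftime("%Y-%m-%d"),
})
print(summary)
```

Dates that share the same week_start and week_end land in the same group, and therefore in the same file.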

To compute the anchor day programmatically from the first date (assuming the rows are sorted by date) use:

s = pd.to_datetime(df['created_at'])
# weekday of the day before the first row, e.g. 'FRI' when the data starts on a Saturday
dow = (s.iloc[0]-pd.Timedelta('1d')).strftime("%a").upper()
group = s.dt.to_period(f'W-{dow}')

output:

saving week 1: 2021-07-24/2021-07-30
saving week 2: 2021-07-31/2021-08-06
saving week 3: 2021-08-07/2021-08-13

files (contents shown as printed DataFrames):

week1.csv
   created_at                         text  label
0  2021-07-24  Newzeland Wins the worldcup  Sport
1  2021-07-25        ABC Wins the worldcup  Sport
2  2021-07-26           Hello the worldcup  Sport
3  2021-07-27             Cricket worldcup  Sport
4  2021-07-28               Rugby worldcup  Sport
5  2021-07-29                     LLL Wins  Sport
6  2021-07-30        MMM Wins the worldcup  Sport

week2.csv
    created_at                   text  label
7   2021-07-31  RRR Wins the worldcup  Sport
8   2021-08-01  OOO Wins the worldcup  Sport
9   2021-08-02  JJJ Wins the worldcup  Sport
10  2021-08-03  YYY Wins the worldcup  Sport
11  2021-08-04  KKK Wins the worldcup  Sport
12  2021-08-05  YYY Wins the worldcup  Sport
13  2021-08-06  GGG Wins the worldcup  Sport

week3.csv
    created_at                   text  label
14  2021-08-07  FFF Wins the worldcup  Sport
15  2021-08-08  SSS Wins the worldcup  Sport
16  2021-08-09  XYZ Wins the worldcup  Sport
17  2021-08-10  PQR Wins the worldcup  Sport