I have a data set from a .csv file with the headers created_at, text, and label, as below:
created_at,text,label
2021-07-24,Newzeland Wins the worldcup,Sport
2021-07-25,ABC Wins the worldcup,Sport
2021-07-26,Hello the worldcup,Sport
2021-07-27,Cricket worldcup,Sport
2021-07-28,Rugby worldcup,Sport
2021-07-29,LLL Wins,Sport
2021-07-30,MMM Wins the worldcup,Sport
2021-07-31,RRR Wins the worldcup,Sport
2021-08-01,OOO Wins the worldcup,Sport
2021-08-02,JJJ Wins the worldcup,Sport
2021-08-03,YYY Wins the worldcup,Sport
2021-08-04,KKK Wins the worldcup,Sport
2021-08-05,YYY Wins the worldcup,Sport
2021-08-06,GGG Wins the worldcup,Sport
2021-08-07,FFF Wins the worldcup,Sport
2021-08-08,SSS Wins the worldcup,Sport
2021-08-09,XYZ Wins the worldcup,Sport
2021-08-10,PQR Wins the worldcup,Sport
How can I save these rows into separate .csv files by week? For example, I want week1.csv to contain only the first 7 days of the data set above (2021-07-24 to 2021-07-30), week2.csv the next 7 days (2021-07-31 to 2021-08-06), and so on.
week1.csv
created_at,text,label
2021-07-24,Newzeland Wins the worldcup,Sport
2021-07-25,ABC Wins the worldcup,Sport
2021-07-26,Hello the worldcup,Sport
2021-07-27,Cricket worldcup,Sport
2021-07-28,Rugby worldcup,Sport
2021-07-29,LLL Wins,Sport
2021-07-30,MMM Wins the worldcup,Sport
CodePudding user response:
IIUC, you can compute a weekly period and use groupby:
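import pandas as pd

# weekly period (ending on Friday) that each row's date falls into
group = pd.to_datetime(df['created_at']).dt.to_period('W-FRI')
# write each week's rows to week1.csv, week2.csv, ...
for i, (g, d) in enumerate(df.groupby(group), start=1):
    print(f'saving week {i}: {g}')
    d.to_csv(f'week{i}.csv')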
NB. This uses weekly periods ending on Friday (W-FRI), which fits this data because the first date, 2021-07-24, is a Saturday.
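As a self-contained sketch of the groupby approach above, the snippet below runs it on a shortened, in-memory copy of the question's data. The helper name split_by_week, the temporary output directory, and the reduced sample are assumptions made for illustration. It also passes index=False so the files contain only the three original columns, as in the week1.csv the question asks for.

```python
import io
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical shortened copy of the question's data (7 of the 18 rows).
CSV_TEXT = """created_at,text,label
2021-07-24,Newzeland Wins the worldcup,Sport
2021-07-29,LLL Wins,Sport
2021-07-30,MMM Wins the worldcup,Sport
2021-07-31,RRR Wins the worldcup,Sport
2021-08-06,GGG Wins the worldcup,Sport
2021-08-07,FFF Wins the worldcup,Sport
2021-08-10,PQR Wins the worldcup,Sport
"""


def split_by_week(df, out_dir, freq="W-FRI"):
    """Write one weekN.csv per weekly period and return the paths written."""
    group = pd.to_datetime(df["created_at"]).dt.to_period(freq)
    paths = []
    for i, (_, chunk) in enumerate(df.groupby(group), start=1):
        path = Path(out_dir) / f"week{i}.csv"
        chunk.to_csv(path, index=False)  # index=False: no row-number column
        paths.append(path)
    return paths


df = pd.read_csv(io.StringIO(CSV_TEXT))
out_dir = tempfile.mkdtemp()  # throwaway directory for the example files
paths = split_by_week(df, out_dir)
week_sizes = [len(pd.read_csv(p)) for p in paths]
print(week_sizes)  # [3, 2, 2]
```

Note that enumerate numbers only the weeks that actually contain rows, so a week with no data does not leave a gap in the file numbering.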
To compute the week-ending day programmatically from the first date, use:
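# assumes the first row holds the earliest date
s = pd.to_datetime(df['created_at'])
# weekday abbreviation of the day before the first date, upper-cased to match the 'W-FRI' style
dow = (s.iloc[0]-pd.Timedelta('1d')).strftime("%a").upper()
group = s.dt.to_period(f'W-{dow}')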
output:
saving week 1: 2021-07-24/2021-07-30
saving week 2: 2021-07-31/2021-08-06
saving week 3: 2021-08-07/2021-08-13
files:
week1.csv
   created_at                         text  label
0  2021-07-24  Newzeland Wins the worldcup  Sport
1  2021-07-25        ABC Wins the worldcup  Sport
2  2021-07-26           Hello the worldcup  Sport
3  2021-07-27             Cricket worldcup  Sport
4  2021-07-28               Rugby worldcup  Sport
5  2021-07-29                     LLL Wins  Sport
6  2021-07-30        MMM Wins the worldcup  Sport
week2.csv
    created_at                   text  label
7   2021-07-31  RRR Wins the worldcup  Sport
8   2021-08-01  OOO Wins the worldcup  Sport
9   2021-08-02  JJJ Wins the worldcup  Sport
10  2021-08-03  YYY Wins the worldcup  Sport
11  2021-08-04  KKK Wins the worldcup  Sport
12  2021-08-05  YYY Wins the worldcup  Sport
13  2021-08-06  GGG Wins the worldcup  Sport
week3.csv
    created_at                   text  label
14  2021-08-07  FFF Wins the worldcup  Sport
15  2021-08-08  SSS Wins the worldcup  Sport
16  2021-08-09  XYZ Wins the worldcup  Sport
17  2021-08-10  PQR Wins the worldcup  Sport
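A different way to get the same 7-day blocks, sketched here as an alternative rather than as part of the answer above, is to count days from the earliest date and floor-divide by 7. This does not rely on weekday names or on the rows being sorted. The sample dates and the name week_no are made up for illustration.

```python
import pandas as pd

# Hypothetical dates: unsorted, and with no rows in the second week.
dates = pd.Series(["2021-08-07", "2021-07-24", "2021-07-30", "2021-08-10"])
s = pd.to_datetime(dates)

# Days elapsed since the earliest date, floor-divided into 1-based 7-day bins.
week_no = (s - s.min()).dt.days // 7 + 1
print(week_no.tolist())  # [3, 1, 1, 3]
```

Grouping the DataFrame by such a series, for example with for n, chunk in df.groupby(week_no), and writing chunk.to_csv(f'week{n}.csv', index=False) would name each file after its calendar offset, so an empty week leaves a gap in the numbering instead of shifting later weeks down.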