I have this dataset
Country | Date | ValueA |
---|---|---|
USA | 2020-02-01 | 325 |
USA | 2020-02-02 | 444 |
USA | 2020-02-03 | 125 |
USA | 2020-02-04 | 456 |
USA | 2020-02-05 | 325 |
USA | 2020-02-06 | 465 |
USA | 2020-02-07 | 548 |
USA | 2020-02-08 | 696 |
FRA | 2020-02-01 | 357 |
FRA | 2020-02-02 | 123 |
FRA | 2020-02-03 | 125 |
FRA | 2020-02-04 | 987 |
FRA | 2020-02-05 | 343 |
FRA | 2020-02-06 | 874 |
FRA | 2020-02-07 | 658 |
FRA | 2020-02-08 | 413 |
I want to create a sequence for each country to later group by country and every 4 days. I'm looking for something like this
Country | Date | ValueA | every_4 |
---|---|---|---|
USA | 2020-02-01 | 325 | 1 |
USA | 2020-02-02 | 444 | 1 |
USA | 2020-02-03 | 125 | 1 |
USA | 2020-02-04 | 456 | 1 |
USA | 2020-02-05 | 325 | 2 |
USA | 2020-02-06 | 465 | 2 |
USA | 2020-02-07 | 548 | 2 |
USA | 2020-02-08 | 696 | 2 |
FRA | 2020-02-01 | 357 | 1 |
FRA | 2020-02-02 | 123 | 1 |
FRA | 2020-02-03 | 125 | 1 |
FRA | 2020-02-04 | 987 | 1 |
FRA | 2020-02-05 | 343 | 2 |
FRA | 2020-02-06 | 874 | 2 |
FRA | 2020-02-07 | 658 | 2 |
FRA | 2020-02-08 | 413 | 2 |
In R i belive it would be something like
df = df %>% groupby(Country) %>% mutate(Every_4 = rep(seq(),4))
Could someone please help? Thanks a lot!!
CodePudding user response:
Use:
df['every_4'] = df.groupby('Country').transform(lambda x: x.groupby(x.index//4).ngroup() 1)['ValueA']
Output:
Country Date ValueA every_4
0 USA 2020-02-01 325 1
1 USA 2020-02-02 444 1
2 USA 2020-02-03 125 1
3 USA 2020-02-04 456 1
4 USA 2020-02-05 325 2
5 USA 2020-02-06 465 2
6 USA 2020-02-07 548 2
7 USA 2020-02-08 696 2
8 FRA 2020-02-01 357 1
9 FRA 2020-02-02 123 1
10 FRA 2020-02-03 125 1
11 FRA 2020-02-04 987 1
12 FRA 2020-02-05 343 2
13 FRA 2020-02-06 874 2
14 FRA 2020-02-07 658 2
15 FRA 2020-02-08 413 2
CodePudding user response:
Use groupby_cumcount
:
df['every_4'] = df.groupby('Country').cumcount().floordiv(4).add(1)
print(df)
# Output:
Country Date ValueA every_4
0 USA 2020-02-01 325 1
1 USA 2020-02-02 444 1
2 USA 2020-02-03 125 1
3 USA 2020-02-04 456 1
4 USA 2020-02-05 325 2
5 USA 2020-02-06 465 2
6 USA 2020-02-07 548 2
7 USA 2020-02-08 696 2
8 FRA 2020-02-01 357 1
9 FRA 2020-02-02 123 1
10 FRA 2020-02-03 125 1
11 FRA 2020-02-04 987 1
12 FRA 2020-02-05 343 2
13 FRA 2020-02-06 874 2
14 FRA 2020-02-07 658 2
15 FRA 2020-02-08 413 2