How do you take duplicate rows and change them iteratively?
Given this hypothetical table which lists year repeating 12 times, how do you conditionally change it so that each year is now a date for each Student (iterating through the months)?
This is the beginning table:
Student | Date | Grade |
---|---|---|
Walton | 2018 | .77 |
Walton | 2018 | .77 |
Walton | 2018 | .77 |
Walton | 2018 | .77 |
Walton | 2018 | .77 |
Walton | 2018 | .77 |
Walton | 2018 | .77 |
Walton | 2018 | .77 |
Walton | 2018 | .77 |
Walton | 2018 | .77 |
Walton | 2018 | .77 |
Walton | 2018 | .77 |
Sam | 2018 | .86 |
Sam | 2018 | .86 |
Sam | 2018 | .86 |
Sam | 2018 | .86 |
Sam | 2018 | .86 |
Sam | 2018 | .86 |
Sam | 2018 | .86 |
Sam | 2018 | .86 |
Sam | 2018 | .86 |
Sam | 2018 | .86 |
Sam | 2018 | .86 |
Sam | 2018 | .86 |
End table:
Store | Date | Grade |
---|---|---|
Walton | 2018-01-01 | .77 |
Walton | 2018-02-01 | .77 |
Walton | 2018-03-01 | .77 |
Walton | 2018-04-01 | .77 |
Walton | 2018-05-01 | .77 |
Walton | 2018-06-01 | .77 |
Walton | 2018-07-01 | .77 |
Walton | 2018-08-01 | .77 |
Walton | 2018-09-01 | .77 |
Walton | 2018-10-01 | .77 |
Walton | 2018-11-01 | .77 |
Walton | 2018-12-01 | .77 |
Sam | 2018-01-01 | .86 |
Sam | 2018-02-01 | .86 |
Sam | 2018-03-01 | .86 |
Sam | 2018-04-01 | .86 |
Sam | 2018-05-01 | .86 |
Sam | 2018-06-01 | .86 |
Sam | 2018-07-01 | .86 |
Sam | 2018-08-01 | .86 |
Sam | 2018-09-01 | .86 |
Sam | 2018-10-01 | .86 |
Sam | 2018-11-01 | .86 |
Sam | 2018-12-01 | .86 |
Have attempted function using case_when
which did not work. Sorry if this has already been asked.
CodePudding user response:
You can create a sequence of date starting on the 1st of January, with 1 month intervals, and length as the number of observations per group.
df %>%
group_by(Student) %>%
mutate(Date = seq(as.Date(ISOdate(first(Date), 1, 1)), by = '1 month', length.out = n()))
Student Date Grade
<chr> <date> <dbl>
1 Walton 2018-01-01 0.77
2 Walton 2018-02-01 0.77
3 Walton 2018-03-01 0.77
4 Walton 2018-04-01 0.77
5 Walton 2018-05-01 0.77
6 Walton 2018-06-01 0.77
7 Walton 2018-07-01 0.77
8 Walton 2018-08-01 0.77
9 Walton 2018-09-01 0.77
10 Walton 2018-10-01 0.77
# ... with 14 more rows
If you can manually add the date, then use:
seq(as.Date("2018-01-01"), by = '1 month', length.out = n())
If you know that there are going to be twelve observations per group, use:
seq(as.Date("2018-01-01"), as.Date("2018-12-01"))
Using lubridate
, another option is to use the %m %
operator:
library(lubridate)
df %>%
group_by(Student) %>%
mutate(Date = ymd(first(Date), truncated = 2) %m % months(c(0:11)))
CodePudding user response:
You can try to use data.table
package :
library(data.table)
library(lubridate)
library(magrittr)
dt <- fread("
Student Date Grade
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86")
dt[,Month := 1:.N, by = Student] %>%
.[, NewDate := make_date (Date, Month, 1)]
Here is the result :
> dt
Student Date Grade Month NewDate
1: Walton 2018 0.77 1 2018-01-01
2: Walton 2018 0.77 2 2018-02-01
3: Walton 2018 0.77 3 2018-03-01
4: Walton 2018 0.77 4 2018-04-01
5: Walton 2018 0.77 5 2018-05-01
6: Walton 2018 0.77 6 2018-06-01
7: Walton 2018 0.77 7 2018-07-01
8: Walton 2018 0.77 8 2018-08-01
9: Walton 2018 0.77 9 2018-09-01
10: Walton 2018 0.77 10 2018-10-01
11: Walton 2018 0.77 11 2018-11-01
12: Walton 2018 0.77 12 2018-12-01
13: Sam 2018 0.86 1 2018-01-01
14: Sam 2018 0.86 2 2018-02-01
15: Sam 2018 0.86 3 2018-03-01
16: Sam 2018 0.86 4 2018-04-01
17: Sam 2018 0.86 5 2018-05-01
18: Sam 2018 0.86 6 2018-06-01
19: Sam 2018 0.86 7 2018-07-01
20: Sam 2018 0.86 8 2018-08-01
21: Sam 2018 0.86 9 2018-09-01
22: Sam 2018 0.86 10 2018-10-01
23: Sam 2018 0.86 11 2018-11-01
24: Sam 2018 0.86 12 2018-12-01
Student Date Grade Month NewDate
CodePudding user response:
Here is another dplyr
approach:
mutate
a helper column x
with 1 to 12,
paste
it together and wrap it around as.Date
library(dplyr)
df %>%
mutate(x = rep(1:12, length.out = n()),
Date = as.Date(paste(Date,x,"01",sep = "-")), .keep="unused")
Student Date Grade
1 Walton 2018-01-01 0.77
2 Walton 2018-02-01 0.77
3 Walton 2018-03-01 0.77
4 Walton 2018-04-01 0.77
5 Walton 2018-05-01 0.77
6 Walton 2018-06-01 0.77
7 Walton 2018-07-01 0.77
8 Walton 2018-08-01 0.77
9 Walton 2018-09-01 0.77
10 Walton 2018-10-01 0.77
11 Walton 2018-11-01 0.77
12 Walton 2018-12-01 0.77
13 Sam 2018-01-01 0.86
14 Sam 2018-02-01 0.86
15 Sam 2018-03-01 0.86
16 Sam 2018-04-01 0.86
17 Sam 2018-05-01 0.86
18 Sam 2018-06-01 0.86
19 Sam 2018-07-01 0.86
20 Sam 2018-08-01 0.86
21 Sam 2018-09-01 0.86
22 Sam 2018-10-01 0.86
23 Sam 2018-11-01 0.86
24 Sam 2018-12-01 0.86