Home > Back-end >  How do you change observations iteratively in R?
How do you change observations iteratively in R?

Time:02-23

How do you take duplicate rows and change them iteratively?

Given this hypothetical table which lists year repeating 12 times, how do you conditionally change it so that each year is now a date for each Student (iterating through the months)?

This is the beginning table:

Student Date Grade
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Walton 2018 .77
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86
Sam 2018 .86

End table:

Store Date Grade
Walton 2018-01-01 .77
Walton 2018-02-01 .77
Walton 2018-03-01 .77
Walton 2018-04-01 .77
Walton 2018-05-01 .77
Walton 2018-06-01 .77
Walton 2018-07-01 .77
Walton 2018-08-01 .77
Walton 2018-09-01 .77
Walton 2018-10-01 .77
Walton 2018-11-01 .77
Walton 2018-12-01 .77
Sam 2018-01-01 .86
Sam 2018-02-01 .86
Sam 2018-03-01 .86
Sam 2018-04-01 .86
Sam 2018-05-01 .86
Sam 2018-06-01 .86
Sam 2018-07-01 .86
Sam 2018-08-01 .86
Sam 2018-09-01 .86
Sam 2018-10-01 .86
Sam 2018-11-01 .86
Sam 2018-12-01 .86

Have attempted function using case_when which did not work. Sorry if this has already been asked.

CodePudding user response:

You can create a sequence of date starting on the 1st of January, with 1 month intervals, and length as the number of observations per group.

df %>% 
   group_by(Student) %>% 
   mutate(Date = seq(as.Date(ISOdate(first(Date), 1, 1)), by = '1 month', length.out = n()))

   Student Date       Grade
   <chr>   <date>     <dbl>
 1 Walton  2018-01-01  0.77
 2 Walton  2018-02-01  0.77
 3 Walton  2018-03-01  0.77
 4 Walton  2018-04-01  0.77
 5 Walton  2018-05-01  0.77
 6 Walton  2018-06-01  0.77
 7 Walton  2018-07-01  0.77
 8 Walton  2018-08-01  0.77
 9 Walton  2018-09-01  0.77
10 Walton  2018-10-01  0.77
# ... with 14 more rows

If you can manually add the date, then use:

seq(as.Date("2018-01-01"), by = '1 month', length.out = n())

If you know that there are going to be twelve observations per group, use:

seq(as.Date("2018-01-01"), as.Date("2018-12-01"))

Using lubridate, another option is to use the %m % operator:

library(lubridate)
df %>% 
   group_by(Student) %>% 
   mutate(Date = ymd(first(Date), truncated = 2) %m % months(c(0:11)))

CodePudding user response:

You can try to use data.table package :

library(data.table)
library(lubridate)
library(magrittr)
dt <- fread("
Student Date    Grade
Walton  2018    .77
Walton  2018    .77
Walton  2018    .77
Walton  2018    .77
Walton  2018    .77
Walton  2018    .77
Walton  2018    .77
Walton  2018    .77
Walton  2018    .77
Walton  2018    .77
Walton  2018    .77
Walton  2018    .77
Sam 2018    .86
Sam 2018    .86
Sam 2018    .86
Sam 2018    .86
Sam 2018    .86
Sam 2018    .86
Sam 2018    .86
Sam 2018    .86
Sam 2018    .86
Sam 2018    .86
Sam 2018    .86
Sam 2018    .86")

dt[,Month := 1:.N, by = Student] %>%
  .[, NewDate := make_date (Date, Month, 1)]

Here is the result :

> dt
    Student Date Grade Month    NewDate
 1:  Walton 2018  0.77     1 2018-01-01
 2:  Walton 2018  0.77     2 2018-02-01
 3:  Walton 2018  0.77     3 2018-03-01
 4:  Walton 2018  0.77     4 2018-04-01
 5:  Walton 2018  0.77     5 2018-05-01
 6:  Walton 2018  0.77     6 2018-06-01
 7:  Walton 2018  0.77     7 2018-07-01
 8:  Walton 2018  0.77     8 2018-08-01
 9:  Walton 2018  0.77     9 2018-09-01
10:  Walton 2018  0.77    10 2018-10-01
11:  Walton 2018  0.77    11 2018-11-01
12:  Walton 2018  0.77    12 2018-12-01
13:     Sam 2018  0.86     1 2018-01-01
14:     Sam 2018  0.86     2 2018-02-01
15:     Sam 2018  0.86     3 2018-03-01
16:     Sam 2018  0.86     4 2018-04-01
17:     Sam 2018  0.86     5 2018-05-01
18:     Sam 2018  0.86     6 2018-06-01
19:     Sam 2018  0.86     7 2018-07-01
20:     Sam 2018  0.86     8 2018-08-01
21:     Sam 2018  0.86     9 2018-09-01
22:     Sam 2018  0.86    10 2018-10-01
23:     Sam 2018  0.86    11 2018-11-01
24:     Sam 2018  0.86    12 2018-12-01
    Student Date Grade Month    NewDate

CodePudding user response:

Here is another dplyr approach:

mutate a helper column x with 1 to 12,

paste it together and wrap it around as.Date

library(dplyr)

df %>% 
  mutate(x = rep(1:12, length.out = n()),
         Date = as.Date(paste(Date,x,"01",sep = "-")), .keep="unused")

Student       Date Grade
1   Walton 2018-01-01  0.77
2   Walton 2018-02-01  0.77
3   Walton 2018-03-01  0.77
4   Walton 2018-04-01  0.77
5   Walton 2018-05-01  0.77
6   Walton 2018-06-01  0.77
7   Walton 2018-07-01  0.77
8   Walton 2018-08-01  0.77
9   Walton 2018-09-01  0.77
10  Walton 2018-10-01  0.77
11  Walton 2018-11-01  0.77
12  Walton 2018-12-01  0.77
13     Sam 2018-01-01  0.86
14     Sam 2018-02-01  0.86
15     Sam 2018-03-01  0.86
16     Sam 2018-04-01  0.86
17     Sam 2018-05-01  0.86
18     Sam 2018-06-01  0.86
19     Sam 2018-07-01  0.86
20     Sam 2018-08-01  0.86
21     Sam 2018-09-01  0.86
22     Sam 2018-10-01  0.86
23     Sam 2018-11-01  0.86
24     Sam 2018-12-01  0.86
  • Related