How to loop over observations in R?

Time:07-15

I have the following dataframe, representing a subject's ID, the number of months before a follow up, and the subject's age.

df1<-structure(list(USUBJID = c(1, 2, 3), 
                    follow_up = c(24,36,56), 
                    AGE = c(65,34,65)), 
               row.names = c(NA, -3L), 
               class = c("tbl_df", "tbl", "data.frame"))
# A tibble: 3 x 3
  USUBJID follow_up   AGE
    <dbl>     <dbl> <dbl>
1       1        24    65
2       2        36    34
3       3        56    65

For each subject, I need to create annual entries based on the value of the follow up column (e.g. if the follow up is 36 months I need entries for 0, 12, 24, and 36 months.) For each of these entries I also need to calculate the subject's age, adding to the value from the original age column.

This is my desired output:

# A tibble: 12 x 3
   USUBJID Month   AGE
     <dbl> <dbl> <dbl>
 1       1     0    65
 2       1    12    66
 3       1    24    67
 4       2     0    34
 5       2    12    35
 6       2    24    36
 7       2    36    37
 8       3     0    65
 9       3    12    66
10       3    24    67
11       3    36    68
12       3    48    69

Answer:

The conditions are not entirely clear, but this may help. Replicate each row (uncount from tidyr) once for every full 12 months in the 'follow_up' column, plus one row for month 0. Then, grouped by 'USUBJID', create a sequence starting at 0 in increments of 12, and increment 'AGE' by 1 per row (using row_number as the sequence).

library(dplyr)
library(tidyr)
df2 <- df1 %>% 
 # one row per completed year of follow-up, plus one for month 0
 uncount(follow_up %/% 12 + 1) %>% 
 group_by(USUBJID) %>%
 mutate(follow_up = seq(0, length.out = n(), by = 12), 
        AGE = first(AGE) + row_number() - 1) %>% 
 ungroup() %>%
 rename(Month = follow_up)

-output

df2
# A tibble: 12 × 3
   USUBJID     Month   AGE
     <int>     <dbl> <dbl>
 1       1         0    65
 2       1        12    66
 3       1        24    67
 4       2         0    34
 5       2        12    35
 6       2        24    36
 7       2        36    37
 8       3         0    65
 9       3        12    66
10       3        24    67
11       3        36    68
12       3        48    69
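
The number of rows each subject receives comes from the integer division `follow_up %/% 12 + 1`. A quick check on the question's values (the vector names below are only for illustration) shows why subject 3, with 56 months of follow-up, stops at month 48: the trailing 8 months do not make up a full year.

```r
# Follow-up months taken from the question's data
follow_up <- c(24, 36, 56)

# Completed years of follow-up, plus one row for month 0
n_rows <- follow_up %/% 12 + 1
n_rows
# 3 4 5

# Total number of rows in the expanded table
sum(n_rows)
# 12
```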

Or using data.table

library(data.table)
# repeat each row once per completed year plus one, then build Month and AGE by subject
setDT(df1)[rep(seq_len(.N), follow_up %/% 12 + 1)][,
  .(Month = seq(0, length.out = .N, by = 12), 
  AGE = first(AGE) + seq_len(.N) - 1), .(USUBJID)]

-output

    USUBJID Month   AGE
      <num> <num> <num>
 1:       1     0    65
 2:       1    12    66
 3:       1    24    67
 4:       2     0    34
 5:       2    12    35
 6:       2    24    36
 7:       2    36    37
 8:       3     0    65
 9:       3    12    66
10:       3    24    67
11:       3    36    68
12:       3    48    69
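
If neither package is available, the same idea can probably be written in base R without an explicit loop. This is only a sketch under the same assumptions as above (annual rows, partial years dropped). It uses a plain data.frame instead of a tibble, and the names `n_rows`, `years`, and `result` are illustrative.

```r
# Data from the question, as a plain data.frame
df1 <- data.frame(USUBJID = 1:3,
                  follow_up = c(24, 36, 56),
                  AGE = c(65, 34, 65))

# Rows per subject: one per completed year of follow-up, plus month 0
n_rows <- df1$follow_up %/% 12 + 1

# sequence() counts 1..n within each subject; subtracting 1 gives years elapsed
years <- sequence(n_rows) - 1

result <- data.frame(
  USUBJID = rep(df1$USUBJID, times = n_rows),
  Month   = years * 12,
  AGE     = rep(df1$AGE, times = n_rows) + years
)
result
```

Because `rep()` and `sequence()` are vectorised, no per-subject loop is needed; the result should have the same 12 rows as the outputs above.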

data

df1 <- structure(list(USUBJID = 1:3, follow_up = c(24, 36, 56), AGE = c(65, 
34, 65)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-3L))