Run ifelse x times with one call (quasi iteratively until condition is fulfilled)

Time:12-09

This question is related to "Is there an R function for imputing missing year values, consecutively, by group?".

The OP of that question asks how to impute the missing year values by group in this data frame (that question already has sufficient answers):

df <- data.frame(ID=c("A", "A", "A", "A", 
                      "B", "B", "B", "B",
                      "C", "C", "C", "C",
                      "D", "D", "D", "D"),
                 grade=c("KG", "01", "02", "03",
                         "KG", "01", "02", "03",
                         "KG", "01", "02", "03",
                         "KG", "01", "02", "03"),
                 year=c(2002, 2003, NA, 2005,
                        2007, NA, NA, 2010,
                        NA, 2005, 2006, NA,
                        2009, 2010, NA, NA))

I tried to use ifelse with lag() or lead():

The idea in words: if the year in a row is NA, take the year from the row above and add 1. This works fine if there is only one NA row in the group. With 2 consecutive NAs it gets clumsy, because each ifelse call fills only one more row and has to be repeated.

My question is: how can I make ifelse run with one call until all NAs are replaced?

My try:

library(dplyr)
df %>% 
  group_by(ID) %>% 
  mutate(year = ifelse(is.na(year), lag(year) + 1, year),
         year = ifelse(is.na(year), lag(year) + 1, year),
         year = ifelse(is.na(year), lead(year) - 1, year))

gives:

   ID    grade  year
   <chr> <chr> <dbl>
 1 A     KG     2002
 2 A     01     2003
 3 A     02     2004
 4 A     03     2005
 5 B     KG     2007
 6 B     01     2008
 7 B     02     2009
 8 B     03     2010
 9 C     KG     2004
10 C     01     2005
11 C     02     2006
12 C     03     2007
13 D     KG     2009
14 D     01     2010
15 D     02     2011
16 D     03     2012
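
For illustration, the repeated ifelse calls could be wrapped in a loop that reruns a step until the vector stops changing. This is only a sketch: `repeat_until_stable` and `fill_years` are hypothetical helper names, not functions from dplyr, and the sketch assumes the known years within each group are consistent with one another.

```r
library(dplyr)

df <- data.frame(ID = rep(c("A", "B", "C", "D"), each = 4),
                 grade = rep(c("KG", "01", "02", "03"), times = 4),
                 year = c(2002, 2003, NA, 2005,
                          2007, NA, NA, 2010,
                          NA, 2005, 2006, NA,
                          2009, 2010, NA, NA))

# Hypothetical helper: apply `step` again and again until a pass
# leaves the vector unchanged (all NAs filled, or nothing left to fill).
repeat_until_stable <- function(x, step) {
  repeat {
    new <- step(x)
    if (identical(new, x)) return(x)
    x <- new
  }
}

# Hypothetical helper: first fill downwards (row above + 1),
# then fill any remaining leading NAs upwards (row below - 1).
fill_years <- function(year) {
  year <- repeat_until_stable(
    year, function(y) ifelse(is.na(y), dplyr::lag(y) + 1, y))
  repeat_until_stable(
    year, function(y) ifelse(is.na(y), dplyr::lead(y) - 1, y))
}

result <- df %>%
  group_by(ID) %>%
  mutate(year = fill_years(year)) %>%
  ungroup()
```

Each pass fills at most one further row per run of NAs, so the loop needs as many passes as the longest run. A group that is entirely NA is returned unchanged, because the first pass already produces no change.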

CodePudding user response:

We may use accumulate() from purrr:

library(dplyr)
library(purrr)
df %>% 
   group_by(ID) %>% 
   mutate(year = accumulate(accumulate(year, 
      ~ if(is.na(.y)) .x + 1 else .y),
       ~ if(is.na(.x)) .y - 1 else .x, .dir = "backward")) %>% 
   ungroup()

-output

# A tibble: 16 × 3
   ID    grade  year
   <chr> <chr> <dbl>
 1 A     KG     2002
 2 A     01     2003
 3 A     02     2004
 4 A     03     2005
 5 B     KG     2007
 6 B     01     2008
 7 B     02     2009
 8 B     03     2010
 9 C     KG     2004
10 C     01     2005
11 C     02     2006
12 C     03     2007
13 D     KG     2009
14 D     01     2010
15 D     02     2011
16 D     03     2012
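
To see what the two nested accumulate() calls do, it may help to run them one after the other on a single group. The sketch below uses the year values of group C from the question; the variable names `forward` and `filled` are only illustrative.

```r
library(purrr)

# Group C from the question: NA at both ends
year <- c(NA, 2005, 2006, NA)

# Forward pass: .x is the value carried along so far, .y is the current
# element. An NA element becomes "previous value + 1".
forward <- accumulate(year, ~ if (is.na(.y)) .x + 1 else .y)
# The trailing NA is filled; the leading NA has no row above it and stays NA.

# Backward pass: with .dir = "backward" the roles are flipped, so .x is the
# current element and .y is the value carried up from below.
filled <- accumulate(forward, ~ if (is.na(.x)) .y - 1 else .x,
                     .dir = "backward")
# The leading NA becomes "next value - 1".
```

Because each pass carries the already filled value along, a run of consecutive NAs of any length is handled in a single call, which is what the repeated ifelse calls could not do.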