Home > database >  how to replace age with previous value 1
how to replace age with previous value 1

Time:12-01

I'm trying to replace missing age values in one wave by adding 1 to the value from the previous wave. So, for instance:

ID Age Wave
1 20 1
1 NA 2
2 61 1
2 NA 2

would become

ID Age Wave
1 20 1
1 21 2
2 61 1
2 62 2

CodePudding user response:

library(tidyverse)

df %>% 
  mutate(Age = case_when(is.na(Age) ~ lag(Age)   1, 
                         TRUE ~ Age))

# A tibble: 4 x 3
     ID   Age  Wave
  <dbl> <dbl> <dbl>
1     1    20     1
2     1    21     2
3     2    61     1
4     2    62     2

CodePudding user response:

Base R

> ave(df$Age,df$ID,FUN=function(x){x[1] seq_along(x)-1})
[1] 20 21 61 62

CodePudding user response:

With tidyverse, assuming your data is in df dataframe:

library(tidyverse)

df %>% 
    group_by(ID) %>% arrange(ID, Wave) %>%
    mutate(missing_grp = cumsum( (is.na(Age)!=is.na(lag(Age))) | !is.na(Age) )) %>%
    group_by(ID, missing_grp) %>%
    mutate(age_offset=cumsum(is.na(Age))) %>%
    group_by(ID) %>%
    fill(Age, .direction='down') %>%
    mutate(Age = Age   age_offset) %>%
    ungroup() %>% select(-missing_grp, -age_offset)

It works also with multiple successive missing ages. For the following input:

df <- tribble(
    ~ID, ~Age, ~Wave,
    1,   21,    1,
    1,   NA,    2,
    2,   61,    1,
    2,   NA,    2,
    2,   NA,    3,
    2,   70,    4,
    2,   NA,    5,
    )

it returns:

# A tibble: 7 × 3
     ID   Age  Wave
  <dbl> <dbl> <dbl>
1     1    21     1
2     1    22     2
3     2    61     1
4     2    62     2
5     2    63     3
6     2    70     4
7     2    71     5

CodePudding user response:

In base R

within(df, Age[is.na(Age)] <- Age[which(is.na(Age)) - 1]   1)
#>   ID Age Wave
#> 1  1  20    1
#> 2  1  21    2
#> 3  2  61    1
#> 4  2  62    2

CodePudding user response:

If you have more than two waves, we could use the row number:

library(dplyr)
library(tidyverse)

df |>
  group_by(ID) |>
  fill(Age) |>
  mutate(Age = Age   row_number() - 1) |>
  ungroup()

Output:

# A tibble: 5 × 3
     ID   Age  Wave
  <dbl> <dbl> <dbl>
1     1    21     1
2     1    22     2
3     2    61     1
4     2    62     2
5     2    63     3
  •  Tags:  
  • r
  • Related