Home > Enterprise >  Pulling values from another data frame based on two columns in an ifelse function
Pulling values from another data frame based on two columns in an ifelse function

Time:10-17

I have two data frames, namely a and b:

a
---------------
id  age group age_diff
1   10    x      2
1   11    y      1
2   4     x      NA
2   5     y      NA
3   9     x      NA
3   11    y      NA
---------------

and

b:
--------------
id group base_age
1    x      12
1    y      12
2    x      15
2    y      13
3    x      20
3    y      22

I need to fill NA in the data frame a with the difference between the age in a and base_age in b of the corresponding id and group. For example, for id=2, group=x in a, age_diff needs to be 15-4 = 11.

Desired Result:

a
---------------
id  age group age_diff
1   10    x      2
1   11    y      1
2   4     x      11
2   5     y      8
3   9     x      11
3   11    y      11
---------------

I have

a <- a %>% mutate(age_diff = ifelse(is.na(age_diff), --- , age_diff))

but I'm not sure what to put in the ---. Any inputs would be very appreciated.

CodePudding user response:

new coloumn is the result.

df1 %>% mutate(new = if_else(is.na(df1$age_diff), (df2$base_age - df1$age), df1$age_diff))
  id age group age_diff new
1  1  10     x        2   2
2  1  11     y        1   1
3  2   4     x       NA  11
4  2   5     y       NA   8
5  3   9     x       NA  11
6  3  11     y       NA  11

Data:

df1 = structure(list(id = c(1, 1, 2, 2, 3, 3), age = c(10, 11, 4, 5, 
9, 11), group = c("x", "y", "x", "y", "x", "y"), age_diff = c(2, 
1, NA, NA, NA, NA)), class = "data.frame", row.names = c(NA, 
-6L))

df2 = structure(list(id = c(1, 1, 2, 2, 3, 3), group = c("x", "y", 
"x", "y", "x", "y"), base_age = c(12, 12, 15, 13, 20, 22)), class = "data.frame", row.names = c(NA, 
-6L))

CodePudding user response:

Does this work? The difference will hold good regardless of na in dataframe a.

library(dplyr)
a %>% inner_join(b, by = c('id','group')) %>% 
    mutate(age_diff = base_age - age) %>% select(-base_age)
  id age group age_diff
1  1  10     x        2
2  1  11     y        1
3  2   4     x       11
4  2   5     y        8
5  3   9     x       11
6  3  11     y       11

Data used:

a
  id age group age_diff
1  1  10     x        2
2  1  11     y        1
3  2   4     x       NA
4  2   5     y       NA
5  3   9     x       NA
6  3  11     y       NA

b
  id group base_age
1  1     x       12
2  1     y       12
3  2     x       15
4  2     y       13
5  3     x       20
6  3     y       22
  • Related