Home > database >  Create new column based on previous column by group; if missing, use NA
Create new column based on previous column by group; if missing, use NA

Time:08-26

I am trying out to select a value by group from one column, and pass it as value in another column, extending for the whole group. This is similar to question asked here . BUt, some groups do not have this number: in that case, I need to fill the column with NAs. How to do this?

Dummy example:

dd1 <- data.frame(type = c(1,1,1),
                 grp = c('a', 'b', 'd'),
                 val = c(1,2,3))
dd2 <- data.frame(type = c(2,2),
                 grp = c('a', 'b'),
                 val = c(8,2))
dd3 <- data.frame(type = c(3,3),
                  grp = c('b', 'd'),
                  val = c(7,4))

dd <- rbind(dd1, dd2, dd3)

Create new column:

dd %>% 
  group_by(type) %>% 
  mutate(#val_a = ifelse(grp == 'a', val , NA),
         val_a2 = val[grp == 'a'])

Expected outcome:

  type grp val val_a  # pass in `val_a` value of teh group 'a'
1    1   a   1  1
2    1   b   2  1
3    1   d   3  1
4    2   a   8  8
5    2   b   2  8
6    3   b   7 NA
7    3   d   4 NA     # value for 'a' is missing from group 3

CodePudding user response:

You were close with your first approach; use any to apply the condition to all observations in the group:

dd %>% 
  group_by(type) %>% 
  mutate(val_a = ifelse(any(grp == "a"), val[grp == "a"] , NA))

  type grp     val val_a
  <dbl> <chr> <dbl> <dbl>
1     1 a         1     1
2     1 b         2     1
3     1 d         3     1
4     2 a         8     8
5     2 b         2     8
6     3 b         7    NA
7     3 d         4    NA

CodePudding user response:

Try this:

dd %>% 
  group_by(type) %>% 
  mutate(val_a2 = val[which(c(grp == 'a'))[1]])
# # A tibble: 7 x 4
# # Groups:   type [3]
#    type grp     val val_a2
#   <dbl> <chr> <dbl>  <dbl>
# 1     1 a         1      1
# 2     1 b         2      1
# 3     1 d         3      1
# 4     2 a         8      8
# 5     2 b         2      8
# 6     3 b         7     NA
# 7     3 d         4     NA

This also controls against the possibility that there could be more than one match, which may cause bad results (with or without a warning).

  • Related