Home > Software engineering >  mutate a new column with rows from another column based on a condition
mutate a new column with rows from another column based on a condition

Time:11-19

I have the following dataframe

> df
     group val               divisor denom
1  group A   5               group B    NA
2  group A  10               group B    NA
3  group A  12               group B    NA
4  group B   2               group D    NA
5  group B   5               group D    NA
6  group B   3               group D    NA
7  group C   1 need to be determined    NA
8  group C   3 need to be determined    NA
9  group C   5 need to be determined    NA
10 group D   2                 total    10
11 group D   3                 total    10
12 group D  11                 total    10

Required output

     group val               divisor denom
1  group A   5               group B     2
2  group A  10               group B     5
3  group A  12               group B     3
4  group B   2               group D     2
5  group B   5               group D     3
6  group B   3               group D    11
7  group C   1 need to be determined    NA
8  group C   3 need to be determined    NA
9  group C   5 need to be determined    NA
10 group D   2                 total    10
11 group D   3                 total    10
12 group D  11                 total    10

Tried the following

df_org %>%
  dplyr::mutate(denom = ifelse(
    divisor %in% "total" , 10, denom
  )) %>%
  dplyr::mutate(denom = case_when(
    divisor %in% "group B" ~ val[group == "group B"] 
  ))

I get an error,

Error in `dplyr::mutate()`:
! Problem while computing `denom = case_when(divisor %in% "group
  B" ~ val[group == "group B"])`.
Caused by error in `case_when()`:
! `divisor %in% "group B" ~ val[group == "group B"]` must be
  length 12 or one, not 3.

data

> dput(df_org)
structure(list(group = c("group A", "group A", "group A", "group B", 
"group B", "group B", "group C", "group C", "group C", "group D", 
"group D", "group D"), val = c(5L, 10L, 12L, 2L, 5L, 3L, 1L, 
3L, 5L, 2L, 3L, 11L), divisor = c("group B", "group B", "group B", 
"group D", "group D", "group D", "need to be determined", "need to be determined", 
"need to be determined", "total", "total", "total"), denom = c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, 10L, 10L, 10L)), class = "data.frame", row.names = c(NA, 
-12L))

CodePudding user response:

case_when requires all arguments to be of same length. Here, replace could be used (assuming the number of 'group B' elements in divisor and group are the same

library(dplyr)
df_org %>%
  dplyr::ungroup() %>%
  dplyr::mutate(denom = ifelse(
    divisor %in% "total" , 10, denom
  )) %>%
    dplyr::mutate(denom = replace(denom, divisor %in% "group B", 
        val[group == "group B"]),
   denom = replace(denom, divisor %in% "group D",
     val[group == "group D"]))

-output

       group val               divisor denom
1  group A   5               group B     2
2  group A  10               group B     5
3  group A  12               group B     3
4  group B   2               group D     2
5  group B   5               group D     3
6  group B   3               group D    11
7  group C   1 need to be determined    NA
8  group C   3 need to be determined    NA
9  group C   5 need to be determined    NA
10 group D   2                 total    10
11 group D   3                 total    10
12 group D  11                 total    10

Or another option is to nest and match

library(dplyr)
library(tidyr)
df_org %>%
    dplyr::ungroup() %>%
    select(-denom) %>%
     rename(denom = val) %>% 
     nest(data = c(denom)) %>% 
     mutate(ind = match(divisor, group), 
       data = coalesce(data[ind], data), ind = NULL) %>% 
   unnest(data) %>% 
   mutate(denom = case_when(divisor %in% c("need to be determined", 
     "total" ) ~ df_org$denom, TRUE ~ denom), val = df_org$val, 
    .before = 2)

-output

# A tibble: 12 × 4
   group     val divisor               denom
   <chr>   <int> <chr>                 <int>
 1 group A     5 group B                   2
 2 group A    10 group B                   5
 3 group A    12 group B                   3
 4 group B     2 group D                   2
 5 group B     5 group D                   3
 6 group B     3 group D                  11
 7 group C     1 need to be determined    NA
 8 group C     3 need to be determined    NA
 9 group C     5 need to be determined    NA
10 group D     2 total                    10
11 group D     3 total                    10
12 group D    11 total                    10
  • Related