I am trying to create a new column based on specific values in several (four) other columns.
Reprex:
col1 <- c("a", "", "a")
col2 <- c("", "b", "b")
df <- data.frame(col1, col2)
I have tried with case_when, but cannot get it to apply to all values.
df$col3 <- with(df, case_when(col1 %in% c("a") & col2 %in% c("b") ~ "c" ))
What I want to achieve:
| col1| col2 | col3 |
|:--: | :-------:|:-------:|
| a | | a |
| | b | b |
| a | b | c |
So if col1 is a, col2 is na then col 3 is a. If col1 is na, col2 b then col3 should be b and finally if col1 is a, col2 b then col3 should be c
CodePudding user response:
Here is how we could do it with case_when
, first just replace blank cells with NA
library(dplyr)
df %>%
mutate(across(starts_with("col"), na_if,""),
col3 = case_when(
col1=="a" & is.na(col2) ~ "a",
is.na(col1) & col2 == "b" ~ "b",
col1 == "a" & col2 == "b" ~ "c",
TRUE ~ NA_character_))
col1 col2 col3
1 a <NA> a
2 <NA> b b
3 a b c
CodePudding user response:
The values are blank and not NA
library(dplyr)
df %>%
mutate(col3 = case_when(col1=='a' & col2 == 'b' ~ "c",
TRUE ~ coalesce(na_if(col1, ""), na_if(col2, ""))))
-output
col1 col2 col3
1 a a
2 b b
3 a b c
Or use unite
and then replace
library(tidyr)
df %>%
unite(col3, col1, col2, sep="", remove = FALSE) %>%
mutate(col3 = replace(col3, col3 == "ab", "c")) %>%
relocate(col3, .after = last_col())
col1 col2 col3
1 a a
2 b b
3 a b c