I would like to update the names based on two columns.
My example has 3 original columns:
df <- data.frame(name1 = c("a", "a", "a", "a", "a", NA, NA, NA),
                 name2 = c("b", "b", "b", "b", "c", NA, NA, NA),
                 name3 = c("b", "b", "b", "b", "c", "a", "a", "a"))
df
  name1 name2 name3
1     a     b     b
2     a     b     b
3     a     b     b
4     a     b     b
5     a     c     c
6  <NA>  <NA>     a
7  <NA>  <NA>     a
8  <NA>  <NA>     a
I would like to update column name3 (or even create a new column) so that if name1 == a and name2 == NA, then the a character in name3 is replaced by the b from column name2.
My desired output is something like:
  name1 name2 name3
1     a     b     b
2     a     b     b
3     a     b     b
4     a     b     b
5     a     c     c
6  <NA>  <NA>     b
7  <NA>  <NA>     b
8  <NA>  <NA>     b
So far, I am using this:
df %>% mutate(name3 = ifelse(name1 == "a" & is.na(name2), "b", name3))
but now NA appears in name3. Any suggestions for this?
CodePudding user response:
We can replace == with %in% to eliminate the NAs, because R evaluates NA %in% x to FALSE (as long as x itself contains no NA), but NA == x to NA:
df %>% mutate(name3 = ifelse(name1 %in% 'a' & is.na(name2), 'b', name3))
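To see the difference concretely, here is a minimal sketch on the question's data. The stringsAsFactors = FALSE argument is my addition so the code behaves the same on R versions before 4.0, and eq_test, in_test, res_eq and res_in are just illustrative names:

```r
# Toy data copied from the question; stringsAsFactors = FALSE is an added
# assumption so that name3 stays character on older R versions.
df <- data.frame(name1 = c("a", "a", "a", "a", "a", NA, NA, NA),
                 name2 = c("b", "b", "b", "b", "c", NA, NA, NA),
                 name3 = c("b", "b", "b", "b", "c", "a", "a", "a"),
                 stringsAsFactors = FALSE)

# `==` propagates NA, while `%in%` returns FALSE for an NA on the left.
eq_test <- df$name1 == "a"
in_test <- df$name1 %in% "a"

# With `==` the combined test is NA in rows 6-8, so ifelse() returns NA there.
res_eq <- ifelse(eq_test & is.na(df$name2), "b", df$name3)

# With `%in%` the combined test is FALSE in rows 6-8, so name3 is kept as is.
res_in <- ifelse(in_test & is.na(df$name2), "b", df$name3)
```

Note that with this row-wise %in% test, rows 6 to 8 keep their a, because name1 is NA there and the condition is FALSE. If those rows should become b as in the desired output, the condition on name1 cannot be required row by row.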
CodePudding user response:
Base R
df$name3 <- ifelse(any(df$name1 == "a") & is.na(df$name2), "b", df$name3)
dplyr
library(dplyr)
df %>%
  mutate(name3 = case_when(
    any(name1 == "a") & is.na(name2) ~ "b",
    TRUE ~ name3
  ))
#   name1 name2 name3
# 1     a     b     b
# 2     a     b     b
# 3     a     b     b
# 4     a     b     b
# 5     a     c     c
# 6  <NA>  <NA>     b
# 7  <NA>  <NA>     b
# 8  <NA>  <NA>     b
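Why this works, as far as I can tell: any() reduces the whole name1 comparison to a single TRUE, so the test effectively depends only on is.na(name2). A small sketch in base R (stringsAsFactors = FALSE and the helper names cond_any, test and res are my additions, not part of the question):

```r
# Toy data copied from the question; stringsAsFactors = FALSE is an added
# assumption so that name3 stays character on older R versions.
df <- data.frame(name1 = c("a", "a", "a", "a", "a", NA, NA, NA),
                 name2 = c("b", "b", "b", "b", "c", NA, NA, NA),
                 name3 = c("b", "b", "b", "b", "c", "a", "a", "a"),
                 stringsAsFactors = FALSE)

# any() returns TRUE as soon as one element is TRUE, even if others are NA,
# so the whole column collapses to a single TRUE here.
cond_any <- any(df$name1 == "a")

# The single TRUE is recycled against is.na(name2), so the combined test
# is TRUE exactly where name2 is NA.
test <- cond_any & is.na(df$name2)

res <- ifelse(test, "b", df$name3)
```

So the rule being applied is "some row has name1 equal to a, and this row's name2 is NA", which is a column-level check on name1 rather than a row-level one.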
CodePudding user response:
We could use a case_when or ifelse statement:
library(dplyr)
df %>%
  mutate(name3 = case_when(any(name1 %in% "a") &
                             is.na(name2) ~ "b",
                           TRUE ~ name3))
or:
df %>%
  mutate(name3 = ifelse(any(name1 %in% "a") &
                          is.na(name2), "b", name3))
  name1 name2 name3
1     a     b     b
2     a     b     b
3     a     b     b
4     a     b     b
5     a     c     c
6  <NA>  <NA>     b
7  <NA>  <NA>     b
8  <NA>  <NA>     b
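One reason to prefer %in% over == inside any() shows up when the column contains no "a" at all. A quick sketch with a made-up vector (no_a, any_eq and any_in are hypothetical names, not from the question):

```r
# Hypothetical column with no "a" in it, only another value and NAs.
no_a <- c("x", NA, NA)

# `==` gives FALSE for "x" and NA for the missing values; with no TRUE
# present, any() cannot rule out a match and returns NA.
any_eq <- any(no_a == "a")

# `%in%` gives FALSE for every element, so any() returns FALSE.
any_in <- any(no_a %in% "a")
```

With ifelse(), an NA test would put NA back into name3 wherever is.na(name2) is TRUE, whereas the %in% version leaves name3 untouched in that situation.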