I am using the dplyr package in R to test equality of two columns using the code below. The results work well except for missing values where neither TRUE nor FALSE is returned
mutate(check = if_else(only == count, TRUE, FALSE))
Any ideas on how I can tweak this syntax?
Thanks in advance!
CodePudding user response:
Is this what you are looking for?
library(dplyr)
dat <- data.frame(only = c(1, NA, 2, 3, NA),
count = c(1, NA, 3, 2, 1))
dat %>%
mutate(check = if_else(only == count | is.na(only) & is.na(count),
TRUE, FALSE, missing = FALSE))
only count check
1 1 1 TRUE
2 NA NA TRUE
3 2 3 FALSE
4 3 2 FALSE
5 NA 1 FALSE
CodePudding user response:
mutate(check = if_else(only == count |
is.na(only) == is.na(count), TRUE, FALSE))
CodePudding user response:
You can try case_when
. The following code means if not col1 == col2
, which includes col1 != col2
or columns are NA
, it will return FALSE
.
# Example data frame
dat <- data.frame(
col1 = c(1, 2, 3, 4, 5),
col2 = c(1, 2, NA, 4, 6)
)
library(dplyr)
dat2 <- dat %>%
mutate(check = case_when(
col1 == col2 ~TRUE,
TRUE ~FALSE
))
print(dat2)
# col1 col2 check
# 1 1 1 TRUE
# 2 2 2 TRUE
# 3 3 NA FALSE
# 4 4 4 TRUE
# 5 5 6 FALSE