You have a dataframe that you're analysing, and it looks like three or more of its columns are identical. But how can you tell? I face this problem frequently and haven't found a fast tidyverse solution for checking more than two columns.
If you are comparing two columns, you can use:
mutate(is_equal = col_1 == col_2)
But you cannot do:
mutate(is_equal = col_1 == col_2 == col_3)
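For context: the chained form fails because R's comparison operators are non-associative, so the parser rejects `a == b == c`. A minimal sketch of the pairwise workaround follows, using a small made-up tibble (`toy` and its values are illustrative and are not the reprex). It works, but it gets unwieldy as the number of columns grows.

```r
library(dplyr)

# Hypothetical toy data, not the reprex
toy <- tibble(
  col_1 = c(1, 2, 3),
  col_2 = c(1, 2, 4),
  col_3 = c(1, 5, 3)
)

# `col_1 == col_2 == col_3` is a syntax error in R,
# so combine pairwise comparisons with `&` instead
res <- toy %>%
  mutate(is_equal = col_1 == col_2 & col_2 == col_3)

res$is_equal
```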
Reprex:
df <- structure(list(col_1 = c(109, 109, 109, 109, 109, 109, 109, 109,
109, 109, 109, 109, 77, 77, 78, 77), col_2 = c(109, 109, 109,
109, 109, 109, 109, 109, 109, 109, 109, 109, 77, 77, 77, 77),
col_3 = c(109, 109, 109, 109, 109, 109, 109, 109, 109, 109,
109, 109, 77, 77, 77, 77)), row.names = c(NA, -16L), class = c("tbl_df",
"tbl", "data.frame"))
CodePudding user response:
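We could use if_all() (available in dplyr 1.0.4 and later):
library(dplyr)
df %>%
  # compare each of col_2 and col_3 against col_1; TRUE only if every comparison holds
  mutate(is_equal = if_all(col_2:col_3, ~ .x == col_1))
Output: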
# A tibble: 16 × 4
   col_1 col_2 col_3 is_equal
   <dbl> <dbl> <dbl> <lgl>
 1   109   109   109 TRUE
 2   109   109   109 TRUE
 3   109   109   109 TRUE
 4   109   109   109 TRUE
 5   109   109   109 TRUE
 6   109   109   109 TRUE
 7   109   109   109 TRUE
 8   109   109   109 TRUE
 9   109   109   109 TRUE
10   109   109   109 TRUE
11   109   109   109 TRUE
12   109   109   109 TRUE
13    77    77    77 TRUE
14    77    77    77 TRUE
15    78    77    77 FALSE
16    77    77    77 TRUE
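The if_all() approach above extends to any number of columns, because the column selection is ordinary tidyselect. As a rough sketch, assuming a made-up four-column tibble (`toy` is hypothetical), every column except `col_1` can be compared against `col_1` like this:

```r
library(dplyr)

# Hypothetical toy data with four columns
toy <- tibble(
  col_1 = c(1, 2, 3),
  col_2 = c(1, 2, 3),
  col_3 = c(1, 2, 0),
  col_4 = c(1, 9, 3)
)

# `-col_1` selects every column except col_1;
# each selected column is compared element-wise with col_1
res <- toy %>%
  mutate(is_equal = if_all(-col_1, ~ .x == col_1))

res$is_equal
```

Since this stays vectorised (no rowwise()), it should scale better on large data frames, though that is not benchmarked here.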
CodePudding user response:
We can use the combination of rowwise() and c_across() to check the columns.
library(dplyr)
df %>%
  rowwise() %>%
  # or mutate(is_equal = length(unique(c_across(everything()))) == 1)
  mutate(is_equal = n_distinct(c_across(everything())) == 1) %>%
  ungroup()
# A tibble: 16 × 4
   col_1 col_2 col_3 is_equal
   <dbl> <dbl> <dbl> <lgl>
 1   109   109   109 TRUE
 2   109   109   109 TRUE
 3   109   109   109 TRUE
 4   109   109   109 TRUE
 5   109   109   109 TRUE
 6   109   109   109 TRUE
 7   109   109   109 TRUE
 8   109   109   109 TRUE
 9   109   109   109 TRUE
10   109   109   109 TRUE
11   109   109   109 TRUE
12   109   109   109 TRUE
13    77    77    77 TRUE
14    77    77    77 TRUE
15    78    77    77 FALSE
16    77    77    77 TRUE
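One thing to watch with c_across(everything()) is that it takes every column in the row, which is fine for the reprex but not when the data frame also holds columns that are not part of the comparison. A small sketch, assuming a made-up tibble with an extra `id` column (both `toy` and `id` are hypothetical), restricts the check with a tidyselect helper:

```r
library(dplyr)

# Hypothetical toy data with an identifier column that should be ignored
toy <- tibble(
  id    = c("a", "b"),
  col_1 = c(5, 5),
  col_2 = c(5, 6),
  col_3 = c(5, 5)
)

res <- toy %>%
  rowwise() %>%
  # only columns whose names start with "col_" take part in the comparison
  mutate(is_equal = n_distinct(c_across(starts_with("col_"))) == 1) %>%
  ungroup()

res$is_equal
```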
CodePudding user response:
You can use var() to test equality among multiple elements: if var(x) == 0, then all elements are equal. This only works for numeric columns.
Another option is to use all().
library(dplyr)
df %>%
  rowwise() %>%
  mutate(equal = var(c_across(col_1:col_3)) == 0,
         equal2 = all(c_across(col_1:col_3) == col_1))
# A tibble: 16 × 5
# Rowwise:
   col_1 col_2 col_3 equal equal2
   <dbl> <dbl> <dbl> <lgl> <lgl>
 1   109   109   109 TRUE  TRUE
 2   109   109   109 TRUE  TRUE
 3   109   109   109 TRUE  TRUE
 4   109   109   109 TRUE  TRUE
 5   109   109   109 TRUE  TRUE
 6   109   109   109 TRUE  TRUE
 7   109   109   109 TRUE  TRUE
 8   109   109   109 TRUE  TRUE
 9   109   109   109 TRUE  TRUE
10   109   109   109 TRUE  TRUE
11   109   109   109 TRUE  TRUE
12   109   109   109 TRUE  TRUE
13    77    77    77 TRUE  TRUE
14    77    77    77 TRUE  TRUE
15    78    77    77 FALSE FALSE
16    77    77    77 TRUE  TRUE
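Both tests above are exact comparisons, which may be too strict for columns that hold computed floating-point values. As a rough alternative (a different technique from var(), named plainly here), one could check that the row-wise range falls below a tolerance. In this sketch `toy`, `spread`, `equal_tol`, and the threshold `tol` are all assumptions chosen for illustration:

```r
library(dplyr)

# Hypothetical toy data; 0.1 + 0.2 is not exactly 0.3 in floating point
toy <- tibble(
  col_1 = c(0.3, 1, 2),
  col_2 = c(0.1 + 0.2, 1, 2),
  col_3 = c(0.3, 1, 2.5)
)

tol <- 1e-8  # assumed tolerance; pick one that suits your data

res <- toy %>%
  mutate(
    # row-wise max minus row-wise min, computed without rowwise()
    spread    = pmax(col_1, col_2, col_3) - pmin(col_1, col_2, col_3),
    equal_tol = spread < tol
  )

res$equal_tol
```

pmax() and pmin() are vectorised across rows, so this avoids the per-row overhead of rowwise(), at the cost of listing the columns explicitly.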