Predicate function to check that values across >3 columns are all equal

Time:09-03

You have a data frame that you're analysing, and it looks like three or more of its columns are identical. But how can you tell? This is a problem I face frequently, and I haven't found a fast tidyverse solution for checking more than two columns.

If you are comparing two columns, you can use: mutate(is_equal = col_1 == col_2).

But you cannot chain the comparison: mutate(is_equal = col_1 == col_2 == col_3) fails to parse, because R's comparison operators are non-associative.
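For a fixed, small number of columns, one workaround is to join pairwise comparisons with `&`. A minimal sketch, assuming dplyr is installed; `toy` is a hypothetical three-row tibble, not the reprex data:

```r
library(dplyr)

# Hypothetical three-row example; column names follow the question
toy <- tibble(
  col_1 = c(109, 77, 78),
  col_2 = c(109, 77, 77),
  col_3 = c(109, 77, 77)
)

# Pairwise comparisons combined with & (vectorised logical AND)
res <- toy %>%
  mutate(is_equal = col_1 == col_2 & col_2 == col_3)

res$is_equal
```

This gets verbose quickly as more columns are added, which is the motivation for a more general approach.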

Reprex:

df <- structure(list(col_1 = c(109, 109, 109, 109, 109, 109, 109, 109, 
109, 109, 109, 109, 77, 77, 78, 77), col_2 = c(109, 109, 109, 
109, 109, 109, 109, 109, 109, 109, 109, 109, 77, 77, 77, 77), 
    col_3 = c(109, 109, 109, 109, 109, 109, 109, 109, 109, 109, 
    109, 109, 77, 77, 77, 77)), row.names = c(NA, -16L), class = c("tbl_df", 
"tbl", "data.frame"))

CodePudding user response:

We could use if_all(), comparing each of the remaining columns against col_1:

library(dplyr)
df %>% 
   mutate(is_equal = if_all(col_2:col_3, `==`, col_1))

Output:

# A tibble: 16 × 4
   col_1 col_2 col_3 is_equal
   <dbl> <dbl> <dbl> <lgl>   
 1   109   109   109 TRUE    
 2   109   109   109 TRUE    
 3   109   109   109 TRUE    
 4   109   109   109 TRUE    
 5   109   109   109 TRUE    
 6   109   109   109 TRUE    
 7   109   109   109 TRUE    
 8   109   109   109 TRUE    
 9   109   109   109 TRUE    
10   109   109   109 TRUE    
11   109   109   109 TRUE    
12   109   109   109 TRUE    
13    77    77    77 TRUE    
14    77    77    77 TRUE    
15    78    77    77 FALSE   
16    77    77    77 TRUE    
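As I understand it, dplyr 1.1.0 deprecated passing extra arguments through `...` in across() and its relatives, so on recent versions the call above may emit a deprecation warning. A sketch of the same check written with an anonymous function instead; `toy` is a hypothetical three-row tibble used only for illustration:

```r
library(dplyr)

# Hypothetical three-row example; column names follow the question
toy <- tibble(
  col_1 = c(109, 77, 78),
  col_2 = c(109, 77, 77),
  col_3 = c(109, 77, 77)
)

# .x stands for each selected column in turn; col_1 is looked up in the data
res <- toy %>%
  mutate(is_equal = if_all(col_2:col_3, ~ .x == col_1))

res$is_equal
```

Because if_all() works on whole columns, this stays vectorised and does not need rowwise().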

CodePudding user response:

We can use the combination of rowwise() and c_across() to check the columns.

library(dplyr)

df %>% 
  rowwise() %>% 
  # or mutate(is_equal = length(unique(c_across(everything()))) == 1)
  mutate(is_equal = n_distinct(c_across(everything())) == 1) %>% 
  ungroup()

# A tibble: 16 × 4
   col_1 col_2 col_3 is_equal
   <dbl> <dbl> <dbl> <lgl>   
 1   109   109   109 TRUE    
 2   109   109   109 TRUE    
 3   109   109   109 TRUE    
 4   109   109   109 TRUE    
 5   109   109   109 TRUE    
 6   109   109   109 TRUE    
 7   109   109   109 TRUE    
 8   109   109   109 TRUE    
 9   109   109   109 TRUE    
10   109   109   109 TRUE    
11   109   109   109 TRUE    
12   109   109   109 TRUE    
13    77    77    77 TRUE    
14    77    77    77 TRUE    
15    78    77    77 FALSE   
16    77    77    77 TRUE    
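rowwise() evaluates the expression once per row, which can be slow on large data frames. If speed matters, a vectorised check in base R is one alternative. The helper name `rows_all_equal` and the `toy` data below are hypothetical, and this is a swapped-in base R technique rather than the dplyr approach above:

```r
# Hypothetical helper: TRUE for each row whose values in `cols`
# all match the first of those columns.
rows_all_equal <- function(data, cols) {
  m <- as.matrix(data[cols])
  # m[, 1] is recycled down each column, so every column is compared
  # element-wise with the first one
  rowSums(m == m[, 1]) == ncol(m)
}

toy <- data.frame(
  col_1 = c(109, 77, 78),
  col_2 = c(109, 77, 77),
  col_3 = c(109, 77, 77)
)

toy$is_equal <- rows_all_equal(toy, c("col_1", "col_2", "col_3"))
toy$is_equal
```

Rows containing NA come back as NA rather than TRUE or FALSE, so missing values need separate handling if they can occur.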

CodePudding user response:

You can use var() to test equality among multiple numeric elements (if var(x) == 0, then all elements are equal).

Another option is to use all(), comparing every column against the first one.

library(dplyr)

df %>% 
  rowwise() %>% 
  mutate(equal = var(c_across(col_1:col_3)) == 0,
         equal2 = all(c_across(col_1:col_3) == col_1))

# A tibble: 16 × 5
# Rowwise: 
   col_1 col_2 col_3 equal equal2
   <dbl> <dbl> <dbl> <lgl> <lgl> 
 1   109   109   109 TRUE  TRUE  
 2   109   109   109 TRUE  TRUE  
 3   109   109   109 TRUE  TRUE  
 4   109   109   109 TRUE  TRUE  
 5   109   109   109 TRUE  TRUE  
 6   109   109   109 TRUE  TRUE  
 7   109   109   109 TRUE  TRUE  
 8   109   109   109 TRUE  TRUE  
 9   109   109   109 TRUE  TRUE  
10   109   109   109 TRUE  TRUE  
11   109   109   109 TRUE  TRUE  
12   109   109   109 TRUE  TRUE  
13    77    77    77 TRUE  TRUE  
14    77    77    77 TRUE  TRUE  
15    78    77    77 FALSE FALSE 
16    77    77    77 TRUE  TRUE  
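Two caveats with the var() check seem worth a sketch: var() returns NA when a row contains a missing value, and exact comparison can fail for values that differ only by floating-point error. The vectors and the 1e-8 tolerance below are hypothetical choices for illustration:

```r
# A missing value propagates through var()
x_na <- c(109, NA, 109)
var(x_na)                       # NA, so var(x_na) == 0 is NA too
var(x_na, na.rm = TRUE) == 0    # TRUE once the NA is dropped

# Values that look equal but differ by floating-point error
x_fp <- c(0.1 + 0.2, 0.3, 0.3)
x_fp[1] == x_fp[2]              # FALSE
diff(range(x_fp)) < 1e-8        # TRUE: equal within the chosen tolerance
```

For integer-valued data like the reprex neither issue arises, but for computed doubles a tolerance-based comparison is usually the safer choice.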