Select columns that contain only values from an external list

Time:11-03

I want to filter a data frame so that I only keep columns whose values are all 1, 2, 3, 4, 5, or NA.

x = data.frame(col1 = c("a" , "b", "d", "e", "f", "g"), 
           col2 = c(12, 45, 235, 2134, NA, 1), 
           col3 = c(1, 2, 3, 1, 2, NA), 
           col4 = c(1, 2, 3, 4, 5, NA), 
           col5 = c(1, 2, 3, 4, 5, 6))

With this example data, I would like to return x with only col3 and col4.

CodePudding user response:

You can use the following solution:

library(dplyr)

x %>%
  select(where(function(x) all(x %in% c(1:5, NA))))

  col3 col4
1    1    1
2    2    2
3    3    3
4    1    4
5    2    5
6   NA   NA
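
To see why only col3 and col4 survive, the same check can be applied to each column by hand. This is a minimal base R sketch, not part of the original answer; the names allowed and keep are introduced here purely for illustration:

```r
# Rebuild the example data from the question
x <- data.frame(col1 = c("a", "b", "d", "e", "f", "g"),
                col2 = c(12, 45, 235, 2134, NA, 1),
                col3 = c(1, 2, 3, 1, 2, NA),
                col4 = c(1, 2, 3, 4, 5, NA),
                col5 = c(1, 2, 3, 4, 5, 6))

# Illustrative name: the set of permitted values, including NA
allowed <- c(1:5, NA)

# For each column, check whether every value is in the allowed set
# (%in% treats NA as a matchable value, so NA entries pass the check)
keep <- vapply(x, function(col) all(col %in% allowed), logical(1))

# Keep only the columns that passed the check
x[, keep, drop = FALSE]
```

Here keep is a named logical vector with one entry per column, so printing it shows which columns fail the test and which pass.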

Or using a formula:

x %>%
  select(where(~ all(.x %in% c(1:5, NA))))

Since the discussion just heated up on this: if you would like to see how a formula created with ~ (the tilde, sometimes pronounced "twiddle") is turned into a function, wrap it inside purrr::as_mapper. The conversion is done by the tidyverse packages rather than by base R; as_mapper is the function purrr calls behind the scenes when you use this syntax for an anonymous function:

purrr::as_mapper(~ all(.x %in% c(1:5, NA)))

<lambda>
function (..., .x = ..1, .y = ..2, . = ..1) 
all(.x %in% c(1:5, NA))
attr(,"class")
[1] "rlang_lambda_function" "function" 

Here the .x argument is equivalent to the first argument of our anonymous function.
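
As a quick check of that claim, the converted function can be called directly on a plain vector. This is a small sketch assuming purrr is installed; the name in_allowed is made up for the example:

```r
library(purrr)

# Convert the formula into a function, as in the answer above
in_allowed <- as_mapper(~ all(.x %in% c(1:5, NA)))

# Whatever is passed as the first argument is what .x refers to
in_allowed(c(1, 2, 3, NA))   # TRUE: every value is in the allowed set
in_allowed(c(1, 2, 3, 6))    # FALSE: 6 is not in the allowed set
```

Inside select(where(...)), each column of the data frame takes the place of that first argument in turn.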
