How can I keep columns that contain a certain string value within their body?

I've been trying to figure out how to subset a data frame so that it keeps only the columns in which a specific string appears in at least one cell, and drops all the others. In this case, I'm searching for 'other' and would like to go from this:

A	B	C	D
other	one	two	three
one	other	two	three
two	three	one	other

to this:

A	B	D
other	one	three
one	other	three
two	three	other

I know how to select columns by name, but not by what their cells contain. Is there a neat way of doing this?

CodePudding user response：

Using tidyverse you can do:

library(tidyverse)

d <- read.table(text = "A   B   C   D
other   one two three
one other   two three
two three   one other", header = TRUE)

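# where() applies the predicate to each column and keeps those returning TRUE;
# it replaces select_if(), which is superseded as of dplyr 1.0.0
d %>%
  select(where(~ any(.x == "other")))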

#       A     B     D
# 1 other   one three
# 2   one other three
# 3   two three other
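
The comparison with == only matches cells that are exactly "other". If the goal is to keep columns where the string appears anywhere inside a cell, a substring test is needed instead. The following is a minimal base R sketch of that difference, not part of the original answer; the data frame d2 and its "another" cell are hypothetical values made up for illustration.

```r
# Hypothetical data: column C holds "another", which contains "other" as a substring
d2 <- data.frame(
  A = c("other", "one", "two"),
  B = c("one", "two", "three"),
  C = c("another", "two", "one"),
  stringsAsFactors = FALSE
)

# Exact match: keep columns with at least one cell equal to "other"
exact <- Filter(function(col) any(col == "other"), d2)

# Substring match: keep columns with at least one cell containing "other";
# fixed = TRUE treats the pattern as a literal string, not a regular expression
partial <- Filter(function(col) any(grepl("other", col, fixed = TRUE)), d2)

names(exact)    # "A"
names(partial)  # "A" "C"
```

Filter() applied to a data frame tests each column in turn and returns a data frame of the columns that pass. grepl() returns FALSE for missing values, so the substring version also tolerates NA cells.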

CodePudding user response：

An alternative using map_lgl from the purrr package.

map_lgl loops through every column, applying the lambda function passed as the formula ~ any(. == 'other').

In purrr, ~ is shorthand for an anonymous function of one argument, function(.x), and . is another name for .x.

The output will be a logical vector that we can use to subset df.

library(tidyverse)

df <- read.table(text = "A   B   C   D
other   one two three
one other   two three
two three   one other", header = TRUE)

# drop = FALSE keeps the result a data frame even if only one column matches
df[, map_lgl(df, ~ any(. == 'other')), drop = FALSE]
#>       A     B     D
#> 1 other   one three
#> 2   one other three
#> 3   two three other

Created on 2021-11-22 by the reprex package (v2.0.1)
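
To make the intermediate logical vector visible, here is a small sketch that computes it as a separate step. It uses base R's vapply as a stand-in for map_lgl, so it runs without purrr; the variable names keep and result are illustrative only and do not come from the answer.

```r
df <- read.table(text = "A B C D
other one two three
one other two three
two three one other", header = TRUE)

# One TRUE/FALSE per column: does any cell equal "other"?
# logical(1) makes vapply fail loudly if the function returns anything else
keep <- vapply(df, function(col) any(col == "other"), logical(1))
keep
# A is TRUE, B is TRUE, C is FALSE, D is TRUE

# Use the logical vector to pick columns; drop = FALSE always returns a data frame
result <- df[, keep, drop = FALSE]
names(result)  # "A" "B" "D"
```

Splitting the test from the subsetting makes it easy to inspect which columns matched before dropping anything.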