How can I keep columns that contain a certain string value within their body?

Time:11-23

I've been trying to figure out how to subset a data frame so that it keeps only the columns in which a specific string appears in at least one cell, and drops all the others. In this case, I'm searching for 'other' and would like to go from this:

A B C D
other one two three
one other two three
two three one other

to this:

A B D
other one three
one other three
two three other

I know how to select columns by name, but not by what their cells contain. Is there a neat way of doing this?

CodePudding user response:

Using tidyverse you can do:

library(tidyverse)

d <- read.table(text = "A   B   C   D
other   one two three
one other   two three
two three   one other", header = TRUE)

# Keep only the columns in which at least one cell equals "other"
d %>%
  select_if(~any(.x == "other"))

# A     B     D
# 1 other   one three
# 2   one other three
# 3   two three other
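
As a side note, select_if() has been superseded since dplyr 1.0.0 in favour of select(where(...)). The sketch below shows the same column filter in that style. The na.rm = TRUE argument is an addition not present in the answer above: where() expects the predicate to return TRUE or FALSE, so a column that contains NA and no match would otherwise be a problem. The result name kept is only illustrative.

```r
library(dplyr)

d <- read.table(text = "A B C D
other one two three
one other two three
two three one other", header = TRUE)

# where() applies the predicate to each column and keeps those returning TRUE.
# na.rm = TRUE is an added safeguard: a column with NA and no match gives FALSE, not NA.
kept <- d %>%
  select(where(~ any(.x == "other", na.rm = TRUE)))

names(kept)
#> [1] "A" "B" "D"
```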

CodePudding user response:

An alternative using map_lgl from the purrr package.

map_lgl loops over every column, applying the lambda function passed as a formula: ~ any(. == 'other').

~ is purrr's shorthand for an anonymous function whose argument is .x, and . is an alias for .x.

The output will be a logical vector that we can use to subset df.
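
To make that intermediate logical vector concrete, here is a small sketch that writes the predicate both ways, as a formula and as an explicit function, and checks that they give the same result. The names keep_formula and keep_function are only illustrative.

```r
library(purrr)

df <- read.table(text = "A B C D
other one two three
one other two three
two three one other", header = TRUE)

# Formula shorthand, as used in this answer.
keep_formula <- map_lgl(df, ~ any(. == "other"))

# The same predicate written as an explicit anonymous function.
keep_function <- map_lgl(df, function(.x) any(.x == "other"))

# One TRUE/FALSE per column, named after the columns.
keep_formula
#>     A     B     C     D
#>  TRUE  TRUE FALSE  TRUE

identical(keep_formula, keep_function)
#> [1] TRUE
```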

library(tidyverse)

df <- read.table(text = "A   B   C   D
other   one two three
one other   two three
two three   one other", header = TRUE)

# drop = FALSE keeps the result a data frame even if only one column matches
df[, map_lgl(df, ~ any(. == 'other')), drop = FALSE]
#>       A     B     D
#> 1 other   one three
#> 2   one other three
#> 3   two three other

Created on 2021-11-22 by the reprex package (v2.0.1)
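
If you would rather not load any packages, the same idea can be sketched in base R with vapply(). The second half is an assumption about what "contain" might mean in the question: it uses grepl() to match 'other' as a substring rather than as the whole cell value. The names has_other, has_substr, exact and partial are only illustrative.

```r
df <- read.table(text = "A B C D
other one two three
one other two three
two three one other", header = TRUE)

# Exact match: TRUE for each column with at least one cell equal to "other".
has_other <- vapply(df, function(col) any(col == "other", na.rm = TRUE), logical(1))
exact <- df[, has_other, drop = FALSE]

# Substring match: fixed = TRUE treats "other" as a literal string, not a regex,
# so a cell such as "another" would also count as a hit.
has_substr <- vapply(df, function(col) any(grepl("other", col, fixed = TRUE)), logical(1))
partial <- df[, has_substr, drop = FALSE]

names(exact)
#> [1] "A" "B" "D"
names(partial)
#> [1] "A" "B" "D"
```

On this data both variants keep the same columns; they would differ only if some cell contained 'other' as part of a longer string.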
