Subsetting a column where all values are the same character value in R

Time:02-18

I am trying to identify the columns of a data frame in which every value is the same character string, "tree".

Here is an example dataset.

df <- data.frame(id = c(1,2,3,4,5),
                 var.1 = c(5,6,7,"tree",4),
                 var.2 = c("tree","tree","tree","tree","tree"),
                 var.3 = c(4,5,8,9,1))

> df
  id var.1 var.2 var.3
1  1     5  tree     4
2  2     6  tree     5
3  3     7  tree     8
4  4  tree  tree     9
5  5     4  tree     1

I would like to flag var.2, since every value in it is "tree".

> flagged
[1] "var.2"

Any ideas? Thanks!

CodePudding user response:

Using dplyr, you could do

library(dplyr)

flagged <- df %>%
  select(where(~n_distinct(.x) == 1 && unique(.x) == "tree")) %>%
  names()

This selects every column that has exactly one distinct value, where that value is "tree", and then extracts the column names.
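If you would rather avoid the dplyr dependency, the same predicate can be sketched in base R. This is only an illustration: the helper name is_only_tree is made up here, and the isTRUE() wrapper is an added assumption so that an all-NA column yields FALSE instead of NA. Because && short-circuits, the comparison against "tree" is only evaluated when the column really has a single distinct value.

```r
# Example data from the question
df <- data.frame(id = c(1, 2, 3, 4, 5),
                 var.1 = c(5, 6, 7, "tree", 4),
                 var.2 = c("tree", "tree", "tree", "tree", "tree"),
                 var.3 = c(4, 5, 8, 9, 1))

# Hypothetical helper (not from the answer above): TRUE only when the
# column has exactly one distinct value and that value is "tree".
is_only_tree <- function(x) {
  vals <- unique(x)
  # && stops at the first FALSE, so vals == "tree" is never evaluated
  # on a vector with more than one element.
  isTRUE(length(vals) == 1 && vals == "tree")
}

# vapply() guarantees one logical per column
flagged <- names(df)[vapply(df, is_only_tree, logical(1))]
flagged
```

On the example data this should leave only var.2 in flagged.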

CodePudding user response:

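For each column, check whether all elements equal the first element. Note that this flags any constant column, not only one whose value is "tree"; in the example data, var.2 is the only such column.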

df <- data.frame(id = c(1,2,3,4,5),
                 var.1 = c(5,6,7,"tree",4),
                 var.2 = c("tree","tree","tree","tree","tree"),
                 var.3 = c(4,5,8,9,1))


names(df)[sapply(df, function(x) all(x == x[1]))]
#> [1] "var.2"

Created on 2022-02-17 by the reprex package (v2.0.1)
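The first-element check and a check against a fixed value can give different results once the data contains another constant column. The sketch below illustrates this; the extra column var.4 and the helper flag_constant are hypothetical additions for demonstration, not part of the original question or answers. The helper also treats a column containing NA as not matching, which is an assumption you may want to change.

```r
# Example data from the question, plus a hypothetical constant numeric column
df2 <- data.frame(id = c(1, 2, 3, 4, 5),
                  var.1 = c(5, 6, 7, "tree", 4),
                  var.2 = c("tree", "tree", "tree", "tree", "tree"),
                  var.3 = c(4, 5, 8, 9, 1),
                  var.4 = c(7, 7, 7, 7, 7))

# First-element check: flags every constant column
any_constant <- names(df2)[sapply(df2, function(x) all(x == x[1]))]

# Hypothetical helper: flags only columns where every value equals `value`.
# Columns containing NA are treated as non-matching (an assumption).
flag_constant <- function(data, value) {
  hit <- vapply(data,
                function(x) !anyNA(x) && all(x == value),
                logical(1))
  names(data)[hit]
}

only_tree <- flag_constant(df2, "tree")

any_constant
only_tree
```

Here any_constant should contain both var.2 and var.4, while only_tree should contain var.2 alone.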
