Find unique values in a column minus values that are in vector

Time:07-05

I'd like to find the unique values of a column, excluding any values that appear in specified vectors. In the example data below, I'd like the unique values from the column all_areas minus the values in the vectors area1 and area2, i.e. the result should be "town", "city", "village".

set.seed(1)
area_df = data.frame(all_areas = sample(rep(c("foo", "bar", "big", "small", "town", "city", "village"),5),20),
                    number =  sample(1:100, 20))

area1 = c("foo", "bar")
area2 = c("big", "small")

CodePudding user response:

You could use the function setdiff to find the set difference between all_areas and the combination of area1 and area2:

setdiff(area_df$all_areas, c(area1, area2))

[1] "city" "village" "town"   
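As a rough illustration of why setdiff is enough on its own, the sketch below compares it with a spelled-out %in% filter followed by unique. The vector all_areas here is a small hand-made example (an assumption for illustration, not the sampled data from the question), so the order of the result differs from the output above.

```r
# Hypothetical hand-made vector, not the question's sampled data
all_areas <- c("town", "foo", "city", "town", "big", "village", "city")
area1 <- c("foo", "bar")
area2 <- c("big", "small")

# setdiff() removes every value found in the second argument
# and also drops duplicates from the result
res_setdiff <- setdiff(all_areas, c(area1, area2))

# Spelled-out equivalent: filter with %in%, then de-duplicate
res_manual <- unique(all_areas[!all_areas %in% c(area1, area2)])

print(res_setdiff)                        # [1] "town"    "city"    "village"
print(identical(res_setdiff, res_manual)) # [1] TRUE
```

Values come back in order of first appearance in the first argument, so no separate call to unique is needed.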

CodePudding user response:

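We may use %in% to create a logical vector, negate it with ! to keep the rows whose 'all_areas' value is in neither vector, and then return the unique rows with unique. Note that this de-duplicates whole rows (all_areas together with number), so the same area can appear more than once: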

unique(subset(area_df, !all_areas %in% c(area1, area2)))

-output

   all_areas number
5    village     44
7       city     33
8       town     84
9       city     35
10   village     70
11      town     74
16   village     87
19      town     40
20   village     93
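If only the distinct area names are wanted, as in the question, one option is to apply unique to the filtered column rather than to the whole data frame. The sketch below uses a small hand-made data frame (hypothetical values, not the sampled data above), and the names kept and unique_areas are made up for illustration.

```r
# Hypothetical hand-made data, not the question's sampled data
area_df <- data.frame(
  all_areas = c("town", "foo", "city", "town", "big", "village"),
  number = c(1, 2, 3, 4, 5, 6),
  stringsAsFactors = FALSE  # keep all_areas as character on older R versions
)
area1 <- c("foo", "bar")
area2 <- c("big", "small")

# Keep rows whose area is in neither vector
kept <- subset(area_df, !all_areas %in% c(area1, area2))

# De-duplicate the column itself instead of whole rows
unique_areas <- unique(kept$all_areas)

print(unique_areas)  # [1] "town"    "city"    "village"
```

Here kept still has four rows, because the two "town" rows differ in number, while unique_areas holds each name once.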

CodePudding user response:

With a dplyr approach:

library(dplyr)

area_df %>% 
  filter(!all_areas %in% c(area1, area2)) %>% 
  distinct()

#>   all_areas number
#> 1   village     44
#> 2      city     33
#> 3      town     84
#> 4      city     35
#> 5   village     70
#> 6      town     74
#> 7   village     87
#> 8      town     40
#> 9   village     93
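As with the base R version, distinct with no arguments de-duplicates whole rows. A possible variation that returns just the area names is to pass the column to distinct and then extract it with pull. This is a sketch on a small hand-made data frame (hypothetical values, not the sampled data above), and the name unique_areas is made up for illustration.

```r
library(dplyr)

# Hypothetical hand-made data, not the question's sampled data
area_df <- data.frame(
  all_areas = c("town", "foo", "city", "town", "big", "village"),
  number = c(1, 2, 3, 4, 5, 6),
  stringsAsFactors = FALSE
)
area1 <- c("foo", "bar")
area2 <- c("big", "small")

unique_areas <- area_df %>%
  filter(!all_areas %in% c(area1, area2)) %>%  # drop unwanted areas
  distinct(all_areas) %>%                      # one row per area name
  pull(all_areas)                              # extract as a plain vector

print(unique_areas)  # [1] "town"    "city"    "village"
```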