I'd like to find the unique values of a column, excluding any values that appear in specified vectors. In the example data below, I'd like to find the unique values from the column all_areas minus the values in the vectors area1 and area2, i.e. the result should be "town", "city", "village".
set.seed(1)
area_df = data.frame(all_areas = sample(rep(c("foo", "bar", "big", "small", "town", "city", "village"),5),20),
number = sample(1:100, 20))
area1 = c("foo", "bar")
area2 = c("big", "small")
CodePudding user response:
You could use the function setdiff to find the set difference between all_areas and the combined values of area1 and area2:
setdiff(area_df$all_areas, c(area1, area2))
[1] "city" "village" "town"
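As a rough illustration of why this works, here is a minimal sketch on a small hand-written vector (the vector areas is hypothetical and is not the sampled data from the question). It relies on setdiff treating its first argument as a set, so duplicates are dropped and the remaining values keep their order of first appearance.

```r
# Hypothetical, hand-written data so the result does not depend on sampling
areas <- c("town", "foo", "city", "town", "big", "village", "city")
area1 <- c("foo", "bar")
area2 <- c("big", "small")

# setdiff() removes every value found in the second argument
# and returns each remaining value only once
remaining <- setdiff(areas, c(area1, area2))
print(remaining)
```

Because the duplicates are already removed, no separate call to unique is needed here.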
CodePudding user response:
We may use %in% to create a logical vector, negate it (!) to subset the rows whose 'all_areas' value is not in either vector, and then return the unique rows with unique:
unique(subset(area_df, !all_areas %in% c(area1, area2)))
-output
   all_areas number
5    village     44
7       city     33
8       town     84
9       city     35
10   village     70
11      town     74
16   village     87
19      town     40
20   village     93
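The output above contains whole rows, so an area can appear more than once when its number differs. If only the distinct area names are wanted, the same %in% idea can be applied to the column itself. This is a minimal sketch on a small hypothetical data frame, not the sampled data from the question:

```r
# Hypothetical, hand-written data so the result does not depend on sampling
area_df <- data.frame(
  all_areas = c("town", "foo", "city", "town", "big", "village"),
  number = c(1, 2, 3, 4, 5, 6),
  stringsAsFactors = FALSE
)
area1 <- c("foo", "bar")
area2 <- c("big", "small")

# TRUE for rows whose area is in neither vector
keep <- !area_df$all_areas %in% c(area1, area2)

# Subset the column (not the rows), then drop duplicates
remaining <- unique(area_df$all_areas[keep])
print(remaining)
```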
CodePudding user response:
With a dplyr approach:
library(dplyr)
area_df %>%
  filter(!all_areas %in% c(area1, area2)) %>%
  distinct()
#>   all_areas number
#> 1   village     44
#> 2      city     33
#> 3      town     84
#> 4      city     35
#> 5   village     70
#> 6      town     74
#> 7   village     87
#> 8      town     40
#> 9   village     93
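As with the base R version, this keeps distinct rows rather than distinct area names. One possible variation, sketched here on a small hypothetical data frame, is to pass the column to distinct and then extract it as a vector with pull:

```r
library(dplyr)

# Hypothetical, hand-written data so the result does not depend on sampling
area_df <- data.frame(
  all_areas = c("town", "foo", "city", "town", "big", "village"),
  number = c(1, 2, 3, 4, 5, 6),
  stringsAsFactors = FALSE
)
area1 <- c("foo", "bar")
area2 <- c("big", "small")

remaining <- area_df %>%
  filter(!all_areas %in% c(area1, area2)) %>%  # drop rows in either vector
  distinct(all_areas) %>%                      # keep one row per area name
  pull(all_areas)                              # return a plain character vector
print(remaining)
```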