I want to filter the tibble df on the variable s2, keeping only the values 2 and 3. The new tibble df2 still shows the value 1 of s2 when I print it.
How can I create a new tibble that contains only the filtered values of df?
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(labelled)
df <- tibble(s1 = c("M", "M", "F", "M", "M", "F"),
             s2 = c(1, 1, 2, 1, 1, 3)) %>%
  set_variable_labels(s1 = "Sex", s2 = "Question") %>%
  set_value_labels(s1 = c(Male = "M", Female = "F"), s2 = c(Yes = 1, No = 2, DK = 3))
df2 <- df %>% filter(s2 %in% c("2", "3"))
df2$s2
#> <labelled<double>[2]>: Question
#> [1] 2 3
#>
#> Labels:
#> value label
#> 1 Yes
#> 2 No
#> 3 DK
Created on 2022-10-12 with reprex v2.0.2
Answer:
Your new tibble df2 does contain only the filtered values. Your output shows
#> [1] 2 3
which are the filtered results you want.
The extra detail printed is the value-label attribute attached to the data. It lists every label defined for s2, even though the filtered data no longer contains every labelled value:
#> Labels:
#> value label
#> 1 Yes
#> 2 No
#> 3 DK
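One way to confirm this is to look at the values and the labels separately. This is a minimal sketch that rebuilds the example data (without the variable labels, which do not matter here) and filters with numeric values rather than strings. The name `values` is only for illustration.

```r
library(dplyr)
library(labelled)

# Rebuild the example data with value labels on s2
df <- tibble(
  s1 = c("M", "M", "F", "M", "M", "F"),
  s2 = c(1, 1, 2, 1, 1, 3)
) %>%
  set_value_labels(s2 = c(Yes = 1, No = 2, DK = 3))

df2 <- df %>% filter(s2 %in% c(2, 3))

# The data itself: strip the class and all attributes to get a plain numeric vector
values <- as.vector(unclass(df2$s2))
unique(values)

# The label definitions: metadata that filter() carries over unchanged
val_labels(df2$s2)
```

The first call returns only values present in the rows of df2, while `val_labels()` returns the full set of labels defined on the column, whether or not each one is still in use.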
Edit based on update from OP:
To also remove the value labels that no longer occur in the filtered data, I think you need drop_unused_value_labels() from labelled:
df2 <- df %>% filter(s2 %in% c("2", "3")) %>% drop_unused_value_labels()
df2$s2
#> <labelled<double>[2]>: Question
#> [1] 2 3
#>
#> Labels:
#> value label
#> 2 No
#> 3 DK
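Note that calling drop_unused_value_labels() on the whole tibble applies it to every labelled column. In this example both remaining rows have s1 equal to "F", so the Male label of s1 would be dropped as well. If you only want to trim the labels of s2, a possible variant is to apply the function to that column inside mutate(). This is a sketch under that assumption, and `df3` is just an illustrative name.

```r
library(dplyr)
library(labelled)

df <- tibble(
  s1 = c("M", "M", "F", "M", "M", "F"),
  s2 = c(1, 1, 2, 1, 1, 3)
) %>%
  set_value_labels(s1 = c(Male = "M", Female = "F"),
                   s2 = c(Yes = 1, No = 2, DK = 3))

# Filter, then drop unused labels for s2 only; s1 is left untouched
df3 <- df %>%
  filter(s2 %in% c(2, 3)) %>%
  mutate(s2 = drop_unused_value_labels(s2))

val_labels(df3$s2)  # labels still in use for s2
val_labels(df3$s1)  # s1 keeps both of its labels
```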