I'd like to be able to select out data from my data.frame simply and elegantly, but I'm new to R.
This worked:
SchIndxRead %>% select(,.DormList) %>% filter(SchIndxRead$.College.Lookup=="MIAD")
I tried using this:
SchIndxRead[SchIndxRead$.College.Lookup=='MIAD',".DormList"]
I expected just "Two50Two", but got this result:
 [1] "Two50Two" NA NA NA NA
 [6] NA NA NA NA NA
[11] NA NA NA NA NA
[16] NA NA NA NA NA
[21] NA NA NA NA NA
CodePudding user response:
Your column .College.Lookup probably has NA values, such that the expression SchIndxRead$.College.Lookup=="MIAD" returns TRUEs and FALSEs, but also NAs.
When you subset a data frame with a logical vector that contains NAs, the result will also have NAs:
library(tibble)  # provides tibble()
set.seed(10)
df = tibble(a = 1:10, b = sample(c(0, 1, NA), 10, TRUE))
> df
# A tibble: 10 × 2
       a     b
   <int> <dbl>
 1     1    NA
 2     2     0
 3     3     1
 4     4    NA
 5     5     1
 6     6    NA
 7     7    NA
 8     8    NA
 9     9    NA
10    10    NA
> df$b == 1
 [1]    NA FALSE  TRUE    NA  TRUE    NA    NA    NA    NA    NA
> df[df$b == 1, "a"]
# A tibble: 9 × 1
      a
  <int>
1    NA
2     3
3    NA
4     5
5    NA
6    NA
7    NA
8    NA
9    NA
That's why there were NAs in your second attempt.
dplyr::filter, by contrast, keeps only the rows where the condition is TRUE, so rows where it returns FALSE or NA are dropped. That's why there were no NAs in your first attempt.
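If you would rather stay with bracket subsetting, the NA comparisons have to be removed before indexing. Here is a minimal sketch of three common ways to do that. The data frame below is a made-up stand-in for SchIndxRead: only the two column names come from the question, and the values are hypothetical.

```r
# Hypothetical stand-in for SchIndxRead; only the column names are from the question.
SchIndxRead <- data.frame(
  .College.Lookup = c("MIAD", NA, "MSOE", NA),
  .DormList       = c("Two50Two", "DormA", "DormB", "DormC"),
  stringsAsFactors = FALSE
)

cond <- SchIndxRead$.College.Lookup == "MIAD"  # TRUE, NA, FALSE, NA

# Plain logical indexing: every NA in cond produces an NA in the result.
res_raw <- SchIndxRead[cond, ".DormList"]

# which() returns only the positions where cond is TRUE, so NAs are dropped.
res_which <- SchIndxRead[which(cond), ".DormList"]

# %in% never returns NA; a missing value simply does not match.
res_in <- SchIndxRead[SchIndxRead$.College.Lookup %in% "MIAD", ".DormList"]

# Spelling the NA check out explicitly works as well.
res_na <- SchIndxRead[!is.na(cond) & cond, ".DormList"]

print(res_raw)
print(res_which)
```

Which of the three reads best is a matter of taste. which() is probably the smallest change to the original line.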
Two hints to improve your code:
- It would've been better to swap the order of select and filter:
SchIndxRead %>% filter(.College.Lookup == "MIAD") %>% select(.DormList)
This way filter still sees the .College.Lookup column, so you don't have to reach back into the original data frame with SchIndxRead$.
- You might prefer using pull(), which returns the column as a plain vector instead of a one-column data frame:
SchIndxRead %>% filter(.College.Lookup == "MIAD") %>% pull(.DormList)
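To see the difference between the two hints, here is a small sketch that runs both pipelines on a made-up stand-in for SchIndxRead (the values are hypothetical, only the column names come from the question) and checks what kind of object each one returns. It assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical stand-in for SchIndxRead; only the column names are from the question.
SchIndxRead <- data.frame(
  .College.Lookup = c("MIAD", NA, "MSOE", NA),
  .DormList       = c("Two50Two", "DormA", "DormB", "DormC"),
  stringsAsFactors = FALSE
)

# select() keeps the result as a data frame with a single column.
as_df <- SchIndxRead %>%
  filter(.College.Lookup == "MIAD") %>%
  select(.DormList)

# pull() extracts the column itself, here a character vector.
as_vec <- SchIndxRead %>%
  filter(.College.Lookup == "MIAD") %>%
  pull(.DormList)

print(is.data.frame(as_df))
print(is.character(as_vec))
print(as_vec)
```

The pull() version is the closer match to what SchIndxRead[..., ".DormList"] returns on a plain data.frame, which is also a vector.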