I am trying to create a new dataset from my original monarch.data df that contains all variables for just three milkweed species (from the Milkweed_species column).
library(dplyr)
three_species <- filter(milkweed.data, milkweed.data$Milkweed_species == c("Asclepias nivea","Asclepias tuberosa", "Asclepias californica"))
This is the code I have tried, along with other attempts, and it returns a df that contains only one of the species.
What am I doing wrong/how do I fix this?
CodePudding user response:
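In dplyr, the idiomatic way is to refer to columns by their bare names inside filter(), so the milkweed.data$ prefix is not needed. The comparison itself is also the problem: == compares element by element, so to test a column against several values you should use %in% instead. Try this: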
library(dplyr)
three_species <- filter(milkweed.data, Milkweed_species %in% c("Asclepias nivea","Asclepias tuberosa", "Asclepias californica"))
milkweed.data should be your original data. If, as you said, your original data is monarch.data, use that as the first argument of filter().
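To illustrate why == can silently drop rows, here is a minimal base R sketch. The vectors species and wanted are made up for illustration and are not the asker's data. == recycles the shorter vector and compares position by position, whereas %in% tests membership:

```r
# Hypothetical six-row species column, for illustration only
species <- c("Asclepias nivea", "Asclepias nivea", "Asclepias nivea",
             "Asclepias tuberosa", "Asclepias tuberosa", "Asclepias californica")
wanted <- c("Asclepias nivea", "Asclepias tuberosa", "Asclepias californica")

# == recycles `wanted` to length 6 and compares position by position,
# so a row only matches if its species lines up with the recycled vector
eq_result <- species == wanted

# %in% asks, for each element of `species`, whether it occurs anywhere in `wanted`
in_result <- species %in% wanted

print(eq_result)
print(in_result)
```

Every element of species is one of the wanted values, yet == marks only some of them TRUE, while %in% marks all of them TRUE.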
CodePudding user response:
For filtering by multiple values, you need to use %in% rather than ==.
library(dplyr)
three_species <-
  filter(milkweed.data,
         Milkweed_species %in% c(
           "Asclepias nivea",
           "Asclepias tuberosa",
           "Asclepias californica"
         )
  )
Output
       Milkweed_species Trichomes Latex Toxin_amt Caterpillar_mass
1       Asclepias nivea        54    34         6                2
2    Asclepias tuberosa         1     3         2               56
3 Asclepias californica         5     1         6                2
4       Asclepias nivea         5     2         4                3
Essentially, for each value in milkweed.data$Milkweed_species, we check whether it exists in the vector you supplied. filter() then keeps only the rows where the result is TRUE.
milkweed.data$Milkweed_species %in% c("Asclepias nivea", "Asclepias tuberosa", "Asclepias californica")
#[1] FALSE FALSE TRUE TRUE TRUE TRUE
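The same logical vector can also be used for plain base R subsetting, which may help show what "keep only the TRUE rows" means. This is only a sketch: the data frame df and its values are hypothetical stand-ins, and it uses base R indexing rather than dplyr.

```r
# Small stand-in data frame; values are made up for illustration
df <- data.frame(
  Milkweed_species = c("Asclepias boliviensis", "Asclepias nivea",
                       "Asclepias tuberosa", "Asclepias californica"),
  Latex = c(5, 34, 3, 1)
)
wanted <- c("Asclepias nivea", "Asclepias tuberosa", "Asclepias californica")

# One TRUE/FALSE per row: is this row's species among the wanted ones?
keep <- df$Milkweed_species %in% wanted

# Base R equivalent of the filter step: index rows by the logical vector
subset_df <- df[keep, ]

print(keep)
print(subset_df)
```

Unlike filter(), base R indexing keeps the original row names, so the result here is labelled 2, 3, 4 rather than 1, 2, 3.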
Data
milkweed.data <- structure(list(Milkweed_species = c("Ascelpias amplexicaulis",
"Asclepias boliviensis", "Asclepias nivea", "Asclepias tuberosa",
"Asclepias californica", "Asclepias nivea"), Trichomes = c(3,
21, 54, 1, 5, 5), Latex = c(1, 5, 34, 3, 1, 2), Toxin_amt = c(3,
1, 6, 2, 6, 4), Caterpillar_mass = c(3, 5, 2, 56, 2, 3)), class = "data.frame", row.names = c(NA,
-6L))