I need to select each level of Species in the iris dataset (available in R) with the function subset() and calculate the mean of the column Petal.Length for that level, all inside a for loop. I know that I can do these calculations with the function tapply, but the task requires using a for loop.
I tried writing a vector in which I would put the results:
medie <- rep(NA,3)
names(medie) <- levels(iris$Species)
and then this as the loop:
for (i in 1:length(medie)){
medie[i] <- mean(subset(iris, Species==levels(Species))$Petal.Length)
}
but these are the results I get:
> medie
setosa versicolor virginica
3.796 3.796 3.796
Any help?
CodePudding user response:
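I think you need to index the levels with i, that is, use levels(Species)[i]. Without the index, Species is compared against all three levels at once, the levels are recycled down the rows, and every iteration selects the same rows, which is why you get the same mean three times.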
for (i in 1:length(medie)){
medie[i] <- mean(subset(iris, Species==levels(Species)[i])$Petal.Length)
}
> medie
setosa versicolor virginica
1.462 4.260 5.552
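To see why the index matters, here is a minimal sketch that evaluates the original condition on its own and then runs the corrected loop. The names lv and keep_recycled are made up for illustration, and seq_along() is used in place of 1:length() only as a matter of habit.

```r
# iris ships with base R (package datasets), so nothing needs to be loaded.
lv <- levels(iris$Species)

# Original condition: all three levels are recycled down the rows,
# so the selected rows do not depend on the loop index.
keep_recycled <- iris$Species == lv
mean(iris$Petal.Length[keep_recycled])

# Corrected loop: compare against one level per iteration.
medie <- setNames(rep(NA_real_, length(lv)), lv)
for (i in seq_along(lv)) {
  medie[i] <- mean(subset(iris, Species == lv[i])$Petal.Length)
}
medie
```

The first mean() call should give the same value that the question's loop produced for every species, since it uses the same recycled comparison.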
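There is also an argument called select in subset to pick your target column. Note that the result is still a one-column data frame unless you also set drop = TRUE, and mean() returns NA (with a warning) for a data frame, so you can use:
medie[i] <- mean(subset(iris, Species==levels(Species)[i], select = "Petal.Length", drop = TRUE))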
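One thing worth checking when using select: by default subset() keeps the result as a data frame, and mean() expects a numeric vector. A small sketch of the ways around that, for a single species (sp and as_df are illustrative names, not part of the original code):

```r
sp <- levels(iris$Species)[1]

# select keeps a one-column data frame by default
as_df <- subset(iris, Species == sp, select = Petal.Length)
is.data.frame(as_df)

# Pull the column out as a numeric vector before averaging
m_dollar <- mean(as_df$Petal.Length)
m_bracket <- mean(as_df[[1]])

# Or ask subset() to drop the data frame structure itself
m_drop <- mean(subset(iris, Species == sp, select = Petal.Length, drop = TRUE))
```

All three variants should agree with the value the loop stores for the first species.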
Here's a dplyr approach in case you someday want to avoid the for loop.
library(dplyr)
iris %>%
group_by(Species) %>%
summarise(medie = mean(Petal.Length))
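For completeness, the tapply() route mentioned in the question can serve as a quick base R check on whatever the loop produces. This is only a sketch, and by_species is an illustrative name:

```r
# Group Petal.Length by Species and apply mean() to each group
by_species <- tapply(iris$Petal.Length, iris$Species, mean)
by_species
```

The result is named by species level, so it can be compared element by element with the medie vector from the loop.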