I would like to combine summarise_if (to summarise all numeric variables) with summarise (to count the number of observations). In the iris example, I would like to:
- count the number of observations per Species and add this count as a column in the new table
- summarise all numeric variables (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) by Species.
I can get the first with this code:
library(dplyr)
iris %>%
  group_by(Species) %>%
  summarise(n = n())
I can get the second with this code:
iris %>%
  group_by(Species) %>%
  summarise_if(is.numeric, median, na.rm = TRUE)
But I am struggling to combine the two. Simply piping one after the other gives a different result. My desired output would be a single table with one row per Species, containing the count n alongside the median of each numeric variable.
Answer:
Use across():
iris %>%
  group_by(Species) %>%
  # a lambda avoids passing na.rm through across()'s `...`, deprecated in dplyr 1.1.0
  summarise(n = n(), across(where(is.numeric), ~ median(.x, na.rm = TRUE)))
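One caveat about column order: because n is created before across() runs, where(is.numeric) also selects n and applies the summary function to it. With median this is harmless, since the median of a single value is that value, but other functions can silently overwrite the count. The following is a minimal sketch of the issue using sd, assuming dplyr 1.0.0 or later; the names `n_first` and `n_last` are illustrative only and not part of the original answer.

```r
library(dplyr)

# n is created first, so across(where(is.numeric), ...) also processes it.
# sd() of a single value is NA, which silently overwrites the count.
n_first <- iris %>%
  group_by(Species) %>%
  summarise(n = n(), across(where(is.numeric), ~ sd(.x, na.rm = TRUE)))

# Computing n last keeps the count intact; relocate() then moves it
# back next to the grouping column.
n_last <- iris %>%
  group_by(Species) %>%
  summarise(across(where(is.numeric), ~ sd(.x, na.rm = TRUE)), n = n()) %>%
  relocate(n, .after = Species)
```

If the summary function might change later, computing n last is probably the safer habit.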
For those interested, the same thing in data.table:
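library(data.table)
# as.data.table() makes a copy; the packaged iris data set stays unchanged
iris_dt <- as.data.table(iris)
iris_dt[, c(list(n = .N), lapply(.SD, median, na.rm = TRUE)),
        .SDcols = names(iris_dt)[sapply(iris_dt, is.numeric)],
        by = Species]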