I would like to know how many values are used to calculate the mean when using the summarise function:
table <- df %>% group_by(x) %>% summarise_if(is.numeric, mean, na.rm = TRUE)
CodePudding user response:
Add a count summary as well. sum(!is.na(.x)) counts the non-missing values, which are exactly the values that mean(.x, na.rm = TRUE) uses.
Note that summarise_if() has been superseded by across():
table <- df %>%
  group_by(x) %>%
  summarise(across(where(is.numeric),
                   list(mean = ~ mean(.x, na.rm = TRUE),
                        n = ~ sum(!is.na(.x)))))
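As a quick check, here is a minimal sketch on a made-up data frame. The names df, x, and y and the values are assumptions for illustration only, not the asker's data. It shows the n column counting only the non-missing values that go into each mean.

```r
library(dplyr)

# Hypothetical toy data: group "a" has one missing value in y
df <- data.frame(
  x = c("a", "a", "a", "b", "b"),
  y = c(1, NA, 3, 4, 6)
)

result <- df %>%
  group_by(x) %>%
  summarise(across(where(is.numeric),
                   list(mean = ~ mean(.x, na.rm = TRUE),
                        n = ~ sum(!is.na(.x)))))

# The output columns are named y_mean and y_n.
# For group "a", the mean is taken over 1 and 3, so y_n is 2, not 3.
print(result)
```

With a named list of functions, across() names each output column as the input column followed by the function name, so every numeric column gets its own mean and count.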
CodePudding user response:
I may be wrong, but I believe simply using dplyr's count() should work. See below:
library(dplyr)

# Creating a demonstrative data frame
colors <- c('red', 'green', 'red', 'green', 'red', 'green', 'green')
obs <- c(1, 2, 3, 1, 5, 2, 6)
mytable <- data.frame(colors, obs)

# Checking the summarise function: mean of obs per color
mytable %>%
  group_by(colors) %>%
  summarise_if(is.numeric, mean)

# First approach, using summarise with n(): number of rows per color
mytable %>%
  group_by(colors) %>%
  summarise(n = n())

# Second, more elegant approach using count(): same result in one call
mytable %>%
  count(colors)
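If needed, you can add a filter() or subset() step before counting, for example to check the data or to drop rows you do not want included. Keep in mind that n() and count() count rows per group, so rows where obs is NA are counted too unless you remove them first.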
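To illustrate the filter idea, here is a minimal sketch using a hypothetical variant of the example data in which one observation is missing (the NA is an assumption added for illustration). It compares a plain count() with a count taken after dropping the missing rows.

```r
library(dplyr)

# Hypothetical variant of the example data with one missing observation
colors <- c("red", "green", "red", "green", "red", "green", "green")
obs <- c(1, 2, NA, 1, 5, 2, 6)
mytable <- data.frame(colors, obs)

# count() counts every row per group, including the row where obs is NA
all_rows <- mytable %>%
  count(colors)

# Dropping the NA rows first gives the number of values that
# mean(obs, na.rm = TRUE) would actually use in each group
non_missing <- mytable %>%
  filter(!is.na(obs)) %>%
  count(colors)

print(all_rows)
print(non_missing)
```

The two counts differ only for groups that contain missing values, so on complete data either approach gives the number of values behind the mean.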