Using summarise from the dplyr package

Time:10-29

I would like to know how many values are used to calculate each mean when using the summarise function:

    table <- df %>% group_by(x) %>% summarise_if(is.numeric, mean, na.rm = TRUE)

CodePudding user response:

Add a count summary alongside the mean: flag the non-missing values with !is.na() and sum the result, which gives the number of values that went into each mean.

Note that summarise_if() has been superseded by across().

table <- df %>% group_by(x) %>%
    summarise(across(where(is.numeric), list(mean = ~ mean(.x, na.rm = TRUE), n = ~ sum(!is.na(.x)))))
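
To see what this produces, here is a small sketch on made-up data. The names df_demo, x, and y are hypothetical and not from the question. With a named list of functions, across() names the output columns as column_function, so this should give y_mean and y_n.

```r
library(dplyr)

# Hypothetical data: one grouping column and one numeric column with NAs
df_demo <- data.frame(
  x = c("a", "a", "a", "b", "b"),
  y = c(1, NA, 3, 4, NA)
)

res <- df_demo %>%
  group_by(x) %>%
  summarise(across(
    where(is.numeric),
    list(
      mean = ~ mean(.x, na.rm = TRUE),  # mean over non-missing values
      n    = ~ sum(!is.na(.x))          # how many values that mean used
    )
  ))

res
# Group "a": y_mean = 2, y_n = 2
# Group "b": y_mean = 4, y_n = 1
```

Group "a" has three rows but only two non-missing values, so y_n is 2 rather than 3.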

CodePudding user response:

I may be wrong, but I believe simply using dplyr's count() should work. See below:

library(dplyr)

# Creating a demonstrative data frame
colors <- c('red', 'green', 'red', 'green', 'red', 'green', 'green')
obs <- c(1, 2, 3, 1, 5, 2, 6)
mytable <- data.frame(colors, obs)

# Checking the summarise function
mytable %>%
  group_by(colors) %>%
  summarise_if(is.numeric, mean)

# First approach: summarise() with n = n()
mytable %>%
  group_by(colors) %>%
  summarise(n = n())

# Second, more elegant approach using count
mytable %>% 
  count(colors)

If needed, you can add a filter() or subset() step beforehand, for example to check that the data are numeric or to drop rows you do not want counted.
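
One caveat: n() and count() count rows per group, including rows where the value is NA. They match the number of values used by mean(..., na.rm = TRUE) only when nothing is missing. The sketch below is an illustration, not part of the original answer. It uses a hypothetical variant of the demo data, mytable_na, with one observation set to NA, and puts both counts side by side.

```r
library(dplyr)

# Hypothetical variant of the demo data with one missing observation
colors <- c("red", "green", "red", "green", "red", "green", "green")
obs <- c(1, NA, 3, 1, 5, 2, 6)
mytable_na <- data.frame(colors, obs)

counts <- mytable_na %>%
  group_by(colors) %>%
  summarise(
    mean_obs = mean(obs, na.rm = TRUE),
    n_rows   = n(),               # all rows in the group, NA included
    n_used   = sum(!is.na(obs))   # values actually used by mean()
  )

counts
# green: mean_obs = 3, n_rows = 4, n_used = 3
# red:   mean_obs = 3, n_rows = 3, n_used = 3
```

If the data have no missing values, the two counts agree and count(colors) is the shorter option.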
