I'm trying to count the number of NA values in each of 2 columns.
The code below works:
temp2 %>%
select(c18basic, c18ipug) %>%
summarise_all(funs(sum(is.na(.))))
But I get this warning:
Warning message:
`funs()` was deprecated in dplyr 0.8.0.
Please use a list of either functions or lambdas:
# Simple named list:
list(mean = mean, median = median)
# Auto named with `tibble::lst()`:
tibble::lst(mean, median)
# Using lambdas
list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))
How can I rewrite my summarise_all line using each of the above techniques (simple named list and auto-named list), without using summarise_all, since it seems to have been superseded?
Thanks.
Note: I find the tidyverse documentation very difficult to understand. If someone can point me to a resource that would help me figure these things out on my own, I'd really appreciate it. Thank you in advance.
Note: I figured out how to do it using a lambda:
temp2 %>%
select(c18basic, c18ipug) %>%
summarise(across(everything(), ~ sum(is.na(.x))))
Answer:
Like this?
dplyr::summarize_all(list(sum = ~ sum(is.na(.x))))
By the way, the select() is not necessary; you could write
temp2 %>%
summarise(across(c(c18basic,c18ipug), ~ sum(is.na(.x))))
Example:
library(dplyr)
mtcars[1, 1] <- NA  # introduce one missing value in mpg
mtcars %>%
  summarise(across(c(mpg, cyl), list(sum = ~ sum(is.na(.x)))))
Results in
mpg_sum cyl_sum
1 0
The same syntax extends easily to several functions, e.g.
mtcars[1, 1] <- NA
mtcars %>%
  summarise(across(c(mpg, cyl), list(
    Mean = ~ mean(., na.rm = TRUE),
    SD = ~ sd(., na.rm = TRUE),
    Min = ~ min(., na.rm = TRUE),
    Max = ~ max(., na.rm = TRUE),
    Obs. = ~ sum(!is.na(.))
  )))
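For the two non-lambda styles the question asks about (simple named list and auto-named list), something like the following should work with across(). This is only a sketch: n_missing is a hypothetical helper introduced here for illustration, and mtcars stands in for temp2.

```r
library(dplyr)

# Hypothetical helper (not from the original post): counts NAs in a vector.
n_missing <- function(x) sum(is.na(x))

df <- mtcars
df[1, "mpg"] <- NA  # introduce one missing value in mpg

# Simple named list: output columns are named <column>_<list name>.
named <- df %>%
  summarise(across(c(mpg, cyl), list(n_na = n_missing)))

# Auto-named list: tibble::lst() takes the list name from the function's name.
auto_named <- df %>%
  summarise(across(c(mpg, cyl), tibble::lst(n_missing)))

print(named)
print(auto_named)
```

Both list styles need a named function, which is why the NA count is wrapped in a helper first; the lambda form avoids that extra step.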