dplyr summarise across when column is sometimes missing

Time:01-04

This works:

library(dplyr)

df <- data.frame(a=c(1,2,3),b=c(4,2,3),c=c(2,5,1))

df %>% summarise(across( c(a,b,c), sum, na.rm=TRUE))

But this doesn't, because d doesn't exist:

df <- data.frame(a=c(1,2,3),b=c(4,2,3),c=c(2,5,1))

df %>% summarise(across( c(a,b,c,d), sum, na.rm=TRUE))

This is part of a function that I apply to a list of data frames using lapply. d is present in nearly all of them, so I can't just remove d from the selection.

I'm trying to get to something like this:

df <- data.frame(a=c(1,2,3),b=c(4,2,3),c=c(2,5,1))

df %>% summarise(across( any_of(c(a,b,c,d)), sum, na.rm=TRUE))

CodePudding user response:

You were nearly there. any_of() expects a character vector, so quote the column names:

library(dplyr)

df <- data.frame(a=c(1,2,3),b=c(4,2,3),c=c(2,5,1))

df %>% summarise(across( any_of(c("a","b","c","d")), sum, na.rm=TRUE))
#>   a b c
#> 1 6 9 8

Created on 2023-01-03 by the reprex package (v1.0.0)
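
Since the question mentions applying this through lapply to a list of data frames, here is a minimal sketch of how that might look. The helper name `sum_present`, its `cols` argument, and the example list `dfs` are hypothetical, made up for illustration. The sketch uses a lambda (`~ sum(.x, na.rm = TRUE)`) because passing extra arguments through `across(...)` is deprecated as of dplyr 1.1.0.

```r
library(dplyr)

# Hypothetical helper: sums whichever of the requested columns are present.
# any_of() silently skips names that do not exist in the data frame.
sum_present <- function(df, cols = c("a", "b", "c", "d")) {
  df %>% summarise(across(any_of(cols), ~ sum(.x, na.rm = TRUE)))
}

# Illustrative input: the first data frame lacks d, the second has it.
dfs <- list(
  data.frame(a = c(1, 2, 3), b = c(4, 2, 3), c = c(2, 5, 1)),
  data.frame(a = c(1, 2, 3), b = c(4, 2, 3), c = c(2, 5, 1), d = c(1, NA, 2))
)

results <- lapply(dfs, sum_present)
```

Each element of `results` is a one-row data frame containing only the columns that were actually present, so the first has a, b and c while the second also has d.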

CodePudding user response:

Another approach is to summarise every numeric column that exists, whatever it is called:

library(dplyr)
df |> summarise(across(where(is.numeric), ~ sum(.x, na.rm = TRUE)))
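
Note that `where(is.numeric)` sums every numeric column, including ones that were never in the original list. If that is too broad, one option is to intersect the two selections with `&`. The sketch below assumes a made-up data frame with an extra numeric `id` column and a character `label` column; the vector name `wanted` is also just illustrative.

```r
library(dplyr)

# Hypothetical data: id is numeric but should not be summed, label is not numeric.
df <- data.frame(
  id = c(10, 20, 30),
  a = c(1, 2, 3),
  b = c(4, 2, 3),
  label = c("x", "y", "z")
)

wanted <- c("a", "b", "c", "d")

# Keep only columns that are both in `wanted` (if present) and numeric.
result <- df |>
  summarise(across(any_of(wanted) & where(is.numeric), ~ sum(.x, na.rm = TRUE)))
```

Here `result` contains only a and b: `id` is excluded because it is not in `wanted`, and c and d are skipped because they do not exist.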