I am able to check for a specific column if it exists using 'contains
' in dplyr
.
I struggle with evaluating the summary of the expression if it does not exist.
Here is my code snippet:
df <- Prod%>%
group_by(Entity)%>%
select(Entity,`Cum.Oil`,`Cum.Gas`,contains("EUR")%>%
summarise(Oil = mean(`Cum.Oil`), Gas = mean(`Cum.Gas`), EUR=mean(EUR))
How can I ignore 'EUR' expression in the summarise expression if the EUR column does not exist?
CodePudding user response:
Something like this should work:
df <- Prod%>%
group_by(Entity)%>%
summarise(across(any_of(c('Cum.Oil', 'Cum.Gas', 'Eur')), ~mean(.x),
.names = '{.col %>% str_remove("Cum.")}' )
Can't test without some reprex, though.
Tip: You can also use any_of
in select statements:
df <- Prod%>%
group_by(Entity)%>%
select(any_of(c('Entity', 'Cum.Oil', 'Cum.Gas', "EUR"))
CodePudding user response:
If it is more convinient you can use ifelse
constructions within dplyr pipes as well though I consider @Juan answer more Rly elegant:
Prod <- data.frame(Entity = c("a", "b", "a"),`Cum.Oil` = 1:3,`Cum.Gas`=c(2,4,6)
, EUR = c(7,9,9)
)
Prod %>%
group_by(Entity) %>% {
ifelse(exists("EUR", .),
. <- summarise(., Oil = mean(`Cum.Oil`),
Gas = mean(`Cum.Gas`),
EUR = mean(EUR)
),
. <- summarise(., Oil = mean(`Cum.Oil`),
Gas = mean(`Cum.Gas`)
)
)
.
}
CodePudding user response:
Juan C's answer is the way to go about this.
But just as reference, we can use purrr::when()
to create pipe friendly conditionals. When we want to use functions like exists()
in when()
we can specify the envir
argument as follows: exists("EUR", envir = as.environment(.))
.
library(tidyverse)
Prod <- data.frame(Entity = c("a", "b", "a"),
`Cum.Oil` = 1:3,
`Cum.Gas`=c(2,4,6),
EUR = c(7,9,9))
Prod %>%
group_by(Entity)%>%
select(Entity,`Cum.Oil`,`Cum.Gas`, contains("EUR")) %>%
when(exists("EUR", envir = as.environment(.)) ~
summarise(., Oil = mean(`Cum.Oil`), Gas = mean(`Cum.Gas`), EUR=mean(EUR)),
TRUE ~
summarise(., Oil = mean(`Cum.Oil`), Gas = mean(`Cum.Gas`))
)
#> # A tibble: 2 × 4
#> Entity Oil Gas EUR
#> <chr> <dbl> <dbl> <dbl>
#> 1 a 2 4 8
#> 2 b 2 4 9
Created on 2022-11-16 with reprex v2.0.2