Can you suggest a workaround for this error I'm triggering (in R 3.6.2)?
I'm using case_when in a mutate, trying to test whether a column is present and only then use its value:
library(tidyverse)
aCaseFn <- function(df) {
  df %>%
    mutate(UP =
      case_when("plays" %in% names(df) ~ toupper(plays),
                "band" %in% names(df) ~ toupper(band),
                TRUE ~ NA_character_))
}
What I'm expecting is
R > aCaseFn(band_instruments)
# A tibble: 3 x 3
name plays UP
<chr> <chr> <chr>
1 John guitar GUITAR
2 Paul bass BASS
3 Keith guitar GUITAR
but instead I get this error:
R > aCaseFn(band_instruments)
Error: Problem with `mutate()` input `UP`.
x object 'band' not found
ℹ Input `UP` is `case_when(...)`.
It appears that toupper(band) is getting evaluated, even though (I'd think) it shouldn't ever be reached with this argument, both because the first branch's condition ("plays" %in% names(df)) is TRUE and because the second branch's condition ("band" %in% names(df)) is FALSE.
So what would be a good workaround?
CodePudding user response:
An easier option is any_of. Give the function two formal arguments: one for the dataset and a second for the names of the columns to convert to uppercase, passed as a character vector (nms). Then loop across any_of the columns in nms, i.e. it will only loop over the columns that exist in the data.frame and skip the names in the vector that are not present, convert them to uppercase, and name the new columns with .names.
aCaseFn <- function(df, nms) {
  df %>%
    mutate(across(any_of(nms), toupper, .names = "UP_{.col}"))
}
-testing
str1 <- c("plays", "band")
> aCaseFn(band_instruments, str1)
name plays UP_plays
1 John guitar GUITAR
2 Paul bass BASS
3 Keith guitar GUITAR
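NOTE: case_when/if_else/ifelse are vectorised and expect their arguments to have compatible lengths, whereas "plays" %in% names(df) returns a single TRUE/FALSE and toupper(plays) has length nrow(df). case_when also evaluates every right-hand side before selecting values, which is why toupper(band) fails even though its condition is FALSE. Here a plain if/else would be more useful.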
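As a minimal sketch of the if/else suggestion in the note above (the function name aIfElseFn is hypothetical, chosen only to mirror aCaseFn from the question), a plain if/else inside mutate evaluates only the branch that is taken, so a missing column is never touched:

```r
library(dplyr)

# Hypothetical helper mirroring aCaseFn from the question.
# `if` is not vectorised: only the selected branch is evaluated,
# so toupper(band) is never run when `band` is absent.
aIfElseFn <- function(df) {
  df %>%
    mutate(UP = if ("plays" %in% names(df)) {
      toupper(plays)
    } else if ("band" %in% names(df)) {
      toupper(band)
    } else {
      NA_character_  # recycled to every row
    })
}
```

Calling aIfElseFn(band_instruments) should use the plays column, and aIfElseFn(band_members) should fall through to band.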
data
band_instruments <- structure(list(name = c("John", "Paul",
"Keith"), plays = c("guitar",
"bass", "guitar")), class = "data.frame", row.names = c("1",
"2", "3"))
CodePudding user response:
Since the set of columns is fixed for all rows, you don't have to check it row-wise with case_when. I think you might want to determine the name of the target column first, and then use it in mutate:
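# Candidate columns, in order of preference
target_columns <- c('plays', 'band')
# Positions of the candidates that actually exist in df
col_n <- which(target_columns %in% colnames(df))
# First existing candidate, or character(0) if none exist
up_column <- target_columns[if (length(col_n) > 0) min(col_n) else col_n]
df %>%
  mutate(
    UP = if (length(up_column) > 0) {
      toupper(.data[[up_column]])
    } else {
      NA_character_
    }
  )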