dplyr conditionally mutate data type

Time:01-18

My aim is to conditionally mutate a column's data type using the tidyverse. Here is a reproducible example: I want to change the column cyl to a factor, but the factor's levels and labels arguments should depend on whether the user has supplied an object bin.order or left it NULL. I know how to do this outside of the tidyverse, but I am looking for a more succinct way using a tidyverse function. My attempt so far:

mtcars %>% 
  mutate(cyl = ifelse(is.null(bin.order), 
                      factor(x = cyl, levels = sort(unique(cyl)), labels = sort(unique(cyl))), 
                      factor(x = cyl, levels = bin.order, labels = bin.order)))

The desired result would be something like this:

# if bin.order is null
mtcars %>% 
  mutate(cyl = factor(x = cyl, levels = sort(unique(cyl)), labels = sort(unique(cyl))))

# if bin.order is not null
bin.order = c(4, 6, 8)
mtcars %>% 
  mutate(cyl = factor(x = cyl, levels = bin.order, labels = bin.order))

CodePudding user response:

A possible solution would be to build a function:

fct_if <- function(x,bin.order = NULL){
  
  if(is.null(bin.order)){
    output <- factor(x = x, levels = sort(unique(x)), labels = sort(unique(x)))
  }else{
    output <- factor(x = x, levels = bin.order, labels = bin.order)
  }
  return(output)
}


mtcars %>% 
  mutate(cyl = fct_if(cyl)) 


mtcars %>% 
  mutate(cyl = fct_if(cyl,bin.order = c(4, 6, 8))) 
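
As a side note, the ifelse() attempt in the question does not work because ifelse() returns a value shaped like its test: is.null(bin.order) has length one, so the result has length one, and the factor attributes are dropped along the way. Since the condition is a single TRUE/FALSE, an ordinary if/else can also be used directly inside mutate(). A minimal sketch, assuming bin.order already exists in the calling environment (either NULL or a vector of levels):

```r
library(dplyr)

# Assumption: bin.order is defined by the user, either NULL or e.g. c(4, 6, 8)
bin.order <- NULL

# ifelse() is vectorised over its test, so a length-one test gives a
# length-one result that is no longer a factor
bad <- ifelse(is.null(bin.order),
              factor(mtcars$cyl, levels = sort(unique(mtcars$cyl))),
              factor(mtcars$cyl, levels = bin.order))

# A scalar if/else returns the whole factor from whichever branch is chosen
res <- mtcars %>%
  mutate(cyl = if (is.null(bin.order)) {
    factor(cyl, levels = sort(unique(cyl)))
  } else {
    factor(cyl, levels = bin.order)
  })

levels(res$cyl)
```

This avoids a helper function, at the cost of relying on a variable from the surrounding environment; the function version above is easier to reuse.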

CodePudding user response:

You could use the %||% operator (from rlang, re-exported by purrr). This uses the left-hand side if not NULL and the right-hand side otherwise:

library(dplyr)
library(purrr)

factor.bin.order <- function(x, bin.order = NULL) {
  factor(x, bin.order %||% sort(unique(x)))
}

mtcars2 <- mtcars %>% 
  mutate(
    cyl1 = factor.bin.order(cyl),
    cyl2 = factor.bin.order(cyl, c(6, 4, 8))
  )

levels(mtcars2$cyl1)
# "4" "6" "8"

levels(mtcars2$cyl2)
# "6" "4" "8"

Also note there’s no need to specify labels if they’re the same as levels, since this is the default behavior.
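
The point about labels can be checked quickly. This is a small sketch using only base R; the names x, lv, with_labels and without_labels are just illustrative:

```r
x  <- mtcars$cyl
lv <- c(6, 4, 8)

# labels given explicitly, identical to the levels
with_labels <- factor(x, levels = lv, labels = lv)

# labels omitted, so factor() falls back to using the levels as labels
without_labels <- factor(x, levels = lv)

identical(with_labels, without_labels)
```

The comparison is expected to return TRUE, since both calls produce the same levels in the same order.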
