Converting multiple columns to factors and releveling with mutate(across)

Time:08-24

dat <- data.frame(Comp1Letter = c("A", "B", "D", "F", "U", "A*", "B", "C"),
                   Comp2Letter = c("B", "C", "E", "U", "A", "C", "A*", "E"),
                   Comp3Letter = c("D", "A", "C", "D", "F", "D", "C", "A"))  

GradeLevels <- c("A*", "A", "B", "C", "D", "E", "F", "G", "U")

I have a dataframe that looks something like the above (but with many other columns I don't want to change).

The columns I want to change contain letter grades, but they are currently character vectors, so the grades do not sort in the right order.

I need to convert each of these columns into factors with the correct order. I've been able to get this to work using the code below:

library(dplyr)  # provides %>% and mutate()

factordat <-
    dat %>%
      mutate(Comp1Letter = factor(Comp1Letter, levels = GradeLevels)) %>%
      mutate(Comp2Letter = factor(Comp2Letter, levels = GradeLevels)) %>%
      mutate(Comp3Letter = factor(Comp3Letter, levels = GradeLevels))

However, this is very verbose and takes up a lot of space.

Looking at some other questions, I've tried to use a combination of mutate() and across(), as seen below:

factordat <-
  dat %>%
    mutate(across(c(Comp1Letter, Comp2Letter, Comp3Letter), factor(levels = GradeLevels)))

However, when I do this, the columns remain character vectors.

Could someone please tell me what I'm doing wrong or offer another option?

CodePudding user response:

You can pass an anonymous function to across(), like this:

dat <- data.frame(Comp1Letter = c("A", "B", "D", "F", "U", "A*", "B", "C"),
                   Comp2Letter = c("B", "C", "E", "U", "A", "C", "A*", "E"),
                   Comp3Letter = c("D", "A", "C", "D", "F", "D", "C", "A"))  

GradeLevels <- c("A*", "A", "B", "C", "D", "E", "F", "G", "U")

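library(dplyr)  # provides %>%

dat %>%
  tibble::as_tibble() %>%
  # parse_factor() is exported by readr; the ~ formula turns the call into a function of each column (.x)
  dplyr::mutate(dplyr::across(c(Comp1Letter, Comp2Letter, Comp3Letter), ~ readr::parse_factor(.x, levels = GradeLevels)))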

# # A tibble: 8 × 3
#   Comp1Letter Comp2Letter Comp3Letter
#   <fct>       <fct>       <fct>      
# 1 A           B           D          
# 2 B           C           A          
# 3 D           E           C          
# 4 F           U           D          
# 5 U           A           F          
# 6 A*          C           D          
# 7 B           A*          C          
# 8 C           E           A     

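You were close. across() expects a function as its second argument, but factor(levels = GradeLevels) is a call that gets evaluated straight away, without any column being passed to it. All that was left to do was wrap the call in an anonymous function. That can be done either with ~ and . (or .x) in the tidyverse, or with function(x) and x in base R.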
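
For comparison, here is a minimal sketch of the base R function(x) form mentioned above, using plain factor() as in the question. The grade_cols vector is a hypothetical helper introduced only for this example, and all_of() is used to select columns by name; the sketch assumes dplyr is installed.

```r
library(dplyr)

dat <- data.frame(
  Comp1Letter = c("A", "B", "D", "F", "U", "A*", "B", "C"),
  Comp2Letter = c("B", "C", "E", "U", "A", "C", "A*", "E"),
  Comp3Letter = c("D", "A", "C", "D", "F", "D", "C", "A")
)

GradeLevels <- c("A*", "A", "B", "C", "D", "E", "F", "G", "U")

# Hypothetical helper (not in the original post): names of the columns to convert.
grade_cols <- c("Comp1Letter", "Comp2Letter", "Comp3Letter")

# across() receives a function; each selected column is passed to it as x.
factordat <- dat %>%
  mutate(across(all_of(grade_cols), function(x) factor(x, levels = GradeLevels)))

# Check that every selected column is now a factor with the intended level order.
sapply(factordat[grade_cols], is.factor)
levels(factordat$Comp1Letter)
```

Keeping the column names in a character vector may be convenient when the same set of columns is reused elsewhere; any other columns in the data frame are left untouched.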
