Efficient recoding of numeric variables into a factor in a data.frame

Time:09-21

When recoding the values of a numeric variable such as var1 below into character values, there is sometimes an easy pattern. For example, suppose the numeric values 1:4 in var1 need to be recoded as LETTERS[27-(4:1)], i.e. "W", "X", "Y", "Z", respectively.

In such situations, is it possible to avoid writing var1 = recode(var1, `1`="W", `2`="X", `3`="Y", `4`="Z") and instead loop the recoding?

library(tidyverse)

(dat <- data.frame(var1 = rep(1:4,2), id = 1:8))

mutate(dat, var1 = recode(var1,`1`="W",`2`="X",`3`="Y",`4`="Z")) # This works but can we 
                                                                 # loop it as well?

CodePudding user response:

We can use a vectorized approach, no loops necessary. tail and base subsetting with [ will do the trick here.

library(dplyr)

dat %>% mutate(var1=tail(LETTERS, max(var1))[var1] %>% as.factor)
  var1 id
1    W  1
2    X  2
3    Y  3
4    Z  4
5    W  5
6    X  6
7    Y  7
8    Z  8
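
To make the lookup explicit, here is a minimal base R sketch of the two steps the one-liner combines. The names lookup and recoded are only illustrative and do not appear in the answer above; the sketch assumes var1 holds the integers 1 to max(var1).

```r
# Illustrative breakdown of the tail() + `[` lookup (variable names are hypothetical)
var1 <- rep(1:4, 2)

# Step 1: take the last max(var1) letters of the alphabet as a lookup vector
lookup <- tail(LETTERS, max(var1))   # "W" "X" "Y" "Z"

# Step 2: index the lookup vector by var1; `[` is vectorized, so no loop is needed
recoded <- lookup[var1]              # "W" "X" "Y" "Z" "W" "X" "Y" "Z"

# Convert to a factor, as in the mutate() call above
recoded_factor <- as.factor(recoded)
```

Because the recoding is a plain index into a vector, the order of the values in var1 does not matter, which is why the same call also works on dat2 below.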

data

dat <- data.frame(var1 = rep(1:4,2), id = 1:8)

data2

dat2 <- data.frame(var1 = c(2,1,3,1,4:1), id = 1:8)

  var1 id
1    2  1
2    1  2
3    3  3
4    1  4
5    4  5
6    3  6
7    2  7
8    1  8

output2

  var1 id
1    X  1
2    W  2
3    Y  3
4    W  4
5    Z  5
6    Y  6
7    X  7
8    W  8

CodePudding user response:

You can use index arithmetic on LETTERS:

library(dplyr)

dat %>% mutate(var1 = LETTERS[length(LETTERS) - max(var1) + var1])

#  var1 id
#1    W  1
#2    X  2
#3    Y  3
#4    Z  4
#5    W  5
#6    X  6
#7    Y  7
#8    Z  8
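
The index arithmetic behind this approach can be checked step by step. The following is a small sketch in base R; offset and idx are hypothetical names introduced only for illustration.

```r
# Worked arithmetic for shifting var1 to the end of LETTERS (names are illustrative)
var1 <- rep(1:4, 2)

# Number of letters to skip so that max(var1) lands on "Z"
offset <- length(LETTERS) - max(var1)   # 26 - 4 = 22

# Shift every value of var1 by that offset
idx <- offset + var1                    # 23 24 25 26 23 24 25 26

# Look the shifted positions up in LETTERS
recoded <- LETTERS[idx]                 # "W" "X" "Y" "Z" "W" "X" "Y" "Z"
```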

CodePudding user response:

You can also just use the labels argument of factor():

library(dplyr)
dat <- data.frame(var1 = rep(1:4,2), id = 1:8) %>% 
  mutate(var1 = factor(var1, labels = tail(LETTERS, 4)))
dat

  var1 id
1    W  1
2    X  2
3    Y  3
4    Z  4
5    W  5
6    X  6
7    Y  7
8    Z  8
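
One caveat worth noting: by default factor() takes its levels from the sorted unique values in the data, so the labels line up only if every value from 1 to 4 actually occurs. A possible safeguard, sketched here as an assumption and not as part of the answer above, is to pass levels explicitly. The example data with the value 3 missing is made up for illustration.

```r
# Hypothetical data in which the value 3 never occurs
var1 <- c(1, 2, 4, 2)

# Passing levels explicitly keeps the 1 -> "W", ..., 4 -> "Z" mapping fixed,
# even though one of the values is absent from the data
f <- factor(var1, levels = 1:4, labels = tail(LETTERS, 4))

as.character(f)   # "W" "X" "Z" "X"
levels(f)         # "W" "X" "Y" "Z"
```

Without the levels argument, the same call would fail here because only three distinct values are present while four labels are supplied.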