Home > Enterprise >  Multiple Replace/Lookups
Multiple Replace/Lookups

Time:04-06

I have the data below:

data_1 = data.frame(var1 = c("A", "B", "C"), var2 = c("A1", "A2", "A3"))
data_1$var1 = as.factor(data_1$var1)
data_1$var2 = as.factor(data_1$var2)

  var1 var2
1    A   A1
2    B   A2
3    C   A3

I then have this Lookup Table:

lookup = data.frame(var_1_value = c("A", "B", "C"), var_1_code = c(1,2,3), var_2_value = c("A1", "A2", "A3"), var_2_code = c(4,5,6))

  var_1_value var_1_code var_2_value var_2_code
1           A          1          A1          4
2           B          2          A2          5
3           C          3          A3          6

I want to replace the values in "data_1" with the "codes" from the Lookup Table. I tried to use this question as inspiration (Replace values in a dataframe based on lookup table):

lookup1 <- setNames(as.character(lookup$var_1_code), lookup$var_1_value)
res <- data.frame(lapply(data_1, function(i) lookup1[i]))
rownames(res) <- NULL

But this is not giving the intended output:

> res
  var1 var2
1    1    1
2    2    2
3    3    3

I thought that maybe this is happening because I never mentioned "var2" in the previous code, but this also did not work:

lookup1 <- setNames(as.character(lookup$var_1_code), lookup$var_1_value, as.character(lookup$var_2_code), lookup$var_2_value)
res <- data.frame(lapply(data_1, function(i) lookup1[i]))
rownames(res) <- NULL

Error in setNames(as.character(lookup$var_1_code), lookup$var_1_value,  : 
  unused arguments (as.character(lookup$var_2_code), lookup$var_2_value)
  • Can someone please show me how to perform Lookup/Replace on multiple columns at the same time? (e.g. if there 5 columns, 15 columns, 25 columns, etc.)

The intended output should look like this:

> res
  var1 var2
1    1    4
2    2    5
3    3    6

Thank you!

CodePudding user response:

You could use sapply:

res <- data.frame(
  sapply(
    seq_along(data_1), 
    function(i) lookup[match(data_1[, i], lookup[, 2 * i - 1]), 2 * i]
    )
  )

colnames(res) <- colnames(data_1)

The resulting res looks like

  var1 var2
1    1    4
2    2    5
3    3    6
  • Related