How do I update the values of a dataset using matched values in a legend?-CodePudding

I have a "legend" with a key and several columns:

library(dplyr)

(legend <- tibble(key = 1:3, 
               value_1 = c("x", "y", "z"),
               value_2 = c("fg", "d", "f"),
               value_3 = c("dsf", "sdf", "sdf")))
#> # A tibble: 3 × 4
#>     key value_1 value_2 value_3
#>   <int> <chr>   <chr>   <chr>  
#> 1     1 x       fg      dsf    
#> 2     2 y       d       sdf    
#> 3     3 z       f       sdf

In my real data, I have the same columns but they take the values of the legend key.

(data = tibble(value_1 = c(1, 3, 3),
              value_2 = c(2, 2, 2),
              value_3 = c(1, 2, 3)))
#> # A tibble: 3 × 3
#>   value_1 value_2 value_3
#>     <dbl>   <dbl>   <dbl>
#> 1       1       2       1
#> 2       3       2       2
#> 3       3       2       3

It is possible to match my datasets together in a way to "update" my dataset using the legend?

# expected output
  tibble(value_1 = c("x", "z", "z"),
         value_2 = c("d", "d", "d"),
         value_3 = c("dsf", "sdf", "sdf"))
#> # A tibble: 3 × 3
#>   value_1 value_2 value_3
#>   <chr>   <chr>   <chr>  
#> 1 x       d       dsf    
#> 2 z       d       sdf    
#> 3 z       d       sdf

^{Created on 2022-06-02 by the reprex package (v2.0.1)}

CodePudding user response：

As these are position indexes in 'data' columns, loop across the 'value_' columns of 'legend', get the corresponding column based on the column name (cur_column()) from 'data', use that as numeric index to subset the values of the 'legend' column

library(dplyr)
legend <- legend %>% 
   mutate(across(starts_with('value_'), ~ .x[data[[cur_column()]]]))

-output

legend
# A tibble: 3 × 4
    key value_1 value_2 value_3
  <int> <chr>   <chr>   <chr>  
1     1 x       d       dsf    
2     2 z       d       sdf    
3     3 z       d       sdf

CodePudding user response：

You can use purrr as well. This should be quicker than mutate(), but also assumes that the values are keyed equal to their row number.

library(purrr)

map_dfc(set_names(names(data)),
        ~ legend[[.x]][data[[.x]]])

# A tibble: 3 x 3
  value_1 value_2 value_3
  <chr>   <chr>   <chr>  
1 x       d       dsf    
2 z       d       sdf    
3 z       d       sdf