I have a "legend" with a key and several columns:
library(dplyr)
(legend <- tibble(key = 1:3,
value_1 = c("x", "y", "z"),
value_2 = c("fg", "d", "f"),
value_3 = c("dsf", "sdf", "sdf")))
#> # A tibble: 3 × 4
#> key value_1 value_2 value_3
#> <int> <chr> <chr> <chr>
#> 1 1 x fg dsf
#> 2 2 y d sdf
#> 3 3 z f sdf
In my real data, I have the same columns but they take the values of the legend key.
(data = tibble(value_1 = c(1, 3, 3),
value_2 = c(2, 2, 2),
value_3 = c(1, 2, 3)))
#> # A tibble: 3 × 3
#> value_1 value_2 value_3
#> <dbl> <dbl> <dbl>
#> 1 1 2 1
#> 2 3 2 2
#> 3 3 2 3
It is possible to match my datasets together in a way to "update" my dataset using the legend?
# expected output
tibble(value_1 = c("x", "z", "z"),
value_2 = c("d", "d", "d"),
value_3 = c("dsf", "sdf", "sdf"))
#> # A tibble: 3 × 3
#> value_1 value_2 value_3
#> <chr> <chr> <chr>
#> 1 x d dsf
#> 2 z d sdf
#> 3 z d sdf
Created on 2022-06-02 by the reprex package (v2.0.1)
CodePudding user response:
As these are position indexes in 'data' columns, loop across
the 'value_' columns of 'legend', get the corresponding column based on the column name (cur_column()
) from 'data', use that as numeric index to subset the values of the 'legend' column
library(dplyr)
legend <- legend %>%
mutate(across(starts_with('value_'), ~ .x[data[[cur_column()]]]))
-output
legend
# A tibble: 3 × 4
key value_1 value_2 value_3
<int> <chr> <chr> <chr>
1 1 x d dsf
2 2 z d sdf
3 3 z d sdf
CodePudding user response:
You can use purrr
as well. This should be quicker than mutate()
, but also assumes that the values are keyed equal to their row number.
library(purrr)
map_dfc(set_names(names(data)),
~ legend[[.x]][data[[.x]]])
# A tibble: 3 x 3
value_1 value_2 value_3
<chr> <chr> <chr>
1 x d dsf
2 z d sdf
3 z d sdf