Home > Software engineering >  map left_join to a lot of matching tables by there names in a list
map left_join to a lot of matching tables by there names in a list

Time:07-27

How can I join tables when there are many each related to another?

In the example I have two sets of tables. I try to add the totals from the tables from 2021 to the tables from 2022 as a new column with the name 2021. I have more than 100 of such table sets for 4 years back. Therefor I look for a possibility to map (or loop) the left_joins (or cbind or whatever) by a list of the table names. This is my naive try:

library(dplyr)

# Table 1 from 2022
T1 <- tibble(Variable = c("DS", "SR", "TI", "Total"), 
            Total = c("71%", "24%", "5%", "100%"))
T1

# Table 1 from 2021
T1_2021 <- tibble(Variable = c("DS", "SR", "TI", "Total"), 
                  Total = c("40%", "40%", "20%", "100%"))
T1_2021

# here I simple left_join the T1 and T1_2021 to apply the Total from 2021 to a new column '2021' into T_2022_2021 (works)
T1_2022_and_total_of_2021 <- left_join(T1, T1_2021 %>% 
                           select(Variable, Total) %>%
                           rename('2021' = Total),
                         by = "Variable") 
T1_2022_and_total_of_2021

# other related tables
## Table 2 from 2022
T2 <- tibble(Variable = c("m", "f", "Total"), 
             Total = c("45", "55%", "100%"))

## Table 2 from 2021
T2_2021 <- tibble(Variable = c("m", "f", "Total"), 
                  Total = c("48%", "52%", "100%"))

# then I have a List with the names of my tables from 2022 (in real more than 100)

tab_list <- list("T1", "T2")

# then I try to map the left_joins for every table from 2021 to the equivalent table from 2022 (T1, T2)
map(tab_list, ~left_join(.x, paste0(.x, "_2021") %>% 
                           select(Variable, Total) %>%
                           rename('2021' = Total),
                         by = "Variable"))

How can I get this to work?

CodePudding user response:

We need the values of the objects. Use get or mget (for multiple elements)

library(purrr)
library(dplyr)
map(tab_list, ~left_join(get(.x), get(paste0(.x, "_2021")) %>% 
                           select(Variable, Total) %>%
                           rename('2021' = Total),
                         by = "Variable"))

-output

[[1]]
# A tibble: 4 × 3
  Variable Total `2021`
  <chr>    <chr> <chr> 
1 DS       71%   40%   
2 SR       24%   40%   
3 TI       5%    20%   
4 Total    100%  100%  

[[2]]
# A tibble: 3 × 3
  Variable Total `2021`
  <chr>    <chr> <chr> 
1 m        45    48%   
2 f        55%   52%   
3 Total    100%  100%  
  • Related