Apply a function between two lists of data frames

Time:08-07

I have the following data example and code:

library(dplyr)  # provides the %>% pipe

lt1 <- list(data.frame(V1 = c("a", "b"),
                       V2 = c("b", "c"),
                       V3 = c(1, 2)),
            data.frame(V1 = c("x", "y"),
                       V2 = c("x", "z"),
                       V3 = c(1, 2)))

# collect the sorted unique values of the first two columns
lvls_func <- function(x) {
  x[1:2] %>% 
    unlist() %>% 
    unique() %>% 
    sort()
}
lt_lvls <- lapply(lt1, lvls_func)


complete_func <- function(x) {
  tidyr::complete(x[1] = factor(x[1], levels = lt_lvls),
                  x[2] = factor(x[2], levels = lt_lvls),
                  x[3] = x[3],
                  fill = list(x[3] = 0))
  }

lt1_final <- lapply(lt1, complete_func)

I am having difficulty building my complete_func().

I get this error when I run the code for complete_func():

Error: unexpected '=' in:
"complete_func <- function(x) {
  tidyr::complete(x[1] ="

In my final list lt1_final I expect this output:

lt1_final <- list(data.frame(V1 = c("a", "b", "a", "a", "b", "b", "c", "c", "c"),
                             V2 = c("b", "c", "a", "c", "b", "a", "a", "b", "c"),
                             V3 = c(1, 2, 0, 0, 0, 0, 0, 0, 0)),
                  data.frame(V1 = c("x", "y", "x", "x", "y", "y", "z", "z", "z"),
                             V2 = c("x", "z", "y", "z", "y", "x", "z", "x", "y"),
                             V3 = c(1, 2, 0, 0, 0, 0, 0, 0, 0)))

Thanks for any help.

CodePudding user response:

Because lt_lvls is a list of level vectors (one per data frame), the two lists have to be iterated in parallel, using either Map (from base R) or purrr::map2. The original error occurs because x[1] = ... is not a valid argument name in a function call, so R fails at parse time.

In addition, rewrite the function using across. The function changes in three ways:

  1. Add an argument lvls to the function
  2. Convert columns 1 and 2 to factor by looping over them with across inside mutate, specifying lvls as the levels
  3. Apply complete to those two columns using splicing (!!!) (invoke/exec would also work), and pass fill as a named list built with dplyr::lst (or a regular list with setNames)

library(dplyr)
library(tidyr)
library(purrr)
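complete_func <- function(x, lvls) {
  x %>%
    # convert the first two columns to factors with the supplied levels
    dplyr::mutate(across(1:2, ~ factor(.x, levels = lvls))) %>%
    # expand to all level combinations; fill the third column with 0
    tidyr::complete(!!! .[1:2], fill = dplyr::lst(!! names(.)[3] := 0)) %>%
    # put rows with a non-zero third column first
    arrange(across(3, ~ .x == 0))
}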

Testing:

map2(lt1, lt_lvls, ~ complete_func(.x, .y))
[[1]]
# A tibble: 9 × 3
  V1    V2       V3
  <fct> <fct> <dbl>
1 a     b         1
2 b     c         2
3 a     a         0
4 a     c         0
5 b     a         0
6 b     b         0
7 c     a         0
8 c     b         0
9 c     c         0

[[2]]
# A tibble: 9 × 3
  V1    V2       V3
  <fct> <fct> <dbl>
1 x     x         1
2 y     z         2
3 x     y         0
4 x     z         0
5 y     x         0
6 y     y         0
7 z     x         0
8 z     y         0
9 z     z         0
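
For comparison, here is a minimal base R sketch of the Map alternative mentioned at the start of this answer. It is not the code from the answer: it swaps tidyr::complete for expand.grid plus merge, and complete_base is a hypothetical helper name introduced only for this illustration. It assumes, as in the question, that the first two columns are the keys and the third is the value to fill with 0.

```r
# Example data from the question (unnamed list of two data frames)
lt1 <- list(
  data.frame(V1 = c("a", "b"), V2 = c("b", "c"), V3 = c(1, 2),
             stringsAsFactors = FALSE),
  data.frame(V1 = c("x", "y"), V2 = c("x", "z"), V3 = c(1, 2),
             stringsAsFactors = FALSE)
)

# One vector of sorted unique values per data frame
lt_lvls <- lapply(lt1, function(x) sort(unique(unlist(x[1:2]))))

# Hypothetical helper: base R stand-in for tidyr::complete
complete_base <- function(x, lvls) {
  # all combinations of the levels for the two key columns
  grid <- expand.grid(lvls, lvls, stringsAsFactors = FALSE)
  names(grid) <- names(x)[1:2]
  # left join the observed rows onto the full grid
  out <- merge(grid, x, all.x = TRUE)
  # combinations that were not observed get 0 in the third column
  out[[3]][is.na(out[[3]])] <- 0
  out
}

# Map walks both lists in parallel, like purrr::map2
lt1_final <- Map(complete_base, lt1, lt_lvls)
```

Unlike the tidyverse version, this sketch keeps the key columns as character vectors and returns rows sorted by the two key columns rather than with the observed rows first.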