How to obtain a variable number of arguments by "..." or by a list in R

Time:07-24

I wrote this helper function to compare a variable number of input tables using calls to janitor::compare_df_cols.

Sometimes I have a named list of data frames, and sometimes, when there are only 2 or 3, I write their names directly.

I want the helper function to accept either ... or a list() interchangeably, in both cases containing data frames.

Ideally, I want ... to be converted into a named list whose names are those of the variables passed as arguments at the call site (e.g. people, people2, people3).

However, ... loses the names, and it behaves differently depending on whether the conversion to list() happens inside or outside the function.
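
A minimal sketch of the name loss, using a hypothetical helper `show_names` that is not part of the original code: once the arguments are collected with list(...), only explicitly supplied names survive.

```r
# show_names is a hypothetical helper, for illustration only.
show_names <- function(...) names(list(...))

x <- 1
y <- 2

# NULL: the variable names x and y are not carried along
unnamed <- show_names(x, y)

# c("A", "B"): explicitly supplied names are kept
named <- show_names(A = x, B = y)
```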

EXAMPLE DATASET

library(dplyr)    # %>% and mutate()
library(stringr)  # str_replace_all(), used by the helper below

lastnames <- LETTERS[1:4] ;   names  <- c("uno", "dos", "tres", "cuatro");
age <- c(1:4)             ;   height <- seq(190,200,3)
people = data.frame(names, lastnames, age, height)
people2 = people %>% mutate( age = age + 20)
people3 = people; people3$height[[3]] = 160

HELPER FUNCTION

helper_df_compare = function( ..., a_default_arg="def" ){
    ##### Compare mismatching column types
    ##### NOTE: it does not check contents
    rbind(
         janitor::compare_df_cols( ..., return="mismatch" ) %>%
                  mutate( column_name = paste("!!!", column_name) ),
         janitor::compare_df_cols( ..., return="match" )
         ) %>%
    mutate_all( ~str_replace_all(.,c(
      "integer"="int",   "numeric"="num",   "character"="chr",   "factor"="fct",
      "POSIXct, POSIXt"="POSIXct"
      ) ) )
}

INTENDED OPTIONAL CALLING METHODS

helper_df_compare( database_rnamedlist )             # <- preferred

helper_df_compare( list(people, people2, people3) )
helper_df_compare( people, people2, people3 )        # <- preferred

helper_df_compare( list("A"=people, "B"=people2, "C"=people3) )
helper_df_compare( A=people, B=people2, C=people3 )  # <- preferred

CURRENT OUTPUTS (note: the column names should be the names of the tables passed as arguments)

  column_name ..1_1 ..1_2 ..1_3
1     !!! age   int   num   int
2      height   num   num   num
3   lastnames   chr   chr   chr
4       names   chr   chr   chr

  column_name   A   B   C
1     !!! age int num int
2      height num num num
3   lastnames chr chr chr
4       names chr chr chr

EXPECTED OUTPUT:

  column_name people people2 people3
1     !!! age    int     num     int
2      height    num     num     num
3   lastnames    chr     chr     chr
4       names    chr     chr     chr

  column_name   A   B   C
1     !!! age int num int
2      height num num num
3   lastnames chr chr chr
4       names chr chr chr

CodePudding user response:

You can handle the direct input of data frames, both named and unnamed, like this:

helper_df_compare = function( ..., a_default_arg = "def" ){

  # Evaluate the data frames passed through ...
  dots <- rlang::list2(...)
  # Capture the unevaluated arguments so variable names can be recovered
  args <- as.list(match.call())[-1]
  # Give every unnamed element the name of the variable it was passed as
  if(is.null(names(dots))) names(dots) <- rep('', length(dots))
  for(i in seq_along(dots)) {
    if(!nzchar(names(dots)[i])) names(dots)[i] <- as.character(args[[i]])
  }

  # do.call() forwards the named list, so the names become column headers
  rbind(
    do.call(janitor::compare_df_cols, c(dots, return = "mismatch")) %>%
      mutate( column_name = paste("!!!", column_name) ),
    do.call(janitor::compare_df_cols, c(dots, return = "match"))
  ) %>%
    mutate_all( ~str_replace_all(.,c(
      "integer"="int",   "numeric"="num",   "character"="chr",   "factor"="fct",
      "POSIXct, POSIXt"="POSIXct"
    ) ) )
}

This allows

helper_df_compare(people, people2, people3)
#>   column_name people people2 people3
#> 1     !!! age    int     num     int
#> 2      height    num     num     num
#> 3   lastnames    chr     chr     chr
#> 4       names    chr     chr     chr

and this:

helper_df_compare(A = people, B = people2, C = people3)
#>   column_name   A   B   C
#> 1     !!! age int num int
#> 2      height num num num
#> 3   lastnames chr chr chr
#> 4       names chr chr chr
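
The function above covers data frames passed directly. For the list() calling methods in the question, one possible approach is to normalise the input first. The sketch below is not from the original answer: `collect_dfs` and the `df1`, `df2` fallback names are assumptions made for illustration. It uses only base R and returns a named list, which could then be handed to janitor::compare_df_cols through do.call in the same way as `dots` above.

```r
# collect_dfs is a hypothetical helper: it accepts data frames passed
# directly through ... or wrapped in a single list, and returns a named list.
collect_dfs <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]  # unevaluated argument expressions
  dots  <- list(...)                           # evaluated arguments

  # A single argument that is a list but not a data frame is treated
  # as the collection of data frames itself.
  if (length(dots) == 1 && is.list(dots[[1]]) && !is.data.frame(dots[[1]])) {
    inner <- exprs[[1]]
    dots  <- dots[[1]]
    if (is.call(inner) && identical(inner[[1]], quote(list))) {
      # Written inline as list(a, b): recover the element expressions
      exprs <- as.list(inner)[-1]
    } else {
      # Passed as a variable: there are no element expressions to recover
      exprs <- vector("list", length(dots))
    }
  }

  nms <- names(dots)
  if (is.null(nms)) nms <- rep("", length(dots))
  for (i in seq_along(dots)) {
    if (!nzchar(nms[i])) {
      # Use the variable name when there is one, otherwise a positional fallback
      nms[i] <- if (is.symbol(exprs[[i]])) as.character(exprs[[i]]) else paste0("df", i)
    }
  }
  names(dots) <- nms
  dots
}

# Small stand-in data frames for the usage examples
a <- data.frame(x = 1L)
b <- data.frame(x = 2)
lst <- list(a, b)

names_direct  <- names(collect_dfs(a, b))          # "a" "b"
names_inline  <- names(collect_dfs(list(a, b)))    # "a" "b"
names_given   <- names(collect_dfs(A = a, B = b))  # "A" "B"
names_unnamed <- names(collect_dfs(lst))           # "df1" "df2"
```

An unnamed list stored in a variable carries no record of the original variable names, so a fallback is the best that can be done in that case; a named list keeps its own names.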