Using built-in datasets as function arguments

Time:11-27

I have the code below, which is just a basic function, but I get this error and I'm not sure why.

> stats(mtcars,mpg)
[1] NA
Warning messages:
1: In min(df$variable) : no non-missing arguments to min; returning Inf
2: In max(df$variable) : no non-missing arguments to max; returning -Inf
3: In mean.default(df$variable) : argument is not numeric or logical: returning NA

stats <- function(dataset, variable){
  min(dataset$variable)
  max(dataset$variable)
  median(dataset$variable)
  mean(dataset$variable)
  sd(dataset$variable)
}

stats(mtcars,mpg)

I tried putting mtcars into a data frame and that didn't work. I am inexperienced with R, so I do not know how to troubleshoot well.

CodePudding user response:

Here is another option:

library(tidyverse)

stats <- function(dataset, variable){
  dataset |>
    summarise(across({{variable}}, list(min = min, max = max, median = median,
                                        mean = mean, sd = sd), .names = "{.fn}")) |>
    as.list()
}

stats(mtcars, mpg) 
#> $min
#> [1] 10.4
#> 
#> $max
#> [1] 33.9
#> 
#> $median
#> [1] 19.2
#> 
#> $mean
#> [1] 20.09062
#> 
#> $sd
#> [1] 6.026948

Or a different option:

stats <- function(dataset, variable){
  map(c(min, max, median, mean, sd), \(f) f(pull(dataset, {{variable}}))) |>
    set_names(c("min", "max", "median", "mean", "sd"))
}

stats(mtcars, mpg) 
#> $min
#> [1] 10.4
#> 
#> $max
#> [1] 33.9
#> 
#> $median
#> [1] 19.2
#> 
#> $mean
#> [1] 20.09062
#> 
#> $sd
#> [1] 6.026948

CodePudding user response:

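We need [[ instead of $. The $ operator does not evaluate its right-hand side, so dataset$variable looks for a column literally named "variable" and returns NULL. If we pass an unquoted variable, we can convert it to a string inside the function with deparse(substitute()) and then use [[. Also, a function returns only the value of its last expression, so return a list or a vector (c) if we want more than one output.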

stats <- function(dataset, variable){
  variable <- deparse(substitute(variable))
  list(Min = min(dataset[[variable]], na.rm = TRUE),
  Max = max(dataset[[variable]], na.rm = TRUE),
  Median = median(dataset[[variable]], na.rm = TRUE),
  Mean = mean(dataset[[variable]], na.rm = TRUE),
  SD = sd(dataset[[variable]], na.rm = TRUE))
}

Testing:

> stats(mtcars, mpg)
$Min
[1] 10.4

$Max
[1] 33.9

$Median
[1] 19.2

$Mean
[1] 20.09062

$SD
[1] 6.026948
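
To see the two causes of the original error in isolation, here is a minimal base-R sketch. The helper name col_range and the choice to pass the column name as a quoted string are assumptions made for illustration; they are not part of the answers above.

```r
# `$` uses the name after it literally, so this looks for a column
# called "variable", which mtcars does not have.
is.null(mtcars$variable)

# `[[` evaluates its argument, so a column name stored in a variable works.
variable <- "mpg"
identical(mtcars[[variable]], mtcars$mpg)

# A function returns only its last expression, so several results
# must be collected into one object (here a named numeric vector).
# col_range is a hypothetical helper that takes the column name as a string.
col_range <- function(dataset, variable) {
  x <- dataset[[variable]]
  c(min = min(x, na.rm = TRUE), max = max(x, na.rm = TRUE))
}

col_range(mtcars, "mpg")
```

Passing the name as a string avoids deparse(substitute()) entirely, at the cost of the caller having to write the quotes.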