R data.table way to create summary statistics table with self-defined function

Time:06-26

I am in the process of switching to data.table and have not yet found a data.table way to build a table of summary statistics from a self-defined function. Until now I have used dplyr for this; the code is below. Is there a neat way to achieve the same result with data.table?

library(dplyr)
library(mlbench)
data(BostonHousing)
df <- BostonHousing

fun_stats <- function(x) {
  # Return a named list of summary statistics, ignoring missing values
  list(
    min = min(x, na.rm = TRUE),
    max = max(x, na.rm = TRUE),
    mean = mean(x, na.rm = TRUE)
  )
}

stats <- df %>%
  select_if(is.numeric) %>%
  purrr::map(fun_stats) %>%
  bind_rows(., .id = "var")

Answer:

library(data.table)
library(mlbench)
data(BostonHousing)
dt <- as.data.table(BostonHousing)

fun_stats <- function(x) {
  # Return a named list of summary statistics, ignoring missing values
  list(
    min = min(x, na.rm = TRUE),
    max = max(x, na.rm = TRUE),
    mean = mean(x, na.rm = TRUE)
  )
}

dt[, rbindlist(lapply(.SD, fun_stats), idcol = "var"), 
   .SDcols = is.numeric]
#>         var       min      max        mean
#>      <char>     <num>    <num>       <num>
#>  1:    crim   0.00632  88.9762   3.6135236
#>  2:      zn   0.00000 100.0000  11.3636364
#>  3:   indus   0.46000  27.7400  11.1367787
#>  4:     nox   0.38500   0.8710   0.5546951
#>  5:      rm   3.56100   8.7800   6.2846344
#>  6:     age   2.90000 100.0000  68.5749012
#>  7:     dis   1.12960  12.1265   3.7950427
#>  8:     rad   1.00000  24.0000   9.5494071
#>  9:     tax 187.00000 711.0000 408.2371542
#> 10: ptratio  12.60000  22.0000  18.4555336
#> 11:       b   0.32000 396.9000 356.6740316
#> 12:   lstat   1.73000  37.9700  12.6530632
#> 13:    medv   5.00000  50.0000  22.5328063

Created on 2022-06-24 by the reprex package (v2.0.1)
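
For readers new to the idiom, here is a minimal sketch of how the call above works, run on a small made-up table instead of BostonHousing so that it needs only data.table. The table `dt`, its columns `a`, `b` and `g`, and the names `stats`, `long` and `stats_long` are illustrative assumptions, not part of the original answer. The second half shows one possible alternative (melt to long format, then group by variable), which is a different technique from the one used in the answer.

```r
library(data.table)

# Toy data (hypothetical, not from the original post):
# two numeric columns and one factor column.
dt <- data.table(
  a = c(1, 2, NA, 4),
  b = c(10, 20, 30, 40),
  g = factor(c("x", "y", "x", "y"))
)

# Return the named list explicitly instead of relying on the value of an assignment.
fun_stats <- function(x) {
  list(
    min  = min(x, na.rm = TRUE),
    max  = max(x, na.rm = TRUE),
    mean = mean(x, na.rm = TRUE)
  )
}

# .SDcols = is.numeric restricts .SD to the numeric columns (a and b here).
# lapply() produces one named list per column; rbindlist() stacks them as rows
# and stores the column names in "var" via idcol.
stats <- dt[, rbindlist(lapply(.SD, fun_stats), idcol = "var"), .SDcols = is.numeric]
print(stats)

# Alternative sketch: reshape to long format, then apply the function per variable.
long <- melt(dt, measure.vars = c("a", "b"), variable.name = "var")
stats_long <- long[, fun_stats(value), by = "var"]
print(stats_long)
```

Because `fun_stats` returns a named list, data.table turns each element into a column in the grouped call, so both approaches should give the same min, max and mean per variable. One difference is that the long-format version stores `var` as a factor rather than as character.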
