Produce dataframe from rollapply that combines function names and those of original dataset

Time:01-17

Is it possible to have rollapply return a dataframe (or similar) with merged column headers, that is, headers that combine the names produced by my summary function with the column names of the original dataset (sensor_data)?

The dataframe output I want would look like this:

|mean.x|median.x|max.x|mn.x|sd.x|mean.y|median.y|max.y|mn.y|sd.y|....|meanz...,|meanODBA|

Here's the function:

time_domain_summary <- function(values) {
  # One-row data frame of summary statistics for a single window
  features <- data.frame(
    mean = mean(values, na.rm = TRUE),
    median = quantile(values, probs = c(0.5), na.rm = TRUE),
    mx = max(values, na.rm = TRUE),
    mn = min(values, na.rm = TRUE),
    sd = sd(values, na.rm = TRUE)
  )
  return(features)
}

Here's the call using rollapply:

feats <- data.frame(rollapply(sensor_data, FUN = time_domain_summary, width = 2, by = 1, by.column = TRUE, align = "left", partial = FALSE))

Example data

sensor_data <- structure(list(x = c(-0.45, -0.35, -0.375, -0.325, -0.25, -0.225,
-0.125, -0.175, -0.175, -0.35), y = c(-0.725, -0.575, -0.8, -0.775, 
-0.525, -0.625, -0.75, -0.775, -0.725, -0.725), z = c(0.775, 
0.75, 0.85, 0.875, 0.575, 0.65, 0.85, 0.75, 0.825, 0.675), ODBA = c(0.155, 
0.14, 0.31, 0.325, 0.37, 0.21, 0.26, 0.23, 0.295, 0.04), VeDBA = c(0.110113577727726, 
0.0966953980290686, 0.179861057486049, 0.190984292547843, 0.22726636354727, 
0.138744369255116, 0.158587515271537, 0.133977610069743, 0.190065778087482, 
0.0308220700148449), energy = c(1.32875, 1.015625, 1.503125, 
1.471875, 0.66875, 0.86375, 1.300625, 1.19375, 1.236875, 1.10375
), mov.intensity = c(1.15271418833985, 1.00778221853732, 1.22601998352392, 
1.21320855585509, 0.817771361689806, 0.929381514772055, 1.1404494727957, 
1.09258866917061, 1.11214882097676, 1.05059506947253), mov.var = c(0, 
0.225, -0.15, 0.1, 0.025, 8.32667268468867e-17, 0.175, -0.175, 
0.125, -0.325)), row.names = c(NA, 10L), class = "data.frame")

CodePudding user response:

Instead of using by.column = TRUE, loop over the columns with lapply and call rollapply with time_domain_summary on each column. This returns a list with one element per original column. Then prefix the column names of each list element with the name of the original column (using paste0) and cbind the list elements together.

library(zoo)
lst1 <- lapply(sensor_data, \(x) rollapply(x,
   FUN = time_domain_summary, width = 2, by = 1, align = c("left"), 
   partial = FALSE))
out <- do.call(cbind, Map(\(x, y) {
       colnames(x) <- paste0(y, "_", colnames(x))
     x}, lst1, names(lst1)))

-output

> out[, 1:10]
      x_mean x_median   x_mx   x_mn       x_sd  y_mean y_median   y_mx   y_mn       y_sd
50%  -0.4000  -0.4000 -0.350 -0.450 0.07071068 -0.6500  -0.6500 -0.575 -0.725 0.10606602
50%1 -0.3625  -0.3625 -0.350 -0.375 0.01767767 -0.6875  -0.6875 -0.575 -0.800 0.15909903
50%2 -0.3500  -0.3500 -0.325 -0.375 0.03535534 -0.7875  -0.7875 -0.775 -0.800 0.01767767
50%3 -0.2875  -0.2875 -0.250 -0.325 0.05303301 -0.6500  -0.6500 -0.525 -0.775 0.17677670
50%4 -0.2375  -0.2375 -0.225 -0.250 0.01767767 -0.5750  -0.5750 -0.525 -0.625 0.07071068
50%5 -0.1750  -0.1750 -0.125 -0.225 0.07071068 -0.6875  -0.6875 -0.625 -0.750 0.08838835
50%6 -0.1500  -0.1500 -0.125 -0.175 0.03535534 -0.7625  -0.7625 -0.750 -0.775 0.01767767
50%7 -0.1750  -0.1750 -0.175 -0.175 0.00000000 -0.7500  -0.7500 -0.725 -0.775 0.03535534
50%8 -0.2625  -0.2625 -0.175 -0.350 0.12374369 -0.7250  -0.7250 -0.725 -0.725 0.00000000
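
The output above uses names of the form x_mean, whereas the question asked for the form mean.x. Swapping the arguments in the paste0 call and using "." as the separator gives that order. As a minimal sketch of the same idea in base R only, assuming a left-aligned window and no missing values (roll_summary and toy are hypothetical names used for illustration, not part of zoo):

```r
# Hypothetical helper (not part of zoo): rolling summaries per column,
# with result columns named "<statistic>.<original column>".
roll_summary <- function(df, width = 2) {
  stopifnot(nrow(df) >= width)
  starts <- seq_len(nrow(df) - width + 1)   # left-aligned window start rows
  per_col <- lapply(names(df), function(col) {
    # sapply returns statistics in rows, windows in columns; transpose it
    stats <- t(sapply(starts, function(i) {
      w <- df[[col]][i:(i + width - 1)]
      c(mean = mean(w), median = median(w), mx = max(w), mn = min(w), sd = sd(w))
    }))
    # Statistic name first, then the original column name
    colnames(stats) <- paste0(colnames(stats), ".", col)
    stats
  })
  as.data.frame(do.call(cbind, per_col))
}

# Small made-up subset of the example data, for illustration only
toy <- data.frame(x = c(-0.45, -0.35, -0.375), y = c(-0.725, -0.575, -0.8))
res <- roll_summary(toy, width = 2)
names(res)
```

Because this sketch calls median() rather than quantile(), the result has plain integer row names instead of the "50%" labels seen in the output above.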