Home > Software design >  Melt list of identical dataframes to one dataframe in R
Melt list of identical dataframes to one dataframe in R

Time:09-09

I have a list waterlevels of nine identical dataframes containing information about piezometers K01 to K09. Each of these dataframes contains 96860 rows, all of them having 13 variables, except for one having 21 variables (the extra columns in these one are not important).

$K01
           dateandtime seconds pressure_kPa temperature_.C baropressure_kpa barocompensation_kpa waterlevel_cm lengthcables_cm
1  2019-11-05 00:00:00       0      111.482         13.261           98.854               12.628      128.7697             490
2  2019-11-05 00:15:00     900      111.506         13.261           98.883               12.623      128.7188             490
3  2019-11-05 00:30:00    1800      111.511         13.261           98.872               12.639      128.8819             490
4  2019-11-05 00:45:00    2700      111.544         13.261           98.898               12.646      128.9533             490
5  2019-11-05 01:00:00    3600      111.536         13.313           98.913               12.623      128.7188             490 

I want to melt this list to one dataframe containing only one variable barocompensation_kpa and the dateandtime variable using following code:

waterlevels_all <- melt(waterlevels, id.vars=c("dateandtime", "barocompensation_kpa"))

But then the resulting dataframe waterlevels_all has 10364020 observations, while only this df only should have 871740 rows (9*96860). So it has to have 3 variables: dateandtime, barocompensation_kpa and L1 being the name of corresponding dataframe (= name of the piezometer - K01 to K09).

> head(waterlevels_all)
        dateandtime barocompensation_kpa variable value  L1
1 2019-11-05 00:00:00         17.55336  seconds     0 K01
2 2019-11-05 00:15:00         17.55489  seconds   900 K01
3 2019-11-05 00:30:00         17.55703  seconds  1800 K01
4 2019-11-05 00:45:00         17.55347  seconds  2700 K01
5 2019-11-05 01:00:00         17.55540  seconds  3600 K01
6 2019-11-05 01:15:00         17.56050  seconds  4500 K01

How is this possible? How do I get the right dataframe?

CodePudding user response:

Here are some options with tidyverse - Loop over the list with imap, select the columns of interest and create the new column 'L1' with the list element name (.y)

library(dplyr)
library(purrr)
imap_dfr(waterlevels, ~ .x %>% 
   select(dateandtime, barocompensation_kpa) %>%
   mutate(L1 = .y))

Or with map, just select the columns and create the new column with .id (_dfr returns a single dataset by rbinding the list elements)

map_dfr(waterlevels, ~ .x %>% 
     select(dateandtime, barocompensation_kpa), .id = "L1")

CodePudding user response:

Try lapply with a subsequent Map and rbind.

lapply(lst, subset, select=c('dateandtime', 'barocompensation_kpa')) |>
  Map(f=`[<-`, 'L1', value=names(lst)) |>
  # c(make.row.names=FALSE) |>  ## optional, to reset row names
  do.call(what=rbind)
#       dateandtime barocompensation_kpa  L1
# K01.1           0                    0 K01
# K01.2           0                    0 K01
# K01.3           0                    0 K01
# K02.1           0                    0 K02
# K02.2           0                    0 K02
# K02.3           0                    0 K02
# K03.1           0                    0 K03
# K03.2           0                    0 K03
# K03.3           0                    0 K03

Data:

lst <- list(K01 = structure(list(dateandtime = c(0, 0, 0), seconds = c(0, 
0, 0), pressure_kPa = c(0, 0, 0), temperature_.C = c(0, 0, 0), 
    baropressure_kpa = c(0, 0, 0), barocompensation_kpa = c(0, 
    0, 0), waterlevel_cm = c(0, 0, 0), lengthcables_cm = c(0, 
    0, 0)), row.names = c(NA, -3L), class = "data.frame"), K02 = structure(list(
    dateandtime = c(0, 0, 0), seconds = c(0, 0, 0), pressure_kPa = c(0, 
    0, 0), temperature_.C = c(0, 0, 0), baropressure_kpa = c(0, 
    0, 0), barocompensation_kpa = c(0, 0, 0), waterlevel_cm = c(0, 
    0, 0), lengthcables_cm = c(0, 0, 0)), row.names = c(NA, -3L
), class = "data.frame"), K03 = structure(list(dateandtime = c(0, 
0, 0), seconds = c(0, 0, 0), pressure_kPa = c(0, 0, 0), temperature_.C = c(0, 
0, 0), baropressure_kpa = c(0, 0, 0), barocompensation_kpa = c(0, 
0, 0), waterlevel_cm = c(0, 0, 0), lengthcables_cm = c(0, 0, 
0), X9 = c(0, 0, 0), X10 = c(0, 0, 0), X11 = c(0, 0, 0), X12 = c(0, 
0, 0), X13 = c(0, 0, 0), X14 = c(0, 0, 0), X15 = c(0, 0, 0), 
    X16 = c(0, 0, 0), X17 = c(0, 0, 0), X18 = c(0, 0, 0), X19 = c(0, 
    0, 0), X20 = c(0, 0, 0), X21 = c(0, 0, 0)), class = "data.frame", row.names = c(NA, 
-3L)))
  • Related