How to create one dataframe from a list whose elements are lists containing one dataframe each in R

Time:11-18

I am trying to build a single dataframe from a set of KML files. I have 52 different files in my dataset, and I have already loaded them into R using the following code chunk:

#importing data
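library(fs)
library(sf)      # st_read()
library(stringr) # str_extract()
library(dplyr)   # %>%, filter(), row_number()
library(purrr)   # set_names()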
file_paths = fs::dir_ls("C:/Users/JoaoArbache/Desktop/Mestrado/carbono/dados")
file_contents = list()

for(i in seq_along(file_paths)) {
  file_contents[[i]] = st_read(
    dsn  = file_paths[[i]]
  )
}

#renaming the lists
numeros = list()
for(i in file_paths) {
  # extract the first run of digits found in the file path
  numeros[[i]] = str_extract(i, "\\d+") %>% 
                   as.numeric()
}
id = do.call(rbind.data.frame, numeros) %>% 
    filter(!row_number() %in% c(53))
colnames(id)[1] = "id"

file_contents = set_names(file_contents, id$id)

OK, so far everything is all right: all 52 files are loaded into the file_contents list. Now I need to take each of the 52 elements of file_contents, each of which contains one dataframe, and bind them into a single dataframe. I've tried lots of different ways to solve this problem, but I have always failed.

Thanks for the support :)

I tried different loops, the do.call function, and some native R functions, but none of them worked. I'd either get an error message, e.g.

Error in `[[<-`(`*tmp*`, i, value = as.data.frame(i)) : 
  attempt to select more than one element in vectorIndex

or just create a dataframe with the first element of the file_contents list. I was expecting to get a single dataframe with the 52 dataframes bound together.

CodePudding user response:

Have you tried rbindlist from data.table?

library(data.table)
rbindlist(file_contents, use.names = TRUE, fill = TRUE)

That assumes the column names are the same; if they are not, set use.names = FALSE.
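
To keep track of which list element each row came from, rbindlist also accepts an idcol argument. A minimal sketch, assuming file_contents is a named list; the two small data frames below are made-up stand-ins for the real KML contents, not the asker's data:

```r
library(data.table)

# Hypothetical stand-ins for two elements of file_contents (made-up values)
file_contents <- list(
  "1" = data.frame(Name = c("a", "b"), area = c(1.5, 2.0)),
  "2" = data.frame(Name = "c", area = 3.1, note = "extra")
)

# Bind by column name; fill = TRUE pads columns missing from some elements with NA;
# idcol = "id" stores the list names in a new first column
combined <- rbindlist(file_contents, use.names = TRUE, fill = TRUE, idcol = "id")

print(combined)
```

The result is a data.table rather than an sf object, so if the spatial class matters it may need to be restored afterwards (for example with sf::st_as_sf(), which is untested here).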

CodePudding user response:

You can use purrr::map_dfr on a list of files to build a single dataset, provided all of the files are regularly shaped (have the same columns). Below is an example using the nc dataset included with the sf package.

library(sf)
library(dplyr) 
library(purrr)

# make a temporary directory for the example
temp_dir <- tempdir()

# read nc data 
nc <- st_read(system.file("shape/nc.shp", package="sf"))

# create two datasets with all the same columns, but different data
one <- nc[1:3,] 
two <- nc[55,]

# write two separate kml objects to disk
st_write(one, paste0(temp_dir, "/", "one.kml"))
#> Writing layer `one' to data source `/tmp/RtmpfRjSGc/one.kml' using driver `KML'
#> Writing 3 features with 14 fields and geometry type Multi Polygon.
st_write(two, paste0(temp_dir, "/", "two.kml"))
#> Writing layer `two' to data source `/tmp/RtmpfRjSGc/two.kml' using driver `KML'
#> Writing 1 features with 14 fields and geometry type Multi Polygon.

# show the files on disk, just for illustration
list.files(path = temp_dir, pattern = "\\.kml$", full.names = TRUE)
#> [1] "/tmp/RtmpfRjSGc/one.kml" "/tmp/RtmpfRjSGc/two.kml"

# read the two files & make them one dataframe:
together <- temp_dir %>%
  list.files(pattern = "\\.kml$", full.names = TRUE) %>%
  map_dfr(st_read)
#> Reading layer `one' from data source `/tmp/RtmpfRjSGc/one.kml' using driver `LIBKML'
#> Simple feature collection with 3 features and 24 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -81.74091 ymin: 36.23402 xmax: -80.43509 ymax: 36.58977
#> Geodetic CRS:  WGS 84
#> Reading layer `two' from data source `/tmp/RtmpfRjSGc/two.kml' using driver `LIBKML'
#> Simple feature collection with 1 feature and 24 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -83.259 ymin: 35.29087 xmax: -82.74374 ymax: 35.79195
#> Geodetic CRS:  WGS 84

head(together)
#> Simple feature collection with 4 features and 24 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -83.259 ymin: 35.29087 xmax: -80.43509 ymax: 36.58977
#> Geodetic CRS:  WGS 84
#>        Name description timestamp begin  end altitudeMode tessellate extrude
#> 1      Ashe        <NA>      <NA>  <NA> <NA>         <NA>         -1       0
#> 2 Alleghany        <NA>      <NA>  <NA> <NA>         <NA>         -1       0
#> 3     Surry        <NA>      <NA>  <NA> <NA>         <NA>         -1       0
#> 4   Haywood        <NA>      <NA>  <NA> <NA>         <NA>         -1       0
#>   visibility drawOrder icon  AREA PERIMETER CNTY_ CNTY_ID  FIPS FIPSNO CRESS_ID
#> 1         -1        NA <NA> 0.114     1.442  1825    1825 37009  37009        5
#> 2         -1        NA <NA> 0.061     1.231  1827    1827 37005  37005        3
#> 3         -1        NA <NA> 0.143     1.630  1828    1828 37171  37171       86
#> 4         -1        NA <NA> 0.144     1.690  1996    1996 37087  37087       44
#>   BIR74 SID74 NWBIR74 BIR79 SID79 NWBIR79                       geometry
#> 1  1091     1      10  1364     0      19 MULTIPOLYGON (((-81.47258 3...
#> 2   487     0      10   542     3      12 MULTIPOLYGON (((-81.2397 36...
#> 3  3188     5     208  3616     6     260 MULTIPOLYGON (((-80.45612 3...
#> 4  2110     2      57  2463     8      62 MULTIPOLYGON (((-82.74374 3...

Created on 2022-11-17 by the reprex package (v2.0.1)
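
If the files are already loaded into a named list, as with file_contents in the question, the list can be bound directly without reading the files again. A minimal sketch, assuming every element has the same columns; the two-element list built from nc is a hypothetical stand-in for the 52 real elements:

```r
library(sf)
library(dplyr)

# read the nc data shipped with sf (quiet = TRUE suppresses the reading messages)
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Hypothetical stand-in for file_contents: a named list of sf data frames
file_contents <- list("1" = nc[1:3, ], "2" = nc[55, ])

# Option 1: base R, calling rbind on all list elements at once
combined_base <- do.call(rbind, file_contents)

# Option 2: dplyr, keeping the list names in a new "id" column
combined_dplyr <- bind_rows(file_contents, .id = "id")

nrow(combined_base)
nrow(combined_dplyr)
```

do.call(rbind, ...) dispatches to the rbind method that sf provides, while bind_rows records the list names in the id column, which may be handy for tracing each row back to its file.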
