How to import and read a shapefile directly from a GitHub repository in R?

Time:02-05

I like to store some files in a GitHub repository so that I can read them in R from anywhere. CSV files, for example, can be read straight from the repository link without first downloading them to a directory on my computer.

I would like to do the same thing with shapefiles, which are geospatial files divided into 4 parts or extensions: *.shp, *.shx, *.prj, *.dbf, each with its own function. The *.shp file is the one that software with spatial support opens.

I tried to load a shapefile from the GitHub repository, but I couldn't.

library(rgdal) 
shapefile <- readOGR("http://wesleysc352.github.io/seg_s3_r3_m10_fix_estat_amost_val.shp")

CodePudding user response:

I'm not sure if there's an easier way, but here's a function to download all the files to a local computer and then read them in.


library(stringr)
library(sf)

load_shp <- function(shape_link, dest_dir = "shape_files") {

  # Check for the destination directory; create it if it doesn't exist
  if (!dir.exists(dest_dir)) {
    dir.create(dest_dir)
  }

  # Remove the file extension (everything after the last period)
  shape_link <- str_remove(shape_link, "\\.[^.]+$")

  # Extract the file name (everything after the last /)
  file_name <- shape_link %>% str_extract("/[^/]+$") %>% str_remove("/")

  # Define the extensions that make up the shapefile
  ext <- c(".shp", ".shx", ".prj", ".dbf")

  # Create the vector of URLs, one per part
  urls <- paste0(shape_link, ext)

  # Create the vector of local file paths inside dest_dir
  downloaded_files <- file.path(dest_dir, paste0(file_name, ext))

  # Download each part; mode = "wb" keeps the binary files intact on Windows
  for (i in seq_along(urls)) {
    download.file(urls[i], downloaded_files[i], mode = "wb")
  }

  # Read the .shp file; the other parts are picked up from the same directory
  shape <- st_read(downloaded_files[1])

  shape
}


shape <- load_shp("http://wesleysc352.github.io/seg_s3_r3_m10_fix_estat_amost_val.shp")
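
The URL handling inside the function can be checked on its own, without downloading anything. Below is a minimal sketch using only base R. The helper name shp_part_urls is hypothetical (it is not part of the answer above), and it assumes the four parts share one base name on the server.

```r
# Hypothetical helper (not from the original answer): build the remote URLs
# and local destination paths for the four shapefile parts from the .shp URL.
shp_part_urls <- function(shape_link, dest_dir = "shape_files",
                          ext = c(".shp", ".shx", ".prj", ".dbf")) {
  # Drop the file extension (everything after the last period)
  base_url <- sub("\\.[^./]+$", "", shape_link)
  # File name without extension (everything after the last /)
  file_name <- basename(base_url)
  data.frame(
    url  = paste0(base_url, ext),
    dest = file.path(dest_dir, paste0(file_name, ext)),
    stringsAsFactors = FALSE
  )
}

parts <- shp_part_urls(
  "http://wesleysc352.github.io/seg_s3_r3_m10_fix_estat_amost_val.shp"
)
print(parts$url)
```

Each row could then be passed to download.file(parts$url[i], parts$dest[i], mode = "wb") in a loop, as in the function above.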

CodePudding user response:

You can use GDAL Virtual File Systems with sf::st_read(). The one for reading files over HTTP and FTP is /vsicurl/, and it is apparently capable enough to figure out which additional files it should pull to make shapefiles work:

library(sf)
#> Linking to GEOS 3.9.3, GDAL 3.5.2, PROJ 8.2.1; sf_use_s2() is TRUE
shp <- st_read("/vsicurl/http://wesleysc352.github.io/seg_s3_r3_m10_fix_estat_amost_val.shp")
#> Reading layer `seg_s3_r3_m10_fix_estat_amost_val' from data source 
#>   `/vsicurl/http://wesleysc352.github.io/seg_s3_r3_m10_fix_estat_amost_val.shp' 
#>   using driver `ESRI Shapefile'
#> Simple feature collection with 2308 features and 2 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 752896 ymin: 7734010 xmax: 761686 ymax: 7744258
#> Projected CRS: WGS 84 / UTM zone 21S

head(shp)
#> Simple feature collection with 6 features and 2 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 753792 ymin: 7744136 xmax: 753880 ymax: 7744188
#> Projected CRS: WGS 84 / UTM zone 21S
#>     id amostra                       geometry
#> 1 1996       4 POLYGON ((753854 7744188, 7...
#> 2 2206       4 POLYGON ((753792 7744176, 7...
#> 3 2254       4 POLYGON ((753872 7744172, 7...
#> 4 2460       4 POLYGON ((753870 7744170, 7...
#> 5 2707       4 POLYGON ((753840 7744158, 7...
#> 6 2814       4 POLYGON ((753838 7744150, 7...

Created on 2023-02-05 with reprex v2.0.2
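
The path passed to st_read() in the reprex is simply the file's URL with /vsicurl/ in front of it. If you read several remote files, a small wrapper can build that path for you. This is only a sketch: the name as_vsicurl is hypothetical, and the scheme check is an assumption about which URLs you want to accept.

```r
# Hypothetical helper: turn a plain URL into a GDAL /vsicurl/ path.
as_vsicurl <- function(url) {
  stopifnot(is.character(url), length(url) == 1)
  # Leave paths that already carry the prefix untouched
  if (startsWith(url, "/vsicurl/")) {
    return(url)
  }
  # Assumption: only http, https and ftp URLs are accepted here
  if (!grepl("^(https?|ftp)://", url)) {
    stop("Expected an http(s) or ftp URL: ", url)
  }
  paste0("/vsicurl/", url)
}

shp_path <- as_vsicurl(
  "http://wesleysc352.github.io/seg_s3_r3_m10_fix_estat_amost_val.shp"
)
print(shp_path)
```

The result can be handed to sf::st_read(shp_path) exactly as in the reprex.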

Depending on your use case, though, GeoJSON and/or GeoPackage may be more suitable for versioning and remote access. sf (like most other geospatial libraries and tools) can read and write both.
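
To illustrate the single-file alternative, here is a minimal sketch that writes a layer to a GeoPackage and reads it back. The three points are made-up stand-ins for a real layer (for example the result of st_read() on your shapefile), and the temporary file path is only for demonstration.

```r
library(sf)

# Made-up example layer standing in for a real shapefile
pts <- st_as_sf(
  data.frame(
    id = 1:3,
    x  = c(753854, 753792, 753872),
    y  = c(7744188, 7744176, 7744172)
  ),
  coords = c("x", "y"),
  crs = 32721  # EPSG code for WGS 84 / UTM zone 21S
)

# Write everything into one .gpkg file instead of four shapefile parts
gpkg_path <- tempfile(fileext = ".gpkg")
st_write(pts, gpkg_path, quiet = TRUE)

# Read it back to confirm the round trip
back <- st_read(gpkg_path, quiet = TRUE)
print(nrow(back))
```

Since everything lives in one file, there are no companion files to keep in sync in the repository.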
