Home > front end >  Reading a grib2 dataset with 4 dimensions and 2 variables with R
Reading a grib2 dataset with 4 dimensions and 2 variables with R

Time:10-19

I am trying to read a GRIB2 file with R. This file is a probabilistic meteorological forecast : 2 variables, 114 lead times, 18 longitudes, 24 latitudes, and 50 members.

I didn't manage to do it so I used a Python routine to convert the grib file into netCDF, and then I read the netCDF with R. But this raises many issues : I have to use python and specific packages, which are not available in a portable version. And I need the process to be run on a portable environment. I saw that I could read GRIB2 files with the terra package in R, with this post : https://gis.stackexchange.com/questions/396408/how-to-properly-extract-point-data-from-multi-raster-grib-file-in-r

Unfortunately, I couldn't manage to find a way to properly extract my data, with all these dimensions.

  require(terra)
  
  ## Isn't it possible to get them automatically ?
  lat_prev <- (rev(seq(42.875,48.625,by=0.25)))
  lon_prev <- (seq(3.375,7.625,by=0.25))
  
  latlon <- expand.grid(lon=lon_prev, lat=lat_prev)
  
  latlons <- terra::vect(latlon, geom=c('lon','lat'), crs=" proj=longlat")
  ## Not sure about that...
  pts <- project(latlons, " proj=lcc  lat_0=38.5  lon_0=262.5  lat_1=38.5  lat_2=38.5  x_0=0  y_0=0  R=6371229  units=m")
  
  grib_data <- terra::rast(destfile_CF)
  ## gives a data frame of NaN
  e1 <- extract(grib_data, pts)

This is an example of grb2 file : https://drive.google.com/file/d/1euIvEpDP4f4Kqhdnnswba6VjD1i8EvzY/view?usp=sharing

I believe I need to ask an extraction on all my dimensions, but they are not all spatial points (class SpatialVector), so what is the object to create ? Thanks for your help

CodePudding user response:

I'd say data extraction per se from GRIB2 files using {terra} is possible, but results could be more meaningful in this case in my opinion, i.e. come in a more structured way. But this is mainly an issue related to the amount of layers, as you mentioned.

Importing data is the quite self-explanatory:

library(terra)
#> terra 1.6.17

r <- rast("example_file.grb2")
r
#> class       : SpatRaster 
#> dimensions  : 23, 17, 5700  (nrow, ncol, nlyr)
#> resolution  : 0.25, 0.25  (x, y)
#> extent      : 3.375, 7.625, 42.875, 48.625  (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat Coordinate System imported from GRIB file 
#> source      : example_file.grb2 
#> names       : SFC=G~2*s)], SFC=G~2*s)], SFC=G~2*s)], SFC=G~2*s)], SFC=G~2*s)], SFC=G~2*s)], ... 
#> unit        :  kg/(m^2*s),  kg/(m^2*s),  kg/(m^2*s),  kg/(m^2*s),  kg/(m^2*s),  kg/(m^2*s), ...

As you can see, you got a SpatRaster object with 5,700 (=114*50) layers. So basically your lead times and members are included, but your second variable is not. Unfortunately, the time attribute is also undefined. This would have been quite handy here, in my opinion, if one could subset data based on the nwp run.

names(r) |> unique()
#> [1] "SFC=Ground or water surface; Total precipitation rate [kg/(m^2*s)]"

time(r) |> head(20)
#>  [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

Unfortunately, this does not seem to be an issue with your special file, but with GRIB2 in general. I also tried importing another forecast dataset and faced the same issues.

However, gdalinfo example_file.grb2 clearly shows that this metadata is available, c.f. attributes GRIB_REF_TIME and GRIB_FORECAST_SECONDS.

This being said, if you knew the exact order of the layers, I'd construct new layer names using names(r) <- c(...) consisting of e.g. a timestamp representing the reference time, followed by the forecast period and the actual member. This way you would be able to subset your raster stack and e.g. extract all relevant layers related to the nwp run originating at "2022-10-18 22:00 UTC" - or something like this.

Extraction of values at point locations works nevertheless - you just need to know the order of your layers to get meaningful results. The resulting data frame consists of 10 rows (= your locations) and 5701 columns (1 ID, 5700 extracted values; 1 per layer).

# generate some points
p <- spatSample(r, 10, as.points = TRUE, values = FALSE)
p
#>  class       : SpatVector 
#>  geometry    : points 
#>  dimensions  : 10, 0  (geometries, attributes)
#>  extent      : 3.75, 6.5, 43.75, 48.5  (xmin, xmax, ymin, ymax)
#>  coord. ref. : lon/lat Coordinate System imported from GRIB file

# extract values
res <- extract(r, p)
dim(res)
#> [1]   10 5701

If you're struggling to create SpatVector points, you could also create simple feature objects using {sf} and convert them afterwards using terra::vect().

CodePudding user response:

I think that what you are after is something along these lines

library(terra)
r <- rast("example_file.grb2")
# There were 50 or more warnings (use warnings() to see the first 50)
e <- ext(3.375, 7.625, 42.875, 48.625)
x <- crop(r, e)
x
#class       : SpatRaster 
#dimensions  : 23, 17, 5700  (nrow, ncol, nlyr)
#resolution  : 0.25, 0.25  (x, y)
#extent      : 3.375, 7.625, 42.875, 48.625  (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat Coordinate System imported from GRIB file 
#source      : example_file.grb2 
#names       : SFC=G~2*s)], SFC=G~2*s)], SFC=G~2*s)], SFC=G~2*s)], SFC=G~2*s)], SFC=G~2*s)], ... 
#unit        :  kg/(m^2*s),  kg/(m^2*s),  kg/(m^2*s),  kg/(m^2*s),  kg/(m^2*s),  kg/(m^2*s), ... 

And then perhaps

v <- values(x)
# or
v <- as.data.frame(x, xy=TRUE)
  • Related