Multi-level list of rasters - performing algebra in R-CodePudding

I have a multi level list of rasters and it always gives me a headache how to operate on such structures efficiently. I want to take averages from the rasters in 3rd level of the list, going by the indexes. For example:

# make some dummy rasters
a <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
a[] <- sample(1:5,25,replace=T)
b <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
b[] <- sample(1:5,25,replace=T)
c <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
c[] <- sample(1:5,25,replace=T)
d <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
d[] <- sample(1:5,25,replace=T)
e <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
e[] <- sample(1:5,25,replace=T)
f <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
f[] <- sample(1:5,25,replace=T)
g <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
g[] <- sample(1:5,25,replace=T)
h <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
h[] <- sample(1:5,25,replace=T)
i <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
i[] <- sample(1:5,25,replace=T)
j <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
j[] <- sample(1:5,25,replace=T)
k <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
k[] <- sample(1:5,25,replace=T)
l <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
l[] <- sample(1:5,25,replace=T)

)

list_abcd <- list(a,b,c,d)
list_efgh <- list(e,f,g,h)
list_ijkl <- list(i,j,k,l)

list_all <- list(list_abcd,list_efgh,list_ijkl)

I want to perform calculations on the rasters to obtain the averages of the respective 1st rasters in the lists (average of a,e and i), 2nd (b,f and j), 3rd (c,g,k) and 4th (d, h and l).

The result will be a list containing 4 rasters.

In reality, I have a list structured like this for 10 groups, each with over 100 rasters, so the solution needs to be scalable.

Additional info I work with NetCDF files and their structure looks following (below). I create multi-level lists to handle it, but then have a headache how to handle it well without too many laborious for loops.

3 variables:
    double var1[lon,lat,years,irr]
    double var2[lon,lat,years,irr]
    double var3[lon,lat,years,irr]
 4 dimensions:
    lon  Size:720
        units: degrees_east
        long_name: lon
    lat  Size:360
        units: degrees_north
        long_name: lat
    years  Size:100
        units: mapping
        long_name: years
    irr  Size:2
        units: mapping
        long_name: rainfed/irrigated

CodePudding user response：

Your example data

library(raster)
r <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
set.seed(123)
list_abcd <- replicate(4, setValues(r, sample(1:5,25,replace=T)))
list_efgh <- replicate(4, setValues(r, sample(1:5,25,replace=T)))
list_ijkl <- replicate(4, setValues(r, sample(1:5,25,replace=T)))   
list_all <- list(list_abcd,list_efgh,list_ijkl)
names(list_all) <- c("abcd", "efgh", "ijkl")

Solution, create a single RasterStack and use stackApply

x <- stack(unlist(list_all))
i <- rep(1:4, 3)
s <- stackApply(x, i, mean)
s
#class      : RasterBrick 
#dimensions : 5, 5, 25, 4  (nrow, ncol, ncell, nlayers)
#resolution : 1, 1  (x, y)
#extent     : 0, 5, 0, 5  (xmin, xmax, ymin, ymax)
#crs        :  proj=longlat  datum=WGS84  no_defs 
#source     : memory
#names      :  index_1,  index_2,  index_3,  index_4 
#min values : 1.333333, 2.000000, 1.000000, 1.666667 
#max values : 4.666667, 5.000000, 4.666667, 4.333333

Now let's do this with terra (the replacement of raster). We start with three SpatRasters (similar to a RasterStack)

library(terra)
rr <- rast(xmin=0,xmax=5,ymin=0,ymax=5,res=1)
set.seed(123)
abcd <- replicate(4, setValues(rr, sample(1:5,25,replace=T))) |> rast()
efgh <- replicate(4, setValues(rr, sample(1:5,25,replace=T))) |> rast()
ijkl <- replicate(4, setValues(rr, sample(1:5,25,replace=T))) |> rast()
# the above is equivalent to: abcd <- c(a,b,c,d)

Solution 1: Combine into a single SpatRaster and use tapp

rall <- c(abcd, efgh, ijkl) 
z1 <- tapp(rall, rep(1:4, 3), "mean")

Solution 2: Combine into a SpatRasterDataset and use app

rsd <- sds(abcd, efgh, ijkl) 
z2 <- app(rsd, "mean")

Later: from your comment I gather the mistake you make is indeed in creating these lists. If I understand you well you have n files representing models, each with 2*100 layers; and you want to average the values of each layer (year & water status) across models; so that you end up with a single raster dataset of 200 layers (or 2 datasets with 100 layers). You can achieve that with something like this:

 library(terra)
 ff <- list.files(pattern="nc$")  
 sd <- sds(ff)
 x <- app(sd, mean)

Or, if irrigated/rainfed are separate sub-datasets, something like

 library(terra)
 ff <- list.files(pattern="nc$")  
 sd <- lapply(ff, \(i) rast(i, "irrigated")) |> sds()
 # or sd <- lapply(ff, \(i) rast(i, 1)) |> sds()
 x <- app(sd, mean)

With some example data

f <- system.file("ex/logo.tif", package="terra")
sd <- sds(c(f,f,f))
x <- app(sd, mean)