I'm interested in creating two variables from a time-series of spatial raster data in R from an netcdf file. Opening the data, subsetting by Z and creating a mean value is straight forward:
# Working example below using a ~16mb nc file of sea ice from HadISST
library(R.utils)
library(raster)
library(tidyverse)
download.file("https://www.metoffice.gov.uk/hadobs/hadisst/data/HadISST_ice.nc.gz","HadISST_ice.nc.gz")
gunzip("HadISST_ice.nc.gz", ext="gz", FUN=gzfile)
hadISST <- brick('Datasets/HadISST/HadISST_ice.nc')
# subset to a decade time period and create decadal mean from monthly data
hadISST_a <- hadISST %>% subset(., which(getZ(.) >= as.Date("1900-01-01") & getZ(.) <= as.Date("1909-12-31"))) %>% mean(., na.rm = TRUE)
But, I'm interested in extracting 1) annual mean values, and 2) annual mean of three minimum monthly values for the subsetted time period. My current work flow is using nc_open() and ncvar_get() to open the data, raster::extract() to get the values and then tidverse group_by() and slice_min() to get the annual coolest months, but it's a slow and cpu intensive approach. Is there a more effective way of doing this without converting from raster - data.frame?
Questions:
Using the above code how can I extract annual means rather than a mean of ALL months over the decadal period?
Is there a way of using slice_min(order_by = sst, n = 3) or similar with brick objects to get the minimum three values per year prior to annual averaging?
CodePudding user response:
Example data
if (!file.exists("HadISST_ice.nc")) {
download.file("https://www.metoffice.gov.uk/hadobs/hadisst/data/HadISST_ice.nc.gz","HadISST_ice.nc.gz")
R.utils:::gunzip("HadISST_ice.nc.gz")
}
library(terra)
hadISST <- rast('HadISST_ice.nc')
Annual mean
y <- format(time(hadISST), "%Y")
m <- tapp(hadISST, y, mean)
Mean of the lowest three monthly values by year (this takes much longer because a user-defined R function is used)
f <- function(i) mean(sort(i)[1:3])
m3 <- tapp(hadISST, y, f)
CodePudding user response:
There are likely much more intelligent ways to do this, but a thought as to your annual means, here calling hadISST, hadI:
v <- getValues(hadI)
v_t <- t(v) # this takes a while
v_mean <- vector(mode='numeric')
for(k in 1:nrow(v_t)) {
v_mean[k] = mean(v_t[k, ], na.rm = TRUE)
}
length(v_mean)
[1] 1828
v_mean[1:11]
[1] 0.1651351 0.1593368 0.1600364 0.1890360 0.1931470 0.1995657 0.1982052
[8] 0.1917534 0.1917840 0.1911638 0.1911657
I'm a little unclear on proposition 2, as it seems it would be the average of 3 0's...