I have two datasets. The first has high spatial resolution and contains only the values 0 and 1; the second has coarse spatial resolution (its values are not important in my case).
I would like to count, for each gridpoint of the coarse-resolution data, the number of high-resolution gridpoints that are closest to it and have the value 1.
In other words, count the high-resolution gridpoints with the value 1 that fall within each pixel of the coarse-resolution data.
Example of the coarse-resolution data:
import numpy as np
import xarray as xr
lon = [176.25, 176.75, 177.25, 177.75, 178.25, 178.75, 179.25, 179.75]
lat = [-87.25, -87.75, -88.25, -88.75, -89.25, -89.75]
temperature = np.random.rand(6, 8)
coarse_res = xr.DataArray(temperature, coords={'lat': lat,'lon': lon}, dims=["lat", "lon"])
Example of the high-resolution data:
lon = [176.125,176.375,176.625,176.875,177.125,177.375,177.625,177.875,178.125,178.375,178.625,178.875,179.125,179.375,179.625,179.875]
lat = [-87.125, -87.375, -87.625, -87.875, -88.125, -88.375, -88.625, -88.875, -89.125, -89.375, -89.625, -89.875]
ds_2 = np.random.randint(0, 2, size=(12, 16))
high_res = xr.DataArray(ds_2, coords={'lat': lat,'lon': lon}, dims=["lat", "lon"])
In the end, I would like to calculate the fraction of the high_res gridpoints/pixels with the value of 1 surrounding each coarse-resolution gridpoint. For example, if the first gridpoint of the coarse_res data is surrounded by 4 high_res gridpoints and their values are 0, 1, 1, 1, the fraction should be 0.75.
CodePudding user response:
You can do this with xr.DataArray.groupby_bins (high_res is a DataArray):
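# Bin edges of the coarse cells; the upper bounds are padded so the last edge is included
low_lon_edges = np.arange(176., 180.001, 0.5)
low_lat_edges = np.arange(-90, -86.9, 0.5)
# Cell centers, used as labels for the bins
low_lon_centers = (low_lon_edges[:-1] + low_lon_edges[1:]) / 2
low_lat_centers = (low_lat_edges[:-1] + low_lat_edges[1:]) / 2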
aggregated = (
    high_res
    .groupby_bins('lon', bins=low_lon_edges, labels=low_lon_centers)
    .sum(dim="lon")
    .groupby_bins('lat', bins=low_lat_edges, labels=low_lat_centers)
    .sum(dim="lat")
)
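The sum gives the count of 1s per coarse cell, while the question ultimately asks for a fraction. A minimal sketch of one way to get it, assuming bin edges that span the full 176 to 180 degree longitude range: bin the 0/1 data and an all-ones array the same way, then divide. The helper name bin_sum and the seeded random stand-in for high_res are illustrative, not part of the original post.

```python
import numpy as np
import xarray as xr

# Deterministic stand-in for high_res on the 0.25-degree grid from the question.
lon = np.arange(176.125, 180, 0.25)    # 16 points
lat = np.arange(-87.125, -90, -0.25)   # 12 points
values = np.random.default_rng(0).integers(0, 2, size=(12, 16))
high_res = xr.DataArray(values, coords={"lat": lat, "lon": lon}, dims=["lat", "lon"])

# Edges and centers of the 0.5-degree coarse cells.
lon_edges = np.arange(176.0, 180.001, 0.5)
lat_edges = np.arange(-90.0, -86.999, 0.5)
lon_centers = (lon_edges[:-1] + lon_edges[1:]) / 2
lat_centers = (lat_edges[:-1] + lat_edges[1:]) / 2

def bin_sum(da):
    """Hypothetical helper: sum da over the coarse lon bins, then the lat bins."""
    by_lon = da.groupby_bins("lon", bins=lon_edges, labels=lon_centers).sum(dim="lon")
    return by_lon.groupby_bins("lat", bins=lat_edges, labels=lat_centers).sum(dim="lat")

ones_count = bin_sum(high_res)                 # number of 1s per coarse cell
total_count = bin_sum(xr.ones_like(high_res))  # number of high-res points per coarse cell
fraction = ones_count / total_count
```

Dividing by the per-cell point count rather than a hard-coded 4 keeps the result correct even if some coarse cells contain a different number of high-resolution points. Note that the binned dimensions are named lon_bins and lat_bins, and the latitude bins come out in ascending order.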
Additionally, if the cells nest perfectly (it looks like you're dealing with 1/4 and 1/2 degree data which are both centered on the half cell, so this should work fine) you can just use xr.DataArray.coarsen:
aggregated = high_res.coarsen(lat=2, lon=2, boundary="exact").sum()
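For the fraction asked about in the question, the same coarsen object can be reduced with mean instead of sum. The sketch below assumes the grids nest exactly 2x2 as in the example data, and uses a seeded random stand-in for high_res; the variable names are illustrative.

```python
import numpy as np
import xarray as xr

# Deterministic stand-in for high_res on the 0.25-degree grid from the question.
lon = np.arange(176.125, 180, 0.25)    # 16 points
lat = np.arange(-87.125, -90, -0.25)   # 12 points
values = np.random.default_rng(0).integers(0, 2, size=(12, 16))
high_res = xr.DataArray(values, coords={"lat": lat, "lon": lon}, dims=["lat", "lon"])

# Group the high-res grid into non-overlapping 2x2 blocks.
# boundary="exact" raises an error if the sizes are not multiples of 2.
blocks = high_res.coarsen(lat=2, lon=2, boundary="exact")

count = blocks.sum()      # number of 1s in each 2x2 block
fraction = blocks.mean()  # share of 1s in each 2x2 block, between 0 and 1
```

By default coarsen averages the coordinates of each block, so the resulting lat and lon values should land on the coarse_res cell centers (176.25, 176.75, ... and -87.25, -87.75, ...), which makes it straightforward to compare or combine the result with coarse_res.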