I have data from ECMWF which, when read into xarray, looks like this:
Dimensions: (time: 424, step: 12, latitude: 3, longitude: 2)
Coordinates:
number int64 0
* time (time) datetime64[ns] 1990-03-01T06:00:00 ... 1993-04-22T18:0...
* step (step) timedelta64[ns] 01:00:00 02:00:00 ... 11:00:00 12:00:00
surface float64 0.0
* latitude (latitude) float64 41.0 40.75 40.5
* longitude (longitude) float64 -96.92 -96.67
valid_time (time, step) datetime64[ns] 1990-03-01T07:00:00 ... 1993-04-2...
Data variables:
i10fg (time, step, latitude, longitude) float32 4.876 4.637 ... 3.959
Attributes:
GRIB_edition: 1
GRIB_centre: ecmf
GRIB_centreDescription: European Centre for Medium-Range Weather Forecasts
GRIB_subCentre: 0
Conventions: CF-1.7
institution: European Centre for Medium-Range Weather Forecasts
history: 2022-04-23T22:18 GRIB to CDM CF via cfgrib-0.9.9...
I've stacked the time and step dimensions to make a new one called fcst. I then added another dimension called valid and set it equal to the coordinate valid_time. Both valid and fcst are the same length, but I now want to drop the fcst dimension. However, when I do this, the data variable also gets deleted. Does anyone know how to fix this? Here is my sample code. There might be a better way to do what I'm doing, but I'm still pretty new to xarray!
ds = ds.stack(fcst=("time", "step")).transpose("fcst", "latitude", "longitude")
ds.expand_dims(valid=ds['valid_time']).drop_dims('fcst')
which leaves me with
<xarray.Dataset>
Dimensions: (valid: 5088, latitude: 3, longitude: 2)
Coordinates:
* valid (valid) datetime64[ns] 1990-03-01T07:00:00 ... 1993-04-23T06:0...
number int64 0
surface float64 0.0
* latitude (latitude) float64 41.0 40.75 40.5
* longitude (longitude) float64 -96.92 -96.67
Data variables:
*empty*
I've tried setting valid as the index and then dropping fcst, but it still deletes the data variables.
Any help would be appreciated!
CodePudding user response:
All variables in an xarray Dataset must be indexed by named dimensions. You can use ds.reset_index to drop any labeled coordinates associated with a dimension, but this isn't what you want. You can't simply do away with a dimension without losing the variables that are indexed by that dimension.
From the xarray docs on data structures:
dimension names are always present in the xarray data model: if you do not provide them, defaults of the form dim_N will be created.
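To illustrate why the data variable disappears, here is a minimal sketch on a made-up dataset (the name toy, the variable wind, and the values are hypothetical, not from the question). It only shows that drop_dims removes every variable that uses the dropped dimension:

```python
import numpy as np
import xarray as xr

# Hypothetical toy dataset: one variable indexed by "fcst" and "latitude".
toy = xr.Dataset(
    {"wind": (("fcst", "latitude"), np.ones((4, 3)))},
    coords={"latitude": [41.0, 40.75, 40.5]},
)

# drop_dims removes the dimension AND every variable that depends on it.
dropped = toy.drop_dims("fcst")

print(list(toy.data_vars))      # variables before dropping
print(list(dropped.data_vars))  # variables after dropping
```

The latitude coordinate survives because it does not depend on fcst, which matches the output shown in the question.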
Instead, you can swap a non-indexing coordinate for an indexing coordinate using swap_dims. This will switch valid_time and fcst in your dataset, such that valid_time becomes the dimension that indexes fcst and any variables previously indexed by fcst (i10fg in your case).
So the answer is to run the following after the stack, without the expand_dims or drop_dims calls:
ds.swap_dims({'fcst': 'valid_time'}).drop_vars('fcst')
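For reference, here is a self-contained sketch of the whole workflow on a small synthetic dataset. The grid and variable names mirror the question, but the values are random and the two forecast times and three steps are assumptions made up for illustration. It uses drop_vars with errors="ignore" because which leftover coordinates exist after the swap can depend on the xarray version:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the ECMWF data (values are random, sizes are made up).
time = pd.to_datetime(["1990-03-01T06:00", "1990-03-01T18:00"])
step = pd.to_timedelta([1, 2, 3], unit="h")
ds = xr.Dataset(
    {
        "i10fg": (
            ("time", "step", "latitude", "longitude"),
            np.random.rand(2, 3, 3, 2).astype("float32"),
        )
    },
    coords={
        "time": time,
        "step": step,
        "latitude": [41.0, 40.75, 40.5],
        "longitude": [-96.92, -96.67],
    },
)
# valid_time is a 2-D (time, step) coordinate, as in the cfgrib output.
ds = ds.assign_coords(valid_time=ds["time"] + ds["step"])

# Stack time and step into fcst; valid_time becomes 1-D along fcst.
stacked = ds.stack(fcst=("time", "step"))

# Make valid_time the dimension in place of fcst; the data variable is kept.
result = stacked.swap_dims({"fcst": "valid_time"})

# Optionally remove the leftover stacked coordinates.
result = result.drop_vars(["fcst", "time", "step"], errors="ignore")

print(result)
```

The data variable i10fg should now be indexed by valid_time, latitude and longitude. Add a transpose call afterwards if you need valid_time as the leading dimension.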