How to remove xarray dimension after adding another without deleting the data variables

Time:04-25

I have data from ECMWF which when read into xarray looks like this

Dimensions:     (time: 424, step: 12, latitude: 3, longitude: 2)
Coordinates:
    number      int64 0
  * time        (time) datetime64[ns] 1990-03-01T06:00:00 ... 1993-04-22T18:0...
  * step        (step) timedelta64[ns] 01:00:00 02:00:00 ... 11:00:00 12:00:00
    surface     float64 0.0
  * latitude    (latitude) float64 41.0 40.75 40.5
  * longitude   (longitude) float64 -96.92 -96.67
    valid_time  (time, step) datetime64[ns] 1990-03-01T07:00:00 ... 1993-04-2...
Data variables:
    i10fg       (time, step, latitude, longitude) float32 4.876 4.637 ... 3.959
Attributes:
    GRIB_edition:            1
    GRIB_centre:             ecmf
    GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
    GRIB_subCentre:          0
    Conventions:             CF-1.7
    institution:             European Centre for Medium-Range Weather Forecasts
    history:                 2022-04-23T22:18 GRIB to CDM CF via cfgrib-0.9.9...

I've stacked the time and step dimensions to make a new one called fcst. I then added another dimension called valid and set it equal to the coordinate valid_time. Both valid and fcst are the same length, but I now want to drop the fcst dimension. However, when I do this, the data variable also gets deleted. Does anyone know how to fix this? Here is my sample code. There might be a better way to do what I'm doing, but I'm still pretty new to xarray!

ds = ds.stack(fcst=("time", "step")).transpose("fcst", "latitude", "longitude")
ds.expand_dims(valid=ds['valid_time']).drop_dims('fcst')

which leaves me with

<xarray.Dataset>
Dimensions:    (valid: 5088, latitude: 3, longitude: 2)
Coordinates:
  * valid      (valid) datetime64[ns] 1990-03-01T07:00:00 ... 1993-04-23T06:0...
    number     int64 0
    surface    float64 0.0
  * latitude   (latitude) float64 41.0 40.75 40.5
  * longitude  (longitude) float64 -96.92 -96.67
Data variables:
    *empty*

I've tried setting valid as the index and then dropping fcst, but it still deletes the data variables.

Any help would be appreciated!

CodePudding user response:

All variables in an xarray Dataset must be indexed by named dimensions. You can use ds.reset_index to drop any labeled coordinates associated with a dimension, but this isn't what you want. You can't simply do away with a dimension without losing the variables that are indexed by it.

From the xarray docs on data structures:

dimension names are always present in the xarray data model: if you do not provide them, defaults of the form dim_N will be created.
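To make the point above concrete, here is a minimal sketch on a small made-up dataset (the shapes and values are assumptions for illustration, not the ECMWF data from the question). It shows that drop_dims removes every variable indexed by the dropped dimension, which is why the data variables end up empty:

```python
import numpy as np
import xarray as xr

# Hypothetical toy dataset: one variable indexed by "fcst" and "latitude".
ds = xr.Dataset(
    {"i10fg": (("fcst", "latitude"), np.zeros((4, 3), dtype="float32"))},
    coords={"latitude": [41.0, 40.75, 40.5]},
)

# drop_dims removes the dimension AND every variable that uses it.
dropped = ds.drop_dims("fcst")

print(list(ds.data_vars))       # ['i10fg']
print(list(dropped.data_vars))  # []
```

The latitude coordinate survives because it does not depend on fcst; i10fg does, so it is removed along with the dimension.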

Instead, you can swap a non-indexing coordinate for an indexing coordinate using swap_dims. This will switch valid_time and fcst in your dataset, so that valid_time becomes the dimension that indexes fcst and any variables previously indexed by fcst (i10fg in your case).

So the answer is, after the stack but without expand_dims or drop_dims:

ds.swap_dims({'fcst': 'valid_time'}).drop_vars('fcst')
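For reference, here is a self-contained sketch of the whole sequence on synthetic data. The dataset below is an assumption built to mimic the layout in the question (4 initialisation times, 3 steps, random values); only the stack, swap_dims and drop steps come from the answer. It uses drop_vars, the non-deprecated spelling of drop for removing variables.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the ECMWF dataset (values and sizes are made up).
time = pd.to_datetime(
    ["1990-03-01T06:00", "1990-03-01T18:00", "1990-03-02T06:00", "1990-03-02T18:00"]
)
step = pd.to_timedelta(np.arange(1, 4), unit="h")
ds = xr.Dataset(
    {
        "i10fg": (
            ("time", "step", "latitude", "longitude"),
            np.random.rand(4, 3, 3, 2).astype("float32"),
        )
    },
    coords={
        "time": time,
        "step": step,
        "latitude": [41.0, 40.75, 40.5],
        "longitude": [-96.92, -96.67],
    },
)
# valid_time is a 2-D (time, step) coordinate, as in the cfgrib output.
ds = ds.assign_coords(valid_time=ds["time"] + ds["step"])

# Stack time and step into one dimension; valid_time becomes 1-D along fcst.
stacked = ds.stack(fcst=("time", "step")).transpose("fcst", "latitude", "longitude")

# Make valid_time the indexing dimension, then drop the old fcst coordinate.
result = stacked.swap_dims({"fcst": "valid_time"}).drop_vars("fcst")

print(result["i10fg"].dims)  # ('valid_time', 'latitude', 'longitude')
```

The data variable is kept because its dimension is renamed rather than removed. If you prefer the name valid from the question, result.rename({"valid_time": "valid"}) should give that.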