I have an array of the form data[values, longitude, latitude] with shape (21000, 12, 13). The data are daily temperature values covering around 50 years, stored in NetCDF format, for an area of 12x13 grid points.
I want to extract into a new table, for each lon and lat, the sum of the values that are greater than 10 and less than 20. I am new to Python, so I got stuck at the first steps. Here is what I have, and of course there is an error:
import netCDF4

file = netCDF4.Dataset('/mnt/data/rcp45/tas/merged_rcp45/rcp45_Celsius.nc')
lat = file.variables['lat'][:]
lon = file.variables['lon'][:]
data = file.variables['tas'][:]
for i in range(21000):
    # this comparison is applied to a whole 12x13 slice and raises the ValueError
    if (data[i,:,:]) > 10 and (data[i,:,:]) < 20:
        x = sum(data[i,:,:])
The expected outcome: x[sum_values_conditionally, longitude, latitude]. So I want a table that, over the 21000 time steps and for each grid point (lon-lat), contains the sum of only those values that match the conditions. E.g. for the first grid point: x[230,1,1], where 230 is the sum of the values > 10 and < 20.
The error: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
The initial dataset looks like this:
These are the lat values - print(lat)
These are the lon values - print(lon)
CodePudding user response:
A possible solution to your question is:
import netCDF4
import numpy as np

file = netCDF4.Dataset('/mnt/data/rcp45/tas/merged_rcp45/rcp45_Celsius.nc')
lat = file.variables['lat'][:]
lon = file.variables['lon'][:]
data = file.variables['tas'][:]

# Sum over all time steps, per grid point, of the values strictly between 10 and 20
summed_values = np.zeros((12, 13))
for grid_index, grid in enumerate(data):
    for lon_index, lon_column in enumerate(grid):
        for lat_index, value in enumerate(lon_column):
            if 10 < value < 20:
                summed_values[lon_index][lat_index] += value
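As a side note on the ValueError from the question: `data[i,:,:] > 10` is itself a 12x13 boolean array, and Python's `and` / `if` need a single True or False, which is why NumPy refuses. Combining the two comparisons elementwise with `&` avoids that, and the triple loop can then probably be replaced by one vectorized expression. Below is a minimal sketch of that idea. It uses a small random array as a stand-in for the real 'tas' variable (the shape and values are made up for illustration), and it assumes plain float data with no missing values.

```python
import numpy as np

# Synthetic stand-in for file.variables['tas'][:], shape (time, lon, lat).
# The values are random and only serve to illustrate the technique.
rng = np.random.default_rng(0)
data = rng.uniform(-5.0, 30.0, size=(730, 12, 13))

# Elementwise condition: `&` combines the two boolean arrays,
# whereas `and` would raise the "truth value ... is ambiguous" error.
in_range = (data > 10) & (data < 20)

# Keep values that satisfy the condition, replace the rest by 0,
# then sum along the time axis (axis 0) to get one total per grid point.
summed_values = np.where(in_range, data, 0.0).sum(axis=0)

print(summed_values.shape)
```

The result has one entry per (lon, lat) grid point, the same layout as `summed_values` in the loop version.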

# Same sum, but accumulated separately for each 365-day year (leap days are ignored)
n_years = data.shape[0] // 365
summed_values_per_year = np.zeros((n_years + 1, 12, 13))
for grid_index, grid in enumerate(data):
    for lon_index, lon_column in enumerate(grid):
        for lat_index, value in enumerate(lon_column):
            if 10 < value < 20:
                year = grid_index // 365
                summed_values_per_year[year][lon_index][lat_index] += value
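The per-year version can likely be vectorized in a similar way. The sketch below is one possible approach, not the only one: it again uses a made-up random array instead of the real file, keeps the same simplification of fixed 365-day years (no leap days), and uses `np.add.at` to accumulate each day's grid into its year.

```python
import numpy as np

# Synthetic stand-in for the temperature data: 800 days on a 12x13 grid,
# i.e. two full 365-day "years" plus 70 extra days. Values are random.
rng = np.random.default_rng(1)
data = rng.uniform(-5.0, 30.0, size=(800, 12, 13))

# Zero out everything that is not strictly between 10 and 20.
masked = np.where((data > 10) & (data < 20), data, 0.0)

# Year index of every time step, assuming 365-day years without leap days.
year = np.arange(data.shape[0]) // 365
n_years = year[-1] + 1

# np.add.at performs unbuffered in-place addition, so all days that share
# a year index are accumulated into the same output slice.
summed_values_per_year = np.zeros((n_years, 12, 13))
np.add.at(summed_values_per_year, year, masked)

print(summed_values_per_year.shape)
```

If the real time axis carries calendar dates, grouping by the actual year would be more accurate than integer division by 365.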