I have a list of N 2d numpy arrays, all of the same size Mx3, each of which represents a single sample of M coordinates. Sometimes the value of a coordinate can be np.nan.
I (think I) know how to compute the average and standard deviation over these coordinate samples, namely as follows (i.e. stack them and compute the average and std along axis=0):
averaged = np.nanmean(np.array(listOf2dArrays, dtype=np.float64), axis=0)
std = np.nanstd( np.array(listOf2dArrays, dtype=np.float64), axis=0)
How can I determine the count on which the average and std values for each coordinate are based, i.e. the number of non-nan values for each coordinate (component) m?
The result should be a 2d-array of dimensions Mx3 containing non-nan-count values.
CodePudding user response:
You can build a mask array of 0s and 1s and sum it along the axis, where 1 marks either a real (non-nan) value or a nan value, depending on which mask you use.
Basically, let us assume you have a 3D array with some random nan-values and let us count the number of nans along some axis (in your case the axis 0):
#!/usr/bin/env ipython
# ----------------------------
import numpy as np
nx,ny,nz=50,50,50;npts=nx*ny*nz
n_nans=100
# ----------------------------
# Let us make a random array with number of NaN values n_nans
A=np.random.random((nx,ny,nz))
i_nan=np.array(np.random.random((n_nans))*npts,dtype='int32')
dum = A.flatten()
dum[i_nan] = np.nan
A=np.reshape(dum,np.shape(A))
# ----------------------------
A_nanmask=np.zeros(np.shape(A))
A_nanmask[np.isnan(A)]=1
A_notnanmask=np.ones(np.shape(A))
A_notnanmask[np.isnan(A)]=0
# -----------------------------
# Get some count of nans:
A_notnan = np.sum(A_notnanmask,axis=0)
A_nan = np.sum(A_nanmask,axis=0)
# Check the sums:
A_total = A_notnan + A_nan
So every element of A_total should be 50 in this example, since axis 0 has length 50.
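Applied to the shape from the question, the same masking idea might look like the sketch below. The sample data and the names list_of_2d_arrays, stacked, not_nan_count and nan_count are made up for illustration (N=3, M=2). It sums the boolean mask from np.isnan directly instead of filling separate 0/1 arrays.

```python
import numpy as np

# Hypothetical sample data: N=3 arrays, each of shape Mx3 with M=2.
list_of_2d_arrays = [
    np.array([[1.0, 2.0, np.nan], [4.0, 5.0, 6.0]]),
    np.array([[1.5, np.nan, np.nan], [4.5, 5.5, 6.5]]),
    np.array([[np.nan, 2.5, 3.0], [np.nan, 5.0, 6.0]]),
]

# Stack into one array of shape (N, M, 3).
stacked = np.stack(list_of_2d_arrays).astype(np.float64)

# Booleans sum as 0/1, so summing the mask along axis 0 counts entries.
not_nan_count = (~np.isnan(stacked)).sum(axis=0)  # shape (M, 3)
nan_count = np.isnan(stacked).sum(axis=0)         # shape (M, 3)

print(not_nan_count)
# [[2 2 1]
#  [2 3 3]]
```

As in the check above, not_nan_count + nan_count equals N (here 3) at every position.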
CodePudding user response:
You can do it the same way using count_nonzero (which is preferred over sum):
np.count_nonzero(~np.isnan(np.array(listOf2dArrays, dtype=np.float64)), axis=0)
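To get the mean, std and count together, one option is to convert the list once and reuse the stacked array. The helper below is only a sketch: the name nan_stats and the sample data are hypothetical, not part of the question. Note that np.nanmean and np.nanstd emit a RuntimeWarning and return nan for positions where every sample is nan; the count there would be 0.

```python
import numpy as np

def nan_stats(arrays):
    """Return (mean, std, count) along axis 0, ignoring nan values.

    `arrays` is assumed to be a list of equally shaped 2d arrays.
    """
    stacked = np.asarray(arrays, dtype=np.float64)  # shape (N, M, 3)
    count = np.count_nonzero(~np.isnan(stacked), axis=0)
    mean = np.nanmean(stacked, axis=0)
    std = np.nanstd(stacked, axis=0)
    return mean, std, count

# Hypothetical data: N=2 samples with M=1 coordinate each.
samples = [
    np.array([[1.0, np.nan, 3.0]]),
    np.array([[3.0, 2.0, np.nan]]),
]
mean, std, count = nan_stats(samples)
print(count)  # [[2 1 1]]
print(mean)   # [[2. 2. 3.]]
```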