I have two arrays (A and B) containing either values or NaNs.
To calculate the average, I add the two arrays and divide by two.
A:
array([ nan, nan, nan, nan,
nan, nan, nan, nan,
nan, nan, 109.93013333, 121.27613333,
131.6136 , 142.32926667, 148.2544 , 156.32266667,
160.3568 , 164.39093333, 168.6772 , 165.2734 ,
165.77766667, 163.0042 , 164.8952 , 157.83546667,
145.48093333, 162.89614286, 163.13026667, 151.53213333,
nan, nan, nan, nan,
nan, nan, nan, nan,
nan, nan, nan, nan,
nan, nan, nan, nan,
nan])
B:
array([ nan, nan, nan, nan,
nan, nan, nan, nan,
nan, nan, nan, nan,
127.39813333, 141.14986667, 152.5664 , 160.99906667,
169.04253333, 173.45346667, 179.29146667, 179.55093333,
180.1996 , 178.51306667, 182.40506667, 173.06426667,
158.27466667, 163.0748 , 140.76066667, 120.00333333,
82.5104 , nan, nan, nan,
nan, nan, nan, nan,
nan, nan, nan, nan,
nan, nan, nan, nan,
nan])
avg:
array([ nan, nan, nan, nan,
nan, nan, nan, nan,
nan, nan, nan, nan,
129.50586667, 141.73956667, 150.4104 , 158.66086667,
164.69966667, 168.9222 , 173.98433333, 172.41216667,
172.98863333, 170.75863333, 173.65013333, 165.44986667,
151.8778 , 162.98547143, 151.94546667, 135.76773333,
nan, nan, nan, nan,
nan, nan, nan, nan,
nan, nan, nan, nan,
nan, nan, nan, nan,
nan])
Apparently, the average is calculated only at indices where both arrays have non-NaN values.
But how can I also take into account values that are present in only A or only B?
CodePudding user response:
You have two options:
- Use numpy.nan_to_num. This approach converts np.nan to zero, so (nan + 20)/2 = 10.
- Use numpy.nanmean((A,B), axis=0) (Doc). This approach skips np.nan and averages only the remaining values, so (nan + 20)/2 = 20. (With this approach, we get a warning at positions where both values are nan, i.e. when computing (nan + nan)/2; the result there is nan.)
# 1
>>> (np.nan_to_num(A) + np.nan_to_num(B))/2
array([ 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. ,
0. , 0. , 54.96506667, 60.63806666,
129.50586666, 141.73956667, 150.4104 , 158.66086667,
164.69966666, 168.9222 , 173.98433334, 172.41216667,
172.98863334, 170.75863334, 173.65013333, 165.44986667,
151.8778 , 162.98547143, 151.94546667, 135.76773333,
41.2552 , 0. , 0. , 0. ,
0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. ,
0. ])
# 2
>>> np.nanmean((A,B), axis=0)
array([ nan, nan, nan, nan,
nan, nan, nan, nan,
nan, nan, 109.93013333, 121.27613333,
129.50586666, 141.73956667, 150.4104 , 158.66086667,
164.69966666, 168.9222 , 173.98433334, 172.41216667,
172.98863334, 170.75863334, 173.65013333, 165.44986667,
151.8778 , 162.98547143, 151.94546667, 135.76773333,
82.5104 , nan, nan, nan,
nan, nan, nan, nan,
nan, nan, nan, nan,
nan, nan, nan, nan,
nan])
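As a minimal, self-contained sketch of the two options above, the snippet below uses two short made-up arrays (not the data from the question) and also shows one possible way to silence the warning that numpy.nanmean emits where both inputs are nan. The names avg_zero and avg_skip are only illustrative.

```python
import warnings

import numpy as np

# Small hypothetical arrays standing in for A and B from the question.
A = np.array([np.nan, 10.0, 20.0, np.nan])
B = np.array([np.nan, np.nan, 40.0, 30.0])

# Option 1: treat nan as zero, then average as usual.
avg_zero = (np.nan_to_num(A) + np.nan_to_num(B)) / 2

# Option 2: ignore nan entries when averaging. Positions where both
# A and B are nan trigger a RuntimeWarning, which is suppressed here;
# the result at those positions is still nan.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=RuntimeWarning)
    avg_skip = np.nanmean((A, B), axis=0)

print(avg_zero)  # values: 0, 5, 30, 15
print(avg_skip)  # values: nan, 10, 30, 30
```

Option 1 pulls the average toward zero wherever only one array has a value, so option 2 is probably the better fit when a value present in only A or only B should be kept unchanged.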