How to consider values from arrays and do averaging when array has nan values?

Time:06-28

I have two arrays (A and B) containing either values or nans.

For calculating the average, I sum both up and divide by two.

A:

array([         nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan,          nan, 109.93013333, 121.27613333,
       131.6136    , 142.32926667, 148.2544    , 156.32266667,
       160.3568    , 164.39093333, 168.6772    , 165.2734    ,
       165.77766667, 163.0042    , 164.8952    , 157.83546667,
       145.48093333, 162.89614286, 163.13026667, 151.53213333,
                nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan])

B:

array([         nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
       127.39813333, 141.14986667, 152.5664    , 160.99906667,
       169.04253333, 173.45346667, 179.29146667, 179.55093333,
       180.1996    , 178.51306667, 182.40506667, 173.06426667,
       158.27466667, 163.0748    , 140.76066667, 120.00333333,
        82.5104    ,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan])

avg:

array([         nan,          nan,          nan,          nan,
                    nan,          nan,          nan,          nan,
                    nan,          nan,          nan,          nan,
           129.50586667, 141.73956667, 150.4104    , 158.66086667,
           164.69966667, 168.9222    , 173.98433333, 172.41216667,
           172.98863333, 170.75863333, 173.65013333, 165.44986667,
           151.8778    , 162.98547143, 151.94546667, 135.76773333,
                    nan,          nan,          nan,          nan,
                    nan,          nan,          nan,          nan,
                    nan,          nan,          nan,          nan,
                    nan,          nan,          nan,          nan,
                    nan])

Apparently, the average is calculated only at indices where both arrays have non-nan values.

But how can I also take into account values that are present in only one of A or B?
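
The behavior described in the question can be reproduced on a much smaller input. The sketch below uses short made-up arrays `a` and `b` (hypothetical, not the data above) and shows that any arithmetic involving nan yields nan, so the naive average is a number only where both inputs are numbers.

```python
import numpy as np

# Hypothetical short arrays, for illustration only
a = np.array([np.nan, 10.0, 20.0, np.nan])
b = np.array([np.nan, np.nan, 40.0, 30.0])

# nan propagates through + and /, so only index 2 yields a number
naive_avg = (a + b) / 2
print(naive_avg)
```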

CodePudding user response:

You have two options:

  1. Use numpy.nan_to_num. This approach converts np.nan to zero, so (nan + 20)/2 = 10.
  2. Use numpy.nanmean((A,B), axis=0). This approach skips np.nan values when computing the average, so the mean of nan and 20 is 20. (With this approach, NumPy emits a RuntimeWarning at positions where both values are nan, and the result there is nan.)
# 1
>>> (np.nan_to_num(A) + np.nan_to_num(B))/2
array([  0.        ,   0.        ,   0.        ,   0.        ,
         0.        ,   0.        ,   0.        ,   0.        ,
         0.        ,   0.        ,  54.96506667,  60.63806666,
       129.50586666, 141.73956667, 150.4104    , 158.66086667,
       164.69966666, 168.9222    , 173.98433334, 172.41216667,
       172.98863334, 170.75863334, 173.65013333, 165.44986667,
       151.8778    , 162.98547143, 151.94546667, 135.76773333,
        41.2552    ,   0.        ,   0.        ,   0.        ,
         0.        ,   0.        ,   0.        ,   0.        ,
         0.        ,   0.        ,   0.        ,   0.        ,
         0.        ,   0.        ,   0.        ,   0.        ,
         0.        ])

# 2
>>> np.nanmean((A,B), axis=0)
array([         nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan,          nan, 109.93013333, 121.27613333,
       129.50586666, 141.73956667, 150.4104    , 158.66086667,
       164.69966666, 168.9222    , 173.98433334, 172.41216667,
       172.98863334, 170.75863334, 173.65013333, 165.44986667,
       151.8778    , 162.98547143, 151.94546667, 135.76773333,
        82.5104    ,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan])
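
To compare the two options side by side, here is a minimal sketch on short made-up arrays (`a` and `b` are hypothetical, not the data from the question). It also shows one possible way to silence the "Mean of empty slice" RuntimeWarning that nanmean raises where both inputs are nan, using the standard warnings module. Whether to suppress that warning is a judgment call.

```python
import warnings
import numpy as np

# Hypothetical short arrays, for illustration only
a = np.array([np.nan, 10.0, 20.0, np.nan])
b = np.array([np.nan, np.nan, 40.0, 30.0])

# Option 1: nan is treated as 0, so a lone value gets halved
avg_zero_fill = (np.nan_to_num(a) + np.nan_to_num(b)) / 2

# Option 2: nan is skipped, so a lone value is kept as-is.
# Positions where both inputs are nan stay nan and trigger a
# RuntimeWarning, which is ignored here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=RuntimeWarning)
    avg_skip_nan = np.nanmean([a, b], axis=0)

print(avg_zero_fill)
print(avg_skip_nan)
```

Option 2 is usually what is wanted when a value present in only one array should count in full; option 1 pulls such values toward zero.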