How to ffill nan values in a numpy array using the last non-nan values repeating N times

Time:11-24

I would like to forward-fill NaN values in a NumPy array, repeating the last non-NaN value at most N times. If a run contains more than N NaN values, the remaining NaNs should be filled with zero. How can I do this in pure NumPy without iteration?

import numpy as np

n = 2
arr = np.array([np.nan, 0, 0, np.nan, 5, 4, 4, np.nan, np.nan, np.nan, 1, 5, 3, np.nan, 2, np.nan, np.nan])

def ffill(arr: np.ndarray, n: int) -> np.ndarray:
    pass
    return arr

result = np.array([0.0, 0.0, 0.0, 0.0, 5.0, 4.0, 4.0, 4.0, 4.0, 0.0, 1.0, 5.0, 3.0, 3.0, 2.0, 2.0, 2.0])

For example, the 4 is forward-filled n (= 2) times and the remaining NaN becomes 0: [... 4, np.nan, np.nan, np.nan ...] -> [... 4, 4, 4, 0 ...]
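To pin down the expected behaviour, here is a plain-loop reference sketch. It is only meant as a specification to test a vectorized solution against, not as the answer being asked for. The name ffill_reference is made up for illustration, and it assumes a leading NaN (with no previous value) becomes 0, as in the result above.

```python
import numpy as np


def ffill_reference(arr, n):
    """Loop-based reference: repeat the last non-NaN value at most n times, then 0."""
    out = np.zeros(len(arr), dtype=float)
    last = 0.0    # most recent non-NaN value
    run = 0       # NaNs seen since that value
    seen = False  # whether any non-NaN value has been seen yet
    for i, x in enumerate(arr):
        if np.isnan(x):
            run += 1
            # fill only while the gap is within the limit and a value exists
            out[i] = last if (seen and run <= n) else 0.0
        else:
            last, run, seen = x, 0, True
            out[i] = x
    return out
```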

CodePudding user response:

Here is a trick that works:

  1. Fix the start value, since a leading NaN has no previous value to fill from
if np.isnan(arr[0]):
    arr[0] = 0
  2. Now we can keep track of the valid indices with np.cumsum
isnan = np.isnan(arr)
notnan = ~isnan
valid = arr[notnan]
# for each position, the index into valid of the most recent non-NaN value
indices = np.cumsum(notnan) - 1
arr = valid[indices]
  3. To support your requirement that NaNs are replaced with zeros after N steps, you could use np.convolve(isnan, (1,) * (n + 1), mode='same') > n to find the indices. But because convolve is centered, it's a bit complicated to find the correct index from the convolution. Let's do it manually instead. Yes, this uses a loop, but only a fixed number of iterations (N), independent of the array length.
# overlimit[k] is True when positions k through k + n are all NaN
overlimit = np.copy(isnan[n:])
for i in range(1, n + 1):
    overlimit &= isnan[n-i:-i]
indices = np.flatnonzero(overlimit) + n
arr[indices] = 0
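Put together, the steps above might look like the sketch below. The function names ffill_limit and ffill_limit_noloop are made up for illustration. The first function follows the steps as written, except that it works on a copy so the caller's array is not modified. The second is a different technique, not the one from this answer: it uses np.maximum.accumulate to measure the distance to the last non-NaN value, which avoids the Python loop entirely. Both assume a 1-D float array and n >= 0.

```python
import numpy as np


def ffill_limit(arr: np.ndarray, n: int) -> np.ndarray:
    """Steps 1-3 above: forward-fill, then zero out NaNs beyond n per gap."""
    arr = np.array(arr, dtype=float)  # copy, so the input stays untouched
    if arr.size == 0:
        return arr
    if np.isnan(arr[0]):
        arr[0] = 0.0  # a leading NaN has nothing to fill from

    isnan = np.isnan(arr)
    notnan = ~isnan
    # index into the compacted non-NaN values of the most recent valid entry
    filled = arr[notnan][np.cumsum(notnan) - 1]

    # overlimit[k] is True when positions k .. k + n are all NaN,
    # so position k + n is past the fill limit of its gap
    overlimit = isnan[n:].copy()
    for i in range(1, n + 1):
        overlimit &= isnan[n - i:-i]
    filled[np.flatnonzero(overlimit) + n] = 0.0
    return filled


def ffill_limit_noloop(arr: np.ndarray, n: int) -> np.ndarray:
    """Alternative sketch without a Python loop (not the answer's method)."""
    arr = np.asarray(arr, dtype=float)
    notnan = ~np.isnan(arr)
    pos = np.arange(arr.size)
    # position of the most recent non-NaN value at or before each index, -1 if none
    last = np.maximum.accumulate(np.where(notnan, pos, -1))
    gap = pos - last  # 0 on valid entries, 1 on the first NaN of a gap, ...
    keep = (last >= 0) & (gap <= n)
    # where last == -1, arr[last] reads the final element, but keep masks it out
    return np.where(keep, arr[last], 0.0)


n = 2
arr = np.array([np.nan, 0, 0, np.nan, 5, 4, 4, np.nan, np.nan, np.nan,
                1, 5, 3, np.nan, 2, np.nan, np.nan])
result = np.array([0.0, 0.0, 0.0, 0.0, 5.0, 4.0, 4.0, 4.0, 4.0, 0.0,
                   1.0, 5.0, 3.0, 3.0, 2.0, 2.0, 2.0])
print(np.array_equal(ffill_limit(arr, n), result))
print(np.array_equal(ffill_limit_noloop(arr, n), result))
```

The loop in ffill_limit runs n times regardless of array length, so it is cheap for a small n. The accumulate-based variant may be preferable when n is large.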