I would like to forward-fill NaN values in a NumPy array, repeating the last non-NaN value at most N times. If a run of NaNs is longer than N, the remaining NaNs should be filled with zero. How do I do this in pure NumPy without iteration?
import numpy as np
n = 2
arr = np.array([np.nan, 0, 0, np.nan, 5, 4, 4, np.nan, np.nan, np.nan, 1, 5, 3, np.nan, 2, np.nan, np.nan])
def ffill(arr: np.ndarray, n: int):
    pass
    return arr
result = np.array([0.0, 0.0, 0.0, 0.0, 5.0, 4.0, 4.0, 4.0, 4.0, 0.0, 1.0, 5.0, 3.0, 3.0, 2.0, 2.0, 2.0])
For example, 4 is forward-filled n (= 2) times, and the third NaN becomes 0: [... 4, np.nan, np.nan, np.nan ...] -> [... 4, 4, 4, 0 ...]
CodePudding user response:
Here is a trick that works:
- Fix the start value
if np.isnan(arr[0]):
    arr[0] = 0
- Now we can keep track of the valid indices with np.cumsum
isnan = np.isnan(arr)
notnan = ~isnan
valid = arr[notnan]
indices = np.cumsum(notnan) - 1
arr = valid[indices]
- To support your requirement that NaNs are replaced with zeros after N steps, you could use
np.convolve(isnan, (1,) * (n + 1), mode='same') > n
to find the indices. But because the 'same' mode of convolve is centered, it's a bit complicated to find the correct index from the convolution. Let's do it manually instead. Yes, this uses a loop, but only over a fixed number of n steps, not over the array elements.
overlimit = np.copy(isnan[n:])
for i in range(1, n + 1):
    overlimit &= isnan[n-i:-i]
indices = np.flatnonzero(overlimit) + n
arr[indices] = 0
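Putting the steps above together, here is a minimal sketch of the whole thing as one function, checked against the expected result from the question. The names `ffill_limited` and `ffill_limited_noloop` are made up for this example. The second function is not part of the trick above: it is an alternative that avoids the loop over n by using `np.maximum.accumulate` to track the index of the last valid entry, assuming the same rule that anything more than n steps away from a valid value becomes 0.

```python
import numpy as np


def ffill_limited(arr, n):
    """Hypothetical helper: the cumsum trick plus the fixed loop over n."""
    arr = np.array(arr, dtype=float)  # work on a copy, leave the input alone
    if arr.size == 0:
        return arr
    if np.isnan(arr[0]):
        arr[0] = 0.0  # nothing to carry forward, so a leading NaN becomes 0
    isnan = np.isnan(arr)
    notnan = ~isnan
    # cumsum counts the valid entries seen so far; subtracting 1 gives the
    # position of the most recent valid entry inside arr[notnan]
    filled = arr[notnan][np.cumsum(notnan) - 1]
    # a position is over the limit if it and the n entries before it are NaN
    overlimit = isnan[n:].copy()
    for i in range(1, n + 1):
        overlimit &= isnan[n - i:-i]
    filled[np.flatnonzero(overlimit) + n] = 0.0
    return filled


def ffill_limited_noloop(arr, n):
    """Hypothetical loop-free variant based on np.maximum.accumulate."""
    arr = np.asarray(arr, dtype=float)
    idx = np.arange(arr.size)
    notnan = ~np.isnan(arr)
    # index of the most recent valid entry at or before each position,
    # or -1 if no valid entry has been seen yet
    last = np.maximum.accumulate(np.where(notnan, idx, -1))
    # keep the carried value only if it exists and is at most n steps back
    keep = (last >= 0) & (idx - last <= n)
    return np.where(keep, arr[last], 0.0)


arr = np.array([np.nan, 0, 0, np.nan, 5, 4, 4, np.nan, np.nan, np.nan,
                1, 5, 3, np.nan, 2, np.nan, np.nan])
expected = np.array([0.0, 0.0, 0.0, 0.0, 5.0, 4.0, 4.0, 4.0, 4.0, 0.0,
                     1.0, 5.0, 3.0, 3.0, 2.0, 2.0, 2.0])
print(np.array_equal(ffill_limited(arr, 2), expected))
print(np.array_equal(ffill_limited_noloop(arr, 2), expected))
```

Both functions return a new array instead of modifying the input. The loop version does n passes over boolean slices, which is cheap for small n; the accumulate version does the same amount of work regardless of n.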