Filling parts of a list without a loop

Time:03-09

I have the following list or numpy array

  ll=[7.2,0,0,0,0,0,6.5,0,0,-8.1,0,0,0,0]

and an additional list indicating the positions of non-zeros

  i=[0,6,9]

I would like to build two new lists from them: one that fills each zero with the preceding non-zero value, and one that counts the positions since the last non-zero value. For this short example:

  a=[7.2,7.2,7.2,7.2,7.2,7.2,6.5,6.5,6.5,-8.1,-8.1,-8.1,-8.1,-8.1]
  b=[0,1,2,3,4,5,0,1,2,0,1,2,3,4]

Is there a way to do that without a for loop to speed things up? The list ll is quite long in my case.

CodePudding user response:

So...

import numpy as np

ll = np.array([7.2, 0, 0, 0, 0, 0, 6.5, 0, 0, -8.1, 0, 0, 0, 0])
i = np.array([0, 6, 9])

counts = np.append(
  np.diff(i),       # differences between consecutive elements of i
                    #    (one element shorter than i)
  len(ll) - i[-1],  # length of the last run
)
repeated = np.repeat(ll[i], counts)

repeated becomes

[ 7.2  7.2  7.2  7.2  7.2  7.2  6.5  6.5  6.5 -8.1 -8.1 -8.1 -8.1 -8.1]

b could be computed with

b = np.concatenate([np.arange(c) for c in counts])
print(b)
# [0 1 2 3 4 5 0 1 2 0 1 2 3 4]

but that involves a loop in the form of that list comprehension; perhaps someone Numpyier could implement it without a Python loop.
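One possible loop-free variant, offered as a sketch rather than a definitive answer: repeat each start index over its run and subtract it from a single np.arange. It assumes i[0] == 0 (the first element is a fill value), as in the example; the name starts is introduced here for illustration only.

```python
import numpy as np

ll = np.array([7.2, 0, 0, 0, 0, 0, 6.5, 0, 0, -8.1, 0, 0, 0, 0])
i = np.array([0, 6, 9])

# Length of each run: gaps between start indices, plus the final run.
counts = np.append(np.diff(i), len(ll) - i[-1])

# Start index of the run that each position belongs to
# (assumes i[0] == 0, otherwise the lengths would not match).
starts = np.repeat(i, counts)

# Offset of each position from the start of its run.
b = np.arange(len(ll)) - starts
print(b)
```

Both np.repeat and np.arange run in compiled code, so no Python-level loop is involved.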

CodePudding user response:

Array a is the result of a forward fill, and array b holds the offset of each element from the most recent non-zero element.

pandas has a forward fill function, but it should be easy enough to compute with numpy and there are many sources on how to do this.

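import numpy as np

ll=[7.2,0,0,0,0,0,6.5,0,0,-8.1,0,0,0,0]
a = np.array(ll)

# mark zero elements; keep the index of each non-zero element, 0 elsewhere
mask = a == 0
idx = np.where(~mask, np.arange(mask.size), 0)

# do the fill: the running maximum is the index of the last non-zero element
a[np.maximum.accumulate(idx)]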

output:

array([ 7.2,  7.2,  7.2,  7.2,  7.2,  7.2,  6.5,  6.5,  6.5, -8.1, -8.1,
       -8.1, -8.1, -8.1])
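To make the trick easier to follow, here is a small trace of the intermediate arrays. It is only a sketch: the names last_seen and filled are introduced for illustration, and the approach assumes that the first element is non-zero and that a zero never occurs as a genuine value.

```python
import numpy as np

a = np.array([7.2, 0, 0, 0, 0, 0, 6.5, 0, 0, -8.1, 0, 0, 0, 0])

mask = a == 0
# Own index where the value is non-zero, 0 where it is a placeholder.
idx = np.where(~mask, np.arange(mask.size), 0)
print(idx)        # [0 0 0 0 0 0 6 0 0 9 0 0 0 0]

# The running maximum carries the last non-zero index forward.
last_seen = np.maximum.accumulate(idx)
print(last_seen)  # [0 0 0 0 0 0 6 6 6 9 9 9 9 9]

# Fancy indexing with those indices performs the forward fill.
filled = a[last_seen]
print(filled)
```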

More information about forward fill is found here:

To compute array b, you could reuse the forward-filled indices and subtract them from a single np.arange:

fill_mask = np.maximum.accumulate(idx)
np.arange(len(fill_mask)) - fill_mask

output:

array([0, 1, 2, 3, 4, 5, 0, 1, 2, 0, 1, 2, 3, 4])
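Since pandas was mentioned above, here is a rough pandas sketch of the same two results, assuming pandas is available and that zeros are only ever placeholders. The variable names s and groups are illustrative; this is one possible way to write it, not necessarily the fastest.

```python
import numpy as np
import pandas as pd

s = pd.Series([7.2, 0, 0, 0, 0, 0, 6.5, 0, 0, -8.1, 0, 0, 0, 0])

# Replace zeros with NaN, then forward fill from the last valid value.
a = s.mask(s == 0).ffill().to_numpy()

# Each non-zero value starts a new group; cumcount numbers rows within a group.
groups = (s != 0).cumsum()
b = s.groupby(groups).cumcount().to_numpy()

print(a)
print(b)
```

For very long arrays the pure numpy version is likely to carry less overhead, but that is worth timing on the actual data.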