Take previous row with conditional in numpy

Time:11-28

How would you do this in NumPy without a for-loop or while-loop and, if possible, in the fastest way in terms of execution time?

The rule is something like "if the value in a specific column and row is higher than 20, assign the value from the previous row".

(I'm using column 2 in this case.)

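import numpy as np

a = np.random.randint(1, 50, size=(10, 5))

print(a)
print('--------------')

# Walk down column 2. Any value above 20 (except in row 0) is overwritten with
# the value from the row above, which may itself have just been overwritten.
for index in range(1, len(a)):
    if a[index, 2] > 20:
        a[index, 2] = a[index - 1, 2]

print(a)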


[[21 24 | 7| 10 12]
 [ 4 36 |42|  4 48]
 [37 43 |36| 30 13]
 [39 10 |45| 45 15]
 [46 48 |25| 39 20]
 [ 2 33 |37| 38 28]
 [23 36 |14| 29 33]
 [17 24 |47|  4  9]
 [45  9 |42| 34  3]
 [13 49 |26| 14 34]]
--------------
[[21 24 | 7| 10 12]
 [ 4 36 | 7|  4 48]
 [37 43 | 7| 30 13]
 [39 10 | 7| 45 15]
 [46 48 | 7| 39 20]
 [ 2 33 | 7| 38 28]
 [23 36 |14| 29 33]
 [17 24 |14|  4  9]
 [45  9 |14| 34  3]
 [13 49 |14| 14 34]]

CodePudding user response:

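Since each kept value (<= 20) is repeated until the next kept value appears, you can use np.repeat:

arr = a[:, 2]  # view of column 2

cutoff_idx = np.flatnonzero(arr <= 20)          # rows whose value is kept
vals = arr[cutoff_idx]                          # the kept values (a copy)
counts = np.diff(cutoff_idx, append=len(arr))   # how many rows each kept value fills

if len(cutoff_idx) > 0:
    n = cutoff_idx[0]
    # No-op: rows before the first value <= 20 are left as they are. This is a
    # custom choice since there's no info; the question's loop would fill them with a[0, 2].
    a[:n, 2] = arr[:n]
    a[n:, 2] = np.repeat(vals, counts)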
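
To see why this works, it can help to trace the intermediate arrays on the sample column from the question. The sketch below hard-codes that column (copied from the printed output in the question) instead of generating random data; the names `col` and `filled` are introduced here for illustration only.

```python
import numpy as np

# Column 2 from the question's sample output (hard-coded for illustration).
col = np.array([7, 42, 36, 45, 25, 37, 14, 47, 42, 26])

cutoff_idx = np.flatnonzero(col <= 20)          # positions of kept values: [0 6]
vals = col[cutoff_idx]                          # kept values: [ 7 14]
counts = np.diff(cutoff_idx, append=len(col))   # run lengths: [6 4]

filled = np.repeat(vals, counts)
print(filled)  # [ 7  7  7  7  7  7 14 14 14 14]
```

The value 7 at row 0 fills rows 0 to 5, and the value 14 at row 6 fills rows 6 to 9, which matches the expected output in the question.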

CodePudding user response:

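Try:

spec_col = a[:, 2]                            # view, so writes go through to a
cond = np.where(spec_col <= 20)[0]            # rows whose value is kept (the loop replaces values > 20)
diff = np.diff(cond, append=len(spec_col))    # how many rows each kept value fills
cond_vals = np.take(spec_col, cond)
repeated_arr = np.repeat(cond_vals, diff)
if cond.size > 0:                             # cond[0] would fail when no value is <= 20
    spec_col[cond[0]:] = repeated_arr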
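
A possible alternative, not taken from either answer: a forward-fill by index using np.maximum.accumulate. As far as I can tell it reproduces the question's loop exactly, including the case where the column starts with values above 20, because row 0 is always treated as kept. The function name `fill_from_previous` and its parameters `col` and `limit` are made up for this sketch.

```python
import numpy as np

def fill_from_previous(a, col=2, limit=20):
    """Return a copy of `a` where each value > limit in column `col`
    is replaced by the most recent kept value above it."""
    out = a.copy()
    column = out[:, col]
    if column.size == 0:
        return out
    keep = column <= limit
    keep[0] = True  # the question's loop never replaces row 0
    # Row index where the value is kept, 0 elsewhere; the running maximum
    # then gives, for every row, the index of the last kept row so far.
    idx = np.where(keep, np.arange(len(column)), 0)
    idx = np.maximum.accumulate(idx)
    out[:, col] = column[idx]  # fancy indexing copies, so this is safe
    return out

a = np.random.randint(1, 50, size=(10, 5))
print(fill_from_previous(a))
```

Unlike the np.repeat versions, this needs no special case for an empty set of kept rows, at the cost of building one extra index array.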