How would you do this in NumPy without using a for-loop or while-loop and, if possible, in the fastest way in terms of execution time?
It is something like "if the value in a specific column and row is higher than 20, assign the value from the previous row".
(I'm using column 2 in this case.)
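import numpy as np

a = np.random.randint(1, 50, size=(10, 5))
print(a)
print('--------------')
# Row 0 is never changed; each later row copies the already-updated value above it.
for index in range(len(a)):
    if (a[index][2] > 20) and index > 0:
        a[index][2] = a[index-1][2]
print(a)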
[[21 24 | 7| 10 12]
[ 4 36 |42| 4 48]
[37 43 |36| 30 13]
[39 10 |45| 45 15]
[46 48 |25| 39 20]
[ 2 33 |37| 38 28]
[23 36 |14| 29 33]
[17 24 |47| 4 9]
[45 9 |42| 34 3]
[13 49 |26| 14 34]]
--------------
[[21 24 | 7| 10 12]
[ 4 36 | 7| 4 48]
[37 43 | 7| 30 13]
[39 10 | 7| 45 15]
[46 48 | 7| 39 20]
[ 2 33 | 7| 38 28]
[23 36 |14| 29 33]
[17 24 |14| 4 9]
[45 9 |14| 34 3]
[13 49 |14| 14 34]]
CodePudding user response:
Since each value that is <= 20 is repeated until the next such value, you can use np.repeat:
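arr = a[:, 2]                                  # view of column 2
cutoff_idx = np.flatnonzero(arr <= 20)         # rows whose value is kept
vals = arr[cutoff_idx]                         # copy of the kept values
counts = np.diff(cutoff_idx, append=len(arr))  # how many rows each kept value covers
if len(cutoff_idx) > 0:
    n = cutoff_idx[0]
    # Custom choice: rows before the first value <= 20 stay unchanged
    # (the loop in the question would copy a[0, 2] into them instead).
    a[:n, 2] = arr[:n]
    a[n:, 2] = np.repeat(vals, counts)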
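The np.repeat idea can also be wrapped in a function that reproduces the loop from the question exactly, including the rows before the first value <= 20. This is only a minimal sketch: the name fill_with_repeat and the limit parameter are made up for illustration, and it assumes a 1-D column is passed in.

```python
import numpy as np

def fill_with_repeat(col, limit=20):
    """Return a new array where each value > limit takes the last kept value.

    Hypothetical helper (not from the question). Row 0 is always kept,
    because the loop in the question never overwrites it.
    """
    col = np.asarray(col)
    if col.size == 0:
        return col.copy()
    keep = col <= limit
    keep[0] = True                           # row 0 has no previous row
    idx = np.flatnonzero(keep)               # positions of the kept values
    counts = np.diff(idx, append=col.size)   # run length of each kept value
    return np.repeat(col[idx], counts)
```

It could then be applied to the column with a[:, 2] = fill_with_repeat(a[:, 2]).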
CodePudding user response:
Try:
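spec_col = a[:, 2]                          # view, so assigning to it updates a in place
cond = np.where(spec_col <= 20)[0]          # rows to keep (<= 20, the complement of the loop's "> 20" test)
diff = np.diff(cond, append=len(spec_col))  # how many rows each kept value covers
cond_vals = np.take(spec_col, cond)
repeated_arr = np.repeat(cond_vals, diff)
if cond.size:                               # nothing to fill from if no value is <= 20
    spec_col[cond[0]:] = repeated_arr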
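A different way to get the same result, swapped in here as an alternative to np.repeat, is an index forward-fill with np.maximum.accumulate: mark the kept rows with their own index, take a running maximum, and index the column with it. This is a sketch under the same assumptions as the loop in the question (row 0 is never changed); the function name forward_fill_over_limit and the limit parameter are illustrative only.

```python
import numpy as np

def forward_fill_over_limit(col, limit=20):
    """Return a new array where each value > limit takes the last kept value.

    Hypothetical helper (not from the question), using a running maximum
    over row indices instead of np.repeat.
    """
    col = np.asarray(col)
    if col.size == 0:
        return col.copy()
    keep = col <= limit
    keep[0] = True                                   # row 0 is always kept
    # Kept rows carry their own index, other rows carry 0; the running
    # maximum is then the index of the most recent kept row.
    marked = np.where(keep, np.arange(col.size), 0)
    last_kept = np.maximum.accumulate(marked)
    return col[last_kept]
```

As with the other answers, the result can be written back with a[:, 2] = forward_fill_over_limit(a[:, 2]). Which variant is faster likely depends on the array size, so timing both on real data is the safest way to choose.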