I want to calculate the minimum value of column A up to and including each row where the cond column is True, then restart the calculation from the row after it.
The result is assigned to the expected column and carried forward until the next row where cond is True.
Input:
import pandas as pd
import numpy as np
A=[16,12,21,15,18,19,13,16,10,50]
cond=[False,False,True,False,False,True,False,False,True,False]
df=pd.DataFrame({'A':A,'cond':cond})
df
Expected table:
   A   cond   expected
0  16  False
1  12  False
2  21  True   12
3  15  False  12
4  18  False  12
5  19  True   15
6  13  False  15
7  16  False  15
8  10  True   10
9  50  False  10
The value at index 5 is the minimum of A from index 3 to index 5.
The value at index 8 is the minimum of A from index 6 to index 8.
CodePudding user response:
Calculate the reverse cumsum on cond to identify blocks of rows, then group column A by these blocks and transform with min to get the minimum value per block. Then mask the values where cond is False and use ffill to propagate the last minimum forward.
b = df['cond'][::-1].cumsum()
df['result'] = df['A'].groupby(b).transform('min').mask(~df['cond']).ffill()
A cond result
0 16 False NaN
1 12 False NaN
2 21 True 12.0
3 15 False 12.0
4 18 False 12.0
5 19 True 15.0
6 13 False 15.0
7 16 False 15.0
8 10 True 10.0
9 50 False 10.0
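To make the block labels visible, here is a minimal sketch that splits the one-liner above into its intermediate steps. The names block_labels, block_min, and block_min_list are only illustrative, and where(df['cond']) is used as an equivalent of mask(~df['cond']).

```python
import pandas as pd

df = pd.DataFrame({
    'A': [16, 12, 21, 15, 18, 19, 13, 16, 10, 50],
    'cond': [False, False, True, False, False, True, False, False, True, False],
})

# Reverse cumulative sum: every True row closes a block, so rows that share
# a label belong to the same block (labels decrease from top to bottom).
b = df['cond'][::-1].cumsum()
block_labels = b.sort_index().tolist()

# Minimum of A within each block, broadcast to every row of that block.
block_min = df['A'].groupby(b).transform('min')
block_min_list = block_min.tolist()

# Keep the minimum only on rows where cond is True, then forward fill.
result = block_min.where(df['cond']).ffill()
```

Rows after the last True row form their own block (label 0 here), but their minimum is never kept because cond is False on all of them.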
CodePudding user response:
I'm not very familiar with pandas, but this builds a Python list with the expected values.
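A=[16,12,21,15,18,19,13,16,10,50]
cond=[False,False,True,False,False,True,False,False,True,False]
output = []
# Start above every value in A so the first comparison always succeeds.
low = max(A) + 1
# Value written before the first True row (0 here, not empty).
lowPrint = 0
for i, j in zip(A, cond):
    if i < low: low = i
    if j:
        # A True row closes the block: publish its minimum and reset.
        lowPrint = low
        low = max(A) + 1
    output.append(lowPrint)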
As I said, I don't know much about pandas, but I assume you can use this to get the values and then do what you want with them.
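For what it's worth, one way to put such a list into the DataFrame is sketched below. It is a variant of the loop above, not the same code: it assumes None (instead of 0) is wanted before the first True row so that those rows become NaN, and the name last_min is only illustrative.

```python
import pandas as pd

A = [16, 12, 21, 15, 18, 19, 13, 16, 10, 50]
cond = [False, False, True, False, False, True, False, False, True, False]

output = []
low = None       # running minimum of the current block
last_min = None  # minimum of the most recently closed block
for value, flag in zip(A, cond):
    low = value if low is None else min(low, value)
    if flag:
        # A True row closes the block: publish its minimum and reset.
        last_min = low
        low = None
    output.append(last_min)

df = pd.DataFrame({'A': A, 'cond': cond})
# None entries become NaN once the list is converted to a float column.
df['expected'] = pd.Series(output, dtype='float64')
```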
CodePudding user response:
You can get the desired column by computing the minimums ms of each range and then doing a forward fill.
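import numpy as np
import pandas as pd

arr = np.array([16, 12, 21, 15, 18, 19, 13, 16, 10, 50])
c = np.array([0, 0, 1, 0, 0, 1, 0, 0, 1, 0]).astype(bool)
# Positions of the True rows; each one ends a range.
i = np.flatnonzero(c)
# Split after each True row and drop the trailing piece after the last one.
splits = np.split(arr, i + 1)[:-1]
ms = [s.min() for s in splits]
arr = arr.astype(float)
arr[~c] = np.nan
arr[c] = ms
df = pd.DataFrame(arr).ffill()
df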
output:
0
0 NaN
1 NaN
2 12.0
3 12.0
4 12.0
5 15.0
6 15.0
7 15.0
8 10.0
9 10.0
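The same steps can be wrapped in a small function, sketched here under a couple of assumptions: block_min_ffill is a hypothetical name, and an input with no True rows is assumed to give all NaN.

```python
import numpy as np
import pandas as pd

def block_min_ffill(values, cond):
    """Hypothetical helper: minimum of each range ending at a True row, forward filled."""
    values = np.asarray(values)
    cond = np.asarray(cond, dtype=bool)
    out = np.full(values.shape, np.nan)
    ends = np.flatnonzero(cond)
    if ends.size:
        # Split after each True row; the final piece holds the rows after
        # the last True (possibly none) and is dropped.
        blocks = np.split(values, ends + 1)[:-1]
        out[ends] = [block.min() for block in blocks]
    return pd.Series(out).ffill()

result = block_min_ffill(
    [16, 12, 21, 15, 18, 19, 13, 16, 10, 50],
    [False, False, True, False, False, True, False, False, True, False],
)
```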