How to define a column of a dataframe using conditions based on its own previous values?

Time:12-07

Say I have a DataFrame, df, as below, with a 'Value' column that I'd like to apply some boolean analysis to.

date     Value 
10/11    0.798
11/11    1.235
12/11    0.890
13/11    0.756
14/11    0.501
...

Essentially, I'd like to create a new column that switches to TRUE when the value is greater than 1, and remains true unless the value drops below 0.75. For example, it would look like the below using df:

column
FALSE
TRUE
TRUE
TRUE
FALSE

I am struggling to find an appropriate way to reference the previous value of a column I am defining without running into some error. The logic I want to use is as below:

df['column'] = (df['Value'] >= 1) | ((df['column'].shift(1) == True) & (df['Value'] >= 0.75))

Is there a way that I can achieve this without over-complicating things?

CodePudding user response:

A possible solution:

val1, val2 = 1, 0.75

out = (df.assign(
    new=df.Value.where(df.Value.gt(val1) | df.Value.lt(val2))
    .ffill().gt(val1)))

print(out)

Output:

    date  Value    new
0  10/11  0.798  False
1  11/11  1.235   True
2  12/11  0.890   True
3  13/11  0.756   True
4  14/11  0.501  False
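
To see why this works, it may help to look at the intermediate steps. The sketch below rebuilds the sample frame from the question and prints each stage. The names `masked` and `filled` are only illustrative and are not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["10/11", "11/11", "12/11", "13/11", "14/11"],
    "Value": [0.798, 1.235, 0.890, 0.756, 0.501],
})

val1, val2 = 1, 0.75

# Keep only the values that cross a threshold; everything in between becomes NaN.
masked = df["Value"].where(df["Value"].gt(val1) | df["Value"].lt(val2))

# Carry the most recent threshold crossing forward over the NaN gaps.
filled = masked.ffill()

# The flag is on exactly when the last crossing was above the upper threshold.
# A leading NaN (no crossing seen yet) compares as False.
df["new"] = filled.gt(val1)

print(pd.DataFrame({"masked": masked, "filled": filled, "new": df["new"]}))
```

Note that `gt` and `lt` are strict comparisons, so a value of exactly 1 or exactly 0.75 leaves the flag unchanged. If a value of exactly 1 should switch the flag on, as the `>=` in the question's one-liner suggests, swapping `gt(val1)` for `ge(val1)` in both places should do it.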

CodePudding user response:

Calling a function with apply might also help, with some "remembering" logic to track the state between rows.

res = True  # True: waiting for Value > 1.0; False: waiting for Value < 0.75

def CheckRow(row):
    global res
    if res:
        if row['Value'] > 1.0:
            res = False  # next time check for < 0.75
            return True
        else:
            return False
    else:
        if row['Value'] < 0.75:
            res = True  # next time check for > 1.0
            return False
        else:
            return True

# Reset res to True before re-running, since the state persists between calls.
df['column'] = df.apply(CheckRow, axis=1)
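
If the global flag feels fragile, the same "remembering" logic can be written as a plain loop that keeps its state in a local variable. This is only a sketch: the helper name `hysteresis` and its `upper`/`lower` parameters are hypothetical and not part of the answer above.

```python
import pandas as pd

def hysteresis(values, upper=1.0, lower=0.75):
    """Return a list of booleans: on above `upper`, off below `lower`, otherwise unchanged."""
    state = False  # assumed starting state: off until the first value above `upper`
    out = []
    for v in values:
        if v > upper:
            state = True
        elif v < lower:
            state = False
        # Values between the thresholds (and NaN) leave the state as it was.
        out.append(state)
    return out

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["10/11", "11/11", "12/11", "13/11", "14/11"],
    "Value": [0.798, 1.235, 0.890, 0.756, 0.501],
})

df["column"] = hysteresis(df["Value"])
print(df)
```

Because the state lives inside the function, it can be called repeatedly or on several columns without resetting anything in between.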