Home > Blockchain >  Mark True from conditions satisfy on two consecutive values till another two consecutive values
Mark True from conditions satisfy on two consecutive values till another two consecutive values

Time:03-08

I have a float column in a dataframe. And I want to add another boolean column which will be True if condition satisfies on two consecutive values till another condition satisfies on next two consecutive values.

For Example I have a data-frame which look like this:

index Values %
0 0
1 5
2 11
3 9
4 14
5 18
6 30
7 54
8 73
9 100
10 100
11 100
12 100
13 100

Now I want to mark True from where two consecutive values satisfies the condition df['Values %'] >= 10 till next two consecutive values satisfies the next condition i.e. df[Values %] == 100.

So the final result will look like something this:

index Values % Flag
0 0 False
1 5 False
2 11 False
3 9 False
4 14 False
5 18 True
6 30 True
7 54 True
8 73 True
9 100 True
10 100 True
11 100 False
12 100 False
13 100 False

CodePudding user response:

Not sure how exactly the second part of your question is supposed to work but here is how to achieve the first.

example data

s = pd.Series([0,5,11,9,14,18,2,14,16,18])

solution

# create true/false series for first condition and take cumulative sum
x = (s >= 10).cumsum()

# compare each element of x with 2 elements before.  There will be a difference of 2 for elements which belong to streak of 2 or more True
condition = x - x.shift(2) == 2

condition looks like this

0    False
1    False
2    False
3    False
4    False
5     True
6    False
7    False
8     True
9     True
dtype: bool

CodePudding user response:

I have a rather inefficient way of doing this. It's not vectorised, so not ideal, but it works:

# Convert the values column to a 1D NumPy array for ease of use.
values = df["Values %"].tolist()
values_np = np.array(values)

# Initialize flags 1D array to be the same size as values_np. Initially set to all 0s. Uses int form of booleans, i.e. 0 = False and 1 = True.
flags = np.zeros((values_np.shape[0]), dtype=int)

# Iterate from 1st (not 0th) row to last row.
for i in range(1, values_np.shape[0]):
    # First set flag to 1 (True) if meets the condition that consecutive values are both >= 10.
    if values_np[i] >= 10 and values_np[i-1] >= 10:
        flags[i] = 1

    # Then if consecutive values are both larger than 100, set flag to 0 (False).
    if values_np[i] >= 100 and values_np[i-1] >= 100:
        flags[i] = 0

# Turn flags into boolean form (i.e. convert 0 and 1 to False and True).
flags = flags.astype(bool)

# Add flags as a new column in df.
df["Flags"] = flags

One thing -- my method gives False for row 10, because both row 9 and row 10 >= 100. If this is not what you wanted, let me know and I can change it so that the flag is True only if the previous two values and the current value (3 consecutive values) are all >= 100.

  • Related