Home > Software design >  Find a sequence of same value in a dataframe column
Find a sequence of same value in a dataframe column

Time:08-24

I need to check if a condition happened in the previous N rows of a dataframe column with the same value. What is the best way to do that?

Examples:

                              previous N columns  current column
                                   |  |  |  |  |  |
dataframe['condition'] = [F, F, F, T, T, T, T, T, F]
dataframe['operator']  = [F, F, F, F, T, T, F, F, F]
                                      ^  ^

I need to know if in the rows before the current one where condition is (T)rue if there was any operator that is also (T)rue.

In the example above the check would be true. however, if the columns are as below it should be False:

dataframe['condition'] = [F, F, F, T, T, T, T, T, F]
dataframe['operator']  = [F, F, F, F, F, F, F, F, F]

CodePudding user response:

# your sample data
dataframe = pd.DataFrame()
dataframe['condition'] = ['F', 'F', 'F', 'T', 'T', 'T', 'T', 'T', 'F']
dataframe['operator']  = ['F', 'F', 'F', 'F', 'T', 'T', 'F', 'F', 'F']

n = 5 # set the number of rows to look back
current_pos = 8 # the index of the current position
# use .iloc to filter to current position and tail to look back n rows
dataframe.iloc[:current_pos, 1].tail(n).isin(['T']).any() # -> True

CodePudding user response:

IIUC, you need to use "condition" to define the range of rows to check. For this you need a groupby.

What is unclear is the format of the output. I provided both a flag of the whole group and of only the False condition.

# group by stretched of T followed by F
g = (~dataframe['condition'][::-1]).cumsum()

# is there any T among the corresponding operators?
dataframe['result'] = dataframe.groupby(g)['operator'].transform(lambda x: x.iloc[:-1].any())

# flag only last value of group
dataframe['result2'] = dataframe['result'].mask(dataframe['condition'], False)

Output case 1:

   condition  operator  result  result2
0      False     False   False    False
1      False     False   False    False
2      False     False   False    False
3       True     False    True    False
4       True      True    True    False
5       True      True    True    False
6       True     False    True    False
7       True     False    True    False
8      False     False    True     True

Output case 2:

   condition  operator  result  result2
0      False     False   False    False
1      False     False   False    False
2      False     False   False    False
3       True     False   False    False
4       True     False   False    False
5       True     False   False    False
6       True     False   False    False
7       True     False   False    False
8      False     False   False    False

Example with several stretches:

   condition  operator  result  result2
0       True      True    True    False
1      False     False    True     True
2      False     False   False    False
3      False     False   False    False
4       True     False    True    False
5       True     False    True    False
6       True      True    True    False
7       True     False    True    False
8       True     False    True    False
9      False     False    True     True
  • Related