Home > Software engineering >  Find the first occurrence that matches multiple conditions in a dataframe
Find the first occurrence that matches multiple conditions in a dataframe

Time:06-29

I have a general Date and Values Dataframe and I'm trying to use it create another more filtered one with certain ranges.

I'm having a hard time trying to find the first time a Value is either equal or greater than the start Value of the range after the Start Date.

general Dataframe:

             Value
Date               
2022-06-24     1302
2022-06-23     942
2022-06-22     346
2022-06-21     912
2022-06-17     245
2022-06-16     762
2022-06-15     899
2022-06-14     927
2022-06-13     234
2022-06-10     955
2022-06-09     1372

For example, in this case, I want to find in my general Dataframe, the first date, where a value gets equal or greater than the Start Value (927), but after the Start Date (2022-06-14), so, 2022-06-23. (That is the first date where the value (942) is greater than the Start Value and happened after the range Start Date).

Start Date   End Date  Start Value  End Value  First recurrence 
2022-06-14 2022-06-17          927        245        2022-06-23

Edit[1]: The ranges are made in other part of the code(Start Date, End Date, Start Value, End Value), I'm only interested on the First recurrence calculation.

PS. The way the ranges are structured the "First recurrence" will always be outside the range anyway. So no need to care about the End Date.

I've tried many things but so far I've not been able to filter for both Date and Value with my Filtered DataFrame having a different length than the general one.

CodePudding user response:

This should do what you've asked:

Method 1:

x = df[(df.index > df2['Start Date'].iloc[0]) & (df.Value > df2['Start Value'].iloc[0])]
df2['First recurrence'] = x.index.min()

Input:

df:
            Value
Date
2022-06-24   1302
2022-06-23    942
2022-06-22    346
2022-06-21    912
2022-06-17    245
2022-06-16    762
2022-06-15    899
2022-06-14    927
2022-06-13    234
2022-06-10    955
2022-06-09   1372

df2:
  Start Date   End Date  Start Value  End Value
0 2022-06-14 2022-06-17          927        245

Output:

  Start Date   End Date  Start Value  End Value First recurrence
0 2022-06-14 2022-06-17          927        245       2022-06-23

Method 2:

s = df2.squeeze()
x = df[(df.index > s['Start Date']) & (df.Value > s['Start Value'])]
df2['First recurrence'] = x.index.min()
  • Related