How to slice dataframe based on increment of value in a column?

Time:04-09

import pandas as pd

frame = pd.Series([4051, 4052, 4053, 4054, 4060, 4061])
heat = pd.Series([52, 51, 50, 52, 53, 50])
df_1 = pd.DataFrame()
df_1['Frame'] = frame
df_1['Heat'] = heat

I have a dataframe df_1. I want to build a dataframe df_2 that contains only the leading rows of df_1 for which the increment of Frame from one row to the next is less than or equal to 3. As soon as the increment is larger, the search should stop. I tried this:

i = 0
df_2 = pd.DataFrame()

for i in df_1['Frame']:
    j = i + 1
    if (df_1['Frame'][j] - df_1['Frame'][i]) > 3:
        break
    else: 
        df_2.append(i)

It results in an error. Can you find my mistake? If possible, I would prefer a solution without a loop since loops tend to be slow.

My desired output would be:

frame = pd.Series([4051, 4052, 4053, 4054])
heat = pd.Series([52, 51, 50, 52])
df_1 = pd.DataFrame()
df_1['Frame'] = frame
df_1['Heat'] = heat

CodePudding user response:

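Use Series.diff to get the row-to-row increments of Frame, compare them with Series.gt(3), and apply Series.cummax so the mask stays True from the first increment greater than 3 onward. Then invert the mask with ~ (bitwise NOT) and use it for boolean indexing: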

df_1 = df_1[~df_1['Frame'].diff().gt(3).cummax()]
print(df_1)
   Frame  Heat
0   4051    52
1   4052    51
2   4053    50
3   4054    52
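
To see why the one-liner stops at the first large jump, it may help to split it into its intermediate steps. The following is only a sketch of the same idea; the names step, too_big and stop are illustrative and not part of the original answer, and the result is stored in df_2 as the question asks:

```python
import pandas as pd

df_1 = pd.DataFrame({
    "Frame": [4051, 4052, 4053, 4054, 4060, 4061],
    "Heat": [52, 51, 50, 52, 53, 50],
})

# Increment of Frame relative to the previous row (the first row has no
# predecessor, so its value is NaN).
step = df_1["Frame"].diff()

# True where the increment exceeds 3; the NaN in the first row compares as False.
too_big = step.gt(3)

# The cumulative maximum keeps the mask True from the first True onward,
# so every row after the first large jump is excluded as well.
stop = too_big.cummax()

# Keep only the rows before the first large jump.
df_2 = df_1[~stop]
print(df_2)
```

Without cummax, only the row at 4060 would be dropped and the row at 4061 would come back, because its own increment is just 1.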