import pandas as pd

frame = pd.Series([4051, 4052, 4053, 4054, 4060, 4061])
heat = pd.Series([52, 51, 50, 52, 53, 50])
df_1 = pd.DataFrame()
df_1['Frame'] = frame
df_1['Heat'] = heat
I have a dataframe df_1. I want to retrieve a dataframe df_2 that contains only the rows of df_1 for which the increment of Frame from one row to the next is less than or equal to 3. As soon as the increment is larger, the search should stop.
I tried this:
i = 0
df_2 = pd.DataFrame()
for i in df_1['Frame']:
    j = i + 1
    if (df_1['Frame'][j] - df_1['Frame'][i]) > 3:
        break
    else:
        df_2.append(i)
It results in an error. Can you find my mistake? If possible, I would prefer a solution without a loop since loops tend to be slow.
My desired output would be:
frame = pd.Series([4051, 4052, 4053, 4054])
heat = pd.Series([52, 51, 50, 52])
df_1 = pd.DataFrame()
df_1['Frame'] = frame
df_1['Heat'] = heat
CodePudding user response:
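Use Series.diff to get the increment from one row to the next, compare it with Series.gt to flag increments greater than 3, and apply Series.cummax so that every row from the first flagged one onward is True. Then invert the mask with ~ (bitwise NOT) and use it for filtering with boolean indexing: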
df_1 = df_1[~df_1['Frame'].diff().gt(3).cummax()]
print(df_1)
   Frame  Heat
0   4051    52
1   4052    51
2   4053    50
3   4054    52
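To make the one-liner easier to follow, here is a minimal sketch that splits it into its intermediate steps. The variable names step, too_big and stop are only illustrative and do not appear in the answer above, and the result is stored in df_2 (as the question asks) rather than overwriting df_1.

```python
import pandas as pd

df_1 = pd.DataFrame({
    "Frame": [4051, 4052, 4053, 4054, 4060, 4061],
    "Heat": [52, 51, 50, 52, 53, 50],
})

# Step 1: difference to the previous row (the first row has no predecessor, so it is NaN).
step = df_1["Frame"].diff()

# Step 2: True where the increment exceeds 3 (the NaN in the first row compares as False).
too_big = step.gt(3)

# Step 3: cumulative maximum, so the mask stays True from the first large jump onward.
stop = too_big.cummax()

# Step 4: invert the mask and keep only the rows before the first large jump.
df_2 = df_1[~stop]
print(df_2)
```

Without cummax, the last row (4061) would be kept again, because its own increment from 4060 is only 1. As for the loop in the question: for i in df_1['Frame'] yields the frame values (4051, 4052, ...) rather than row positions, so df_1['Frame'][j] looks up an index label that does not exist.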