This is my dataframe:
import pandas as pd
df = pd.DataFrame({'a': [1, 0, 1, 0, 1], 'b': [0, 0, 0, 1, 0]})
I want to select the rows of df where the value in a is greater than the value in b in the previous row, so I used shift(1):
df_shift = df.loc[df.a > df.b.shift(1)]
This gives me only row 2, but I want to always keep the first row, since there is no previous row to compare row 0 against.
This is the result that I want:
   a  b
0  1  0
2  1  0
I have read these two posts: post_1, post_2. I have also tried the following code, which gives me the result that I want, but only by using concat:
df_concat = pd.concat([df.iloc[[0]], df_shift], axis=0)
Is there another way to do it?
CodePudding user response:
Here is one way to do it: also select a row when the previous b value is null, which in this data happens only in the first row, where shift(1) has nothing to shift in.
df.loc[(df.a > df.b.shift(1)) | df.b.shift(1).isnull()]
   a  b
0  1  0
2  1  0
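Another option, as a rough sketch rather than the definitive way: build the boolean mask first and then force its first entry to True. The first row drops out of the plain comparison because shift(1) puts NaN there, and any comparison with NaN is False. The names prev_b, mask and result below are only illustrative, and the sketch assumes the dataframe has at least one row (mask.iloc[0] would raise on an empty one).

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 0, 1, 0, 1], 'b': [0, 0, 0, 1, 0]})

# b from the previous row; the first entry is NaN because nothing precedes row 0
prev_b = df['b'].shift(1)

# Comparing against NaN yields False, so row 0 is not selected at this point
mask = df['a'] > prev_b

# Always keep the first row, whatever its values are
mask.iloc[0] = True

result = df.loc[mask]
print(result)
```

Unlike the isnull() check, this keeps only the first row unconditionally. If b itself could contain missing values further down, the isnull() version would also select the rows that follow them, which may or may not be what you want.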