I am very new to pandas dataframes and I have a dataframe similar to this one:
value_1 value_2
1 2
9 6
2 5
7 2
2 5
I need to keep only the rows where value_1 is greater than 3 (for example), but also keep the first and last rows of the dataframe, like this:
value_1 value_2
1 2
9 6
7 2
2 5
I know that I can filter by index with iloc and by column value with loc, but how can I use them both together?
Thanks!
CodePudding user response:
Assuming a range index, you could use boolean indexing:
# is value_1 > 3?
m1 = df['value_1'].gt(3)
# is the index the first or last value?
m2 = df.index.isin([0, len(df)-1])
# keep if any condition above is True
out = df[m1|m2]
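For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the two masks. It assumes the default RangeIndex that pandas creates when no index is given, so the labels 0 and len(df)-1 really are the first and last rows.

```python
import pandas as pd

# Sample data copied from the question (default RangeIndex 0..4)
df = pd.DataFrame({'value_1': [1, 9, 2, 7, 2],
                   'value_2': [2, 6, 5, 2, 5]})

# True where value_1 is greater than 3
m1 = df['value_1'].gt(3)
# True where the index label is the first or last label of the range
m2 = df.index.isin([0, len(df) - 1])

# Keep a row if either condition holds
out = df[m1 | m2]
print(out)
```

The `|` operator combines the masks element-wise, so a row is kept when at least one of the two conditions is True.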
If you don't have a range index, or if the index contains duplicates, build the mask by position instead:
import numpy as np
m2 = np.r_[True, [False]*(len(df)-2), True]
output:
   value_1  value_2
0        1        2
1        9        6
3        7        2
4        2        5
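As a rough illustration of the non-range-index case, the sketch below uses a made-up frame (`df2`, with duplicated string labels, which is not from the question) and builds the positional mask with `np.zeros` as an alternative to `np.r_`. It assumes the dataframe is not empty.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: same values as the question, but with duplicated,
# non-numeric index labels, so label-based isin([0, len-1]) would not work
df2 = pd.DataFrame({'value_1': [1, 9, 2, 7, 2],
                    'value_2': [2, 6, 5, 2, 5]},
                   index=['a', 'b', 'a', 'c', 'b'])

# True where value_1 is greater than 3 (as a plain NumPy array)
m1 = df2['value_1'].gt(3).to_numpy()

# Positional mask: all False, then True at the first and last position
m2 = np.zeros(len(df2), dtype=bool)
m2[[0, -1]] = True

# Boolean arrays are applied by position, so the labels do not matter
out2 = df2[m1 | m2]
print(out2)
```

Because both masks are plain arrays here, the selection is purely positional and does not depend on what the index labels are.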