I have the following dataframe, and I need to obtain only the values from Column 2 that go from 21 to 1, without losing the sorting from Column 1
Column1 | Column2 | Value |
---|---|---|
44508 | 20 | A |
44508 | 20 | B |
44509 | 21 | B |
44510 | 22 | C |
44511 | 1 | A |
44511 | 1 | B |
44512 | 2 | A |
This was my initial approach:
df= df.sort_values(by='Column1')
df = df.loc[(df['Column2'] >= 21) & (df['Column2'] <= 1)]
But it obviusly didn't work as there won't be any value higher than 21 and lower than 1 And I am trying to obtain this:
Column1 | Column2 | Value |
---|---|---|
44509 | 21 | B |
44510 | 22 | C |
44511 | 1 | A |
I also thought about first sorting the whole dataframe, and then choose from the row where it is first found the 21 to the last time it is found the 1, but I could not anything about this approach, and I don't know if there might be a simpler solution to it.
PD: The Column 2 is a value generated based on column 1; I understand that a solution would be to just loc by column1 but I need it to be based on Column 2
CodePudding user response:
Change your condition
df = df.loc[~((dfmerged['Column2'] < 21) & (df['Column2'] > 1))]
CodePudding user response:
You could use loc
to slice the DataFrame with idxmax
to get the index of the first occurrence of 1:
out = df.loc[df['Column2'].eq(21).idxmax() : df['Column2'].eq(1).idxmax()]
Output:
Column1 Column2 Value
2 44509 21 B
3 44510 22 C
4 44511 1 A