Home > other >  Remove groups from dataframe where a min value within that group is not below a threshold
Remove groups from dataframe where a min value within that group is not below a threshold

Time:12-20

The dataframe looks like this:

id1  id2  value
1    1    35
1    1    23
1    1    20
1    2    5
1    2    50
2    1    42
2    1    3
2    1    12
2    2    64
2    3    34
2    3    1

I want to group them by id1 and id2, and remove all rows of a group if the minimum value of that group is not less than 10. So the result would look like this:

id1  id2  value
1    2    5
1    2    50
2    1    3
2    1    12
2    3    34
2    3    1

I have tried this:

dfmin = df.groupby(["id1", "id2"])["value"].min().reset_index()

df = df[
    dfmin.loc[
        (dfmin["id1"] == df["id1"]) & (dfmin["id1"] == df["id1"]),
        "value",
    ].iat[0]
    < 10
]

But I get the error Can only compare identically-labeled Series objects.

What am I doing wrong and is there a better way?

CodePudding user response:

use groupby filter

out = df.groupby(['id1', 'id2']).filter(lambda x: x['value'].min() < 10)

out

   id1  id2 value
3   1   2   5
4   1   2   50
5   2   1   42
6   2   1   3
7   2   1   12
9   2   3   34
10  2   3   1
  • Related