The dataframe looks like this:
id1 id2 value
1 1 35
1 1 23
1 1 20
1 2 5
1 2 50
2 1 42
2 1 3
2 1 12
2 2 64
2 3 34
2 3 1
I want to group them by id1 and id2, and remove all rows of a group if the minimum value of that group is not less than 10. So the result would look like this:
id1 id2 value
1 2 5
1 2 50
2 1 3
2 1 12
2 3 34
2 3 1
I have tried this:
dfmin = df.groupby(["id1", "id2"])["value"].min().reset_index()
df = df[
dfmin.loc[
(dfmin["id1"] == df["id1"]) & (dfmin["id1"] == df["id1"]),
"value",
].iat[0]
< 10
]
But I get the error Can only compare identically-labeled Series objects
.
What am I doing wrong and is there a better way?
CodePudding user response:
use groupby filter
out = df.groupby(['id1', 'id2']).filter(lambda x: x['value'].min() < 10)
out
id1 id2 value
3 1 2 5
4 1 2 50
5 2 1 42
6 2 1 3
7 2 1 12
9 2 3 34
10 2 3 1