I have a pandas DataFrame df as follows:
siren ratio
1 20
2 25
1 40
3 16
3 19
4 35
My goal is to get a df2 that keeps only the rows of each siren whose ratio is above 30 at least once, as follows:
siren ratio
1 20
1 40
4 35
Today, I do it in two steps:
First, I use a filter to get all the unique siren values with a ratio above 30:
value_30 = df[df["ratio"] > 30]["siren"].unique()
Then, I use value_30 as a list to filter my df and get my df2.
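The second step is not shown here. A minimal sketch of both steps together, assuming the second filter is done with Series.isin (the sample data is rebuilt from the table above), might look like this:

```python
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({"siren": [1, 2, 1, 3, 3, 4],
                   "ratio": [20, 25, 40, 16, 19, 35]})

# Step 1: unique siren values that have at least one ratio above 30
value_30 = df[df["ratio"] > 30]["siren"].unique()

# Step 2 (assumed): keep every row whose siren is in that set
df2 = df[df["siren"].isin(value_30)]
print(df2)
```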
However, I'm not satisfied with this solution, and I think there is a more pythonic way to do this. Any ideas?
CodePudding user response:
Try with groupby and transform:
value_30 = df[df.groupby("siren")["ratio"].transform("max")>30]
>>> value_30
siren ratio
0 1 20
2 1 40
5 4 35
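To see why this works, it may help to look at the intermediate Series. The sketch below rebuilds the sample data from the question; the name group_max is only illustrative. transform("max") returns one value per original row (the maximum ratio of that row's siren), so it lines up with df and can be compared against 30 directly:

```python
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({"siren": [1, 2, 1, 3, 3, 4],
                   "ratio": [20, 25, 40, 16, 19, 35]})

# One value per row: the maximum ratio within that row's siren group
group_max = df.groupby("siren")["ratio"].transform("max")
print(group_max.tolist())  # [40, 25, 40, 19, 19, 35]

# Rows whose group maximum exceeds 30 are kept
res = df[group_max > 30]
print(res)
```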
CodePudding user response:
Use groupby.filter:
res = df.groupby(df.siren).filter(lambda x: x["ratio"].max() > 30)
print(res)
Output:
siren ratio
0 1 20
2 1 40
5 4 35
CodePudding user response:
df[df['ratio'].gt(30).groupby(df['siren']).transform('max')]
siren ratio
0 1 20
2 1 40
5 4 35
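Broken into steps, the one-liner above can be read as follows. This is only a sketch on the sample data from the question, and the names above_30 and mask are illustrative. The comparison is done first, and the maximum of a group of booleans is True as soon as one of them is True:

```python
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({"siren": [1, 2, 1, 3, 3, 4],
                   "ratio": [20, 25, 40, 16, 19, 35]})

# Row-level test: is this particular ratio above 30?
above_30 = df["ratio"].gt(30)

# Group-level test, broadcast back to every row: does this row's
# siren have at least one ratio above 30?
mask = above_30.groupby(df["siren"]).transform("max")

res = df[mask]
print(res)
```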