I have a dataframe like this:
I want to group by Field1 and, where a group has more than 2 rows, find the row with the maximum Field2 and set a new field to True for it.
I tried:
import pandas as pd
# read_csv returns a new DataFrame, so keep a reference to it
df = pd.read_csv("c:/test.csv")
df["Field3"] = df.groupby(["Field1"])["Field2"].transform("max")
But it didn't work: Field3 ends up holding each group's maximum Field2 rather than True/False.
CodePudding user response:
We have to do an additional transform to check which Field1 groups have a count greater than 2:
g = df.groupby("Field1")["Field2"]
df['Field3'] = g.transform('count').gt(2) & df['Field2'].eq(g.transform('max'))
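To make the two conditions easier to see, here is a minimal sketch that splits the expression into named masks. The sample rows are an assumption, chosen to match the result table shown in this answer, and the names big_enough and is_group_max are only illustrative.

```python
import pandas as pd

# Assumed sample data, matching the result table in this answer.
df = pd.DataFrame({
    "Field1": ["a", "a", "a", "b", "c", "b"],
    "Field2": [3, 5, 3, 2, 1, 6],
})

g = df.groupby("Field1")["Field2"]

# Group size broadcast back onto every row, then compared with 2.
big_enough = g.transform("count").gt(2)

# True where the row's Field2 equals the maximum of its group.
is_group_max = df["Field2"].eq(g.transform("max"))

# A row is flagged only when both conditions hold.
df["Field3"] = big_enough & is_group_max
print(df)
```

Note that "count" ignores missing values, so if Field2 can contain NaN and you want the raw group size, "size" may be the better choice.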
Alternatively, you can use a single transform with a lambda function that checks both conditions, but this might be slower than the first approach on larger dataframes:
df['Field3'] = df.groupby("Field1")["Field2"].transform(
    lambda s: (s == s.max()) & (len(s) > 2))
  Field1  Field2  Field3
0      a       3   False
1      a       5    True
2      a       3   False
3      b       2   False
4      c       1   False
5      b       6   False
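If you want to confirm that both approaches give the same flags, a rough check could look like the following. It assumes the sample rows from the table above; the variable names two_transforms and one_lambda are hypothetical.

```python
import pandas as pd

# Assumed sample data, taken from the table above.
df = pd.DataFrame({
    "Field1": ["a", "a", "a", "b", "c", "b"],
    "Field2": [3, 5, 3, 2, 1, 6],
})

g = df.groupby("Field1")["Field2"]

# First approach: two built-in transforms combined with a boolean AND.
two_transforms = g.transform("count").gt(2) & df["Field2"].eq(g.transform("max"))

# Second approach: one transform with a lambda evaluated per group.
# astype(bool) keeps the result boolean however transform infers the dtype.
one_lambda = g.transform(
    lambda s: (s == s.max()) & (len(s) > 2)
).astype(bool)

# Both Series should flag exactly the same rows.
print(two_transforms.equals(one_lambda))
```

Both versions flag every row that ties for the group maximum, so a group with a repeated maximum gets more than one True.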