Data:
import pandas as pd

df = pd.DataFrame({'name': ['Jane', 'Jane', 'Mike', 'Mike', 'Jane', 'Jane', 'Jane',
                            'Mike', 'Mike', 'Jane', 'Jane', 'Jane'],
                   'ctg': ['A', 'P', 'C', 'B', 'B', 'C', 'B', 'E', 'G', 'L', 'M', 'X']})
expected output:
name | ctg |
---|---|
Jane | A |
Jane | B |
Jane | L |
I am new to Python and I want to make a new DataFrame that includes only the first row of each consecutive run of 'Jane' rows. Could anyone please help me?
CodePudding user response:
You can use GroupBy.first on a custom grouper that labels each run of consecutive identical names, after filtering the rows with a mask:
mask = df['name'].eq('Jane')
out = (df[mask]  # keep only Jane
       # group by consecutive names
       .groupby(df['name'].ne(df['name'].shift()).cumsum(), as_index=False)
       .first()  # first row of each group
      )
Output:
   name ctg
0  Jane   A
1  Jane   B
2  Jane   L
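If it helps to see what the grouper is doing, here is a minimal sketch of the same idea without a groupby. The names starts_run, run_id and first_jane are just illustrative, not from the answer above. It marks the rows where the name changes from the previous row, and keeps the ones that are 'Jane'. One difference worth knowing: GroupBy.first takes the first non-null value of each column, while this keeps whole rows as they are. On this data, which has no missing values, the two should agree.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Jane', 'Jane', 'Mike', 'Mike', 'Jane', 'Jane', 'Jane',
             'Mike', 'Mike', 'Jane', 'Jane', 'Jane'],
    'ctg': ['A', 'P', 'C', 'B', 'B', 'C', 'B', 'E', 'G', 'L', 'M', 'X'],
})

# True wherever the name differs from the previous row, i.e. a new run starts
starts_run = df['name'].ne(df['name'].shift())

# Cumulative sum of the run starts gives one id per consecutive run
# (this is the grouper used in the groupby version)
run_id = starts_run.cumsum()

# Keep rows that both start a run and are named Jane, then renumber the index
first_jane = df[starts_run & df['name'].eq('Jane')].reset_index(drop=True)
print(first_jane)
```

Dropping reset_index(drop=True) keeps the original row labels (0, 4 and 9 here), which can be handy if you need to trace the rows back to df.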