I have a dataframe as shown below
name skill score percentage
messi attack 160 80
messi fitness 10 5
messi pass 30 15
neymar attack 48 60
neymar fitness 20 25
neymar pass 12 15
ronaldo attack 60 60
ronaldo fitness 30 30
ronaldo pass 10 10
casilas attack 10 25
casilas fitness 20 50
casilas pass 10 25
from the above dataframe I would like to filter name
where attack
score
more than 50
and attack
percentage
more than 50
.
Eexpected output:
name skill score percentage
messi attack 160 80
messi fitness 10 5
messi pass 30 15
ronaldo attack 60 60
ronaldo fitness 30 30
ronaldo pass 10 10
CodePudding user response:
One way to do it
import io
str_data="""
name,skill,score,percentage
messi,attack,160,80
messi,fitness,10,5
messi,pass,30,15
neymar,attack,48,60
neymar,fitness,20,25
neymar,pass,12,15
ronaldo,attack,60,60
ronaldo,fitness,30,30
ronaldo,pass,10,10
casilas,attack,10,25
casilas,fitness,20,50
casilas,pass,10,25
"""
df = pd.read_csv(io.StringIO(str_data))
def filt_player(player_df):
player_df = player_df.set_index('skill')
filters = (
player_df.loc['attack','score'] > 50,
player_df.loc['attack','percentage'] > 50,
)
return all(filters)
filt_df = df.groupby('name').filter(filt_player)
filt_df
CodePudding user response:
You don't need groupby, you can use a boolean mask
mask = df['score'].gt(50) & df['percentage'].gt(50)
out = df[df['name'].isin(df.loc[mask, 'name'])]
print(out)
name skill score percentage
0 messi attack 160 80
1 messi fitness 10 5
2 messi pass 30 15
6 ronaldo attack 60 60
7 ronaldo fitness 30 30
8 ronaldo pass 10 10