I was trying to work with dataframe that looks like
home | away | home_score | away_score |
---|---|---|---|
Tampa Bay | Colorado | 3 | 1 |
San Jose | Colombus | 1 | 3 |
New England | San Jose | 1 | 5 |
Colorado | Tampa Bay | 2 | 0 |
New England | KC Wizards | 2 | 1 |
My goal is to compare 'home_score' with 'away_score' and choose the string from 'home' or 'away' to store that value in to separate column based on which score was lower.
For example, for the first row, as away_score is 1 I should be able to add "Colorado" to a separate column.
Desired outcome:
home | away | home_score | away_score | lost_team |
---|---|---|---|---|
Tampa Bay | Colorado | 3 | 1 | Colorado |
I tried to search for the task but I was not successful in finding methods. I would really appreciate the help!
CodePudding user response:
You can use np.where
df['lost_team'] = np.where(df['home_score'] < df['away_score'], df['home'], df['away'])
print(df)
# Output
home away home_score away_score lost_team
0 Tampa Bay Colorado 3 1 Colorado
1 San Jose Colombus 1 3 San Jose
2 New England San Jose 1 5 New England
3 Colorado Tampa Bay 2 0 Tampa Bay
4 New England KC Wizards 2 1 KC Wizards
If a draw is possible, use np.select
:
conds = [df['home_score'] < df['away_score'],
df['home_score'] > df['away_score']]
choices = [df['home'], df['away']]
draw = df[['home', 'away']].agg(list, axis=1)
df['lost_team'] = np.select(condlist=conds, choicelist=choices, default=draw).explode()
df = df.explode('lost_team')
print(df)
# Output
home away home_score away_score lost_team
0 Tampa Bay Colorado 3 1 Colorado
1 San Jose Colombus 1 3 San Jose
2 New England San Jose 1 5 New England
3 Colorado Tampa Bay 2 0 Tampa Bay
4 New England KC Wizards 2 1 KC Wizards
5 Team A Team B 0 0 Team A # Row 1
5 Team A Team B 0 0 Team B # Row 2
CodePudding user response:
You can pandas.DataFrame.apply
with axis=1
to check the condition on each row and save the result:
df['lost_team'] = df.apply(lambda row:
'Equal' if row['home_score'] == row['away_score'] else (
row['away'] if row['home_score'] > row['away_score'] else row['home']),
axis=1)
print(df)
home away home_score away_score lost_team
0 Tampa Bay Colorado 3 1 Colorado
1 San Jose Columbus 1 3 San Jose
2 New England San Jose 1 5 New England
3 Colorado Tampa Bay 2 0 Tampa Bay
4 New England KC Wizards 2 1 KC Wizards
5 Team A Team B 1 1 Equal