Home > Back-end >  How to filter with conditions to add to new column
How to filter with conditions to add to new column

Time:01-30

I was trying to work with dataframe that looks like

home away home_score away_score
Tampa Bay Colorado 3 1
San Jose Colombus 1 3
New England San Jose 1 5
Colorado Tampa Bay 2 0
New England KC Wizards 2 1

My goal is to compare 'home_score' with 'away_score' and choose the string from 'home' or 'away' to store that value in to separate column based on which score was lower.

For example, for the first row, as away_score is 1 I should be able to add "Colorado" to a separate column.

Desired outcome:

home away home_score away_score lost_team
Tampa Bay Colorado 3 1 Colorado

I tried to search for the task but I was not successful in finding methods. I would really appreciate the help!

CodePudding user response:

You can use np.where

df['lost_team'] = np.where(df['home_score'] < df['away_score'], df['home'], df['away'])
print(df)

# Output
          home        away  home_score  away_score    lost_team
0    Tampa Bay    Colorado           3           1     Colorado
1     San Jose    Colombus           1           3     San Jose
2  New England    San Jose           1           5  New England
3     Colorado   Tampa Bay           2           0    Tampa Bay
4  New England  KC Wizards           2           1   KC Wizards

If a draw is possible, use np.select:

conds = [df['home_score'] < df['away_score'],
         df['home_score'] > df['away_score']]
choices = [df['home'], df['away']]
draw = df[['home', 'away']].agg(list, axis=1)

df['lost_team'] = np.select(condlist=conds, choicelist=choices, default=draw).explode()
df = df.explode('lost_team')
print(df)

# Output
          home        away  home_score  away_score    lost_team
0    Tampa Bay    Colorado           3           1     Colorado
1     San Jose    Colombus           1           3     San Jose
2  New England    San Jose           1           5  New England
3     Colorado   Tampa Bay           2           0    Tampa Bay
4  New England  KC Wizards           2           1   KC Wizards
5       Team A      Team B           0           0       Team A  # Row 1
5       Team A      Team B           0           0       Team B  # Row 2

CodePudding user response:

You can pandas.DataFrame.apply with axis=1 to check the condition on each row and save the result:

df['lost_team'] = df.apply(lambda row: 
                           'Equal' if row['home_score'] == row['away_score'] else (
                           row['away'] if row['home_score'] > row['away_score'] else row['home']), 
                           axis=1)
print(df)

          home        away  home_score  away_score    lost_team
0    Tampa Bay    Colorado           3           1     Colorado
1     San Jose    Columbus           1           3     San Jose
2  New England    San Jose           1           5  New England
3     Colorado   Tampa Bay           2           0    Tampa Bay
4  New England  KC Wizards           2           1   KC Wizards
5       Team A      Team B           1           1        Equal
  • Related