I am trying to determine the result of a football match based on the goals scored. If the number of goals scored and goals received are equal, the expected output should be a draw. If the number of goals scored is higher than the goals received, the expected output should be a win. If the number of goals scored is lower than the goals received, the output should be lost.
Football_data_match['result'] = if(Football_data_match['goal_scored'] > Football_data_match['goal_against']:
Football_data_match['result'] = 'win'
elif (Football_data_match['goal_scored'<Football_data_match['goal_against']:
Football_data_match['result'] 'lost'
else:
Football_data_match['result'] = 'draw')
The code above gives a syntax error, but I'm not able to pinpoint the exact mistake. Could somebody help me fix this problem?
CodePudding user response:
One way is to use np.select:
import numpy as np
import pandas as pd
# Example data
df = pd.DataFrame({
"goal_scored": np.random.randint(4, size=12),
"goal_against": np.random.randint(4, size=12)
})
df["result"] = np.select(
[
df["goal_scored"] < df["goal_against"],
df["goal_scored"] == df["goal_against"],
df["goal_scored"] > df["goal_against"]
], ["lost", "draw", "win"]
)
df:
goal_scored goal_against result
0 1 3 lost
1 0 1 lost
2 0 3 lost
3 3 2 win
4 1 3 lost
5 2 0 win
6 2 2 draw
7 2 2 draw
8 3 1 win
9 0 2 lost
10 2 3 lost
11 1 1 draw
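As a variation on the np.select approach above, here is a minimal sketch that lists only two conditions and passes the remaining outcome through the `default` argument. The small fixed DataFrame is made up for illustration. Passing a string default also avoids relying on the integer default of 0 alongside string choices, which, as far as I know, some newer NumPy versions reject with a TypeError.

```python
import numpy as np
import pandas as pd

# Small fixed example (made-up data) so the output is reproducible
df = pd.DataFrame({
    "goal_scored":  [1, 3, 2, 0],
    "goal_against": [3, 2, 2, 0],
})

# Only two conditions are listed; rows matching neither fall back to `default`
df["result"] = np.select(
    [
        df["goal_scored"] > df["goal_against"],
        df["goal_scored"] < df["goal_against"],
    ],
    ["win", "lost"],
    default="draw",
)

print(df["result"].tolist())  # ['lost', 'win', 'draw', 'draw']
```

Conditions are checked in order and the first match wins, so the order only matters when conditions overlap, which they do not here.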
CodePudding user response:
You can also use DataFrame.apply:
import pandas as pd
import numpy as np
import itertools
teams = ['Arizona Cardinals', 'Atlanta Falcons', 'Baltimore Ravens', 'Buffalo Bills', 'Carolina Panthers', 'Chicago Bears']
k = pd.DataFrame(np.random.randint(20,high=30, size=(15,2)), index=itertools.combinations(teams, 2), columns=['goal_scored', 'goal_against'])
k['result'] = k.apply(lambda row: 'win' if row['goal_scored'] > row['goal_against'] else ('lost' if row['goal_scored'] < row['goal_against'] else 'draw'), axis=1)
k
The result is:
goal_scored goal_against result
(Arizona Cardinals, Atlanta Falcons) 29 29 draw
(Arizona Cardinals, Baltimore Ravens) 20 26 lost
(Arizona Cardinals, Buffalo Bills) 21 24 lost
(Arizona Cardinals, Carolina Panthers) 20 25 lost
(Arizona Cardinals, Chicago Bears) 27 28 lost
(Atlanta Falcons, Baltimore Ravens) 26 24 win
(Atlanta Falcons, Buffalo Bills) 20 21 lost
(Atlanta Falcons, Carolina Panthers) 22 25 lost
(Atlanta Falcons, Chicago Bears) 26 22 win
(Baltimore Ravens, Buffalo Bills) 23 21 win
(Baltimore Ravens, Carolina Panthers) 29 22 win
(Baltimore Ravens, Chicago Bears) 21 27 lost
(Buffalo Bills, Carolina Panthers) 24 21 win
(Buffalo Bills, Chicago Bears) 28 26 win
(Carolina Panthers, Chicago Bears) 24 22 win
Your problem is that you need to think vectorized when using pandas. Your if...else... operates on scalars, whereas Football_data_match is a whole DataFrame. You need to start with the DataFrame or numpy.ndarray.
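To illustrate the scalar versus vectorized point, here is a rough sketch with made-up data. It first shows that, even with the syntax corrected, a plain `if` on a column comparison fails because the comparison yields one boolean per row. It then assigns the result with boolean masks via `.loc`, which is one possible vectorized alternative and not the only one.

```python
import pandas as pd

# Made-up example data for illustration
df = pd.DataFrame({
    "goal_scored":  [2, 0, 1],
    "goal_against": [1, 3, 1],
})

# A plain `if` needs a single True/False, but this comparison is a boolean Series
try:
    if df["goal_scored"] > df["goal_against"]:
        pass
except ValueError as err:
    error_message = str(err)
    print(error_message)  # pandas complains that the truth value is ambiguous

# Vectorized alternative: start with a default, then overwrite rows via boolean masks
df["result"] = "draw"
df.loc[df["goal_scored"] > df["goal_against"], "result"] = "win"
df.loc[df["goal_scored"] < df["goal_against"], "result"] = "lost"

print(df["result"].tolist())  # ['win', 'lost', 'draw']
```

Each `.loc` assignment touches only the rows where its mask is True, so every row ends up with exactly one of the three labels.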