I have a pandas DataFrame that contains information about FC Barcelona soccer matches and their results. df['Match Result'] contains values such as '2:0'. I have already split it into two columns, one for each team's goals, named ['Left'] and ['Right'], both with int values.
Now, if Barcelona is playing away, the team's goal count is in ['Right']. Otherwise (playing at home), it is in ['Left']. The new int column df['Barcelona_result'] must contain the int values from ['Left'] or ['Right'], depending on whether the ['Location'] value (a string) is 'Home' or 'Away'.
So, I tried something like this:
df['Barcelona_result'] = 0
df['Barcelona_result'] = df['Barcelona_result'].astype('int')
for i in df['Location']:
    if i == "Home":
        df.Barcelona_result = df.Left
    else:
        df.Barcelona_result = df.Right
    break
The home results are OK, but the away results are not: it always takes the int values from ['Left']. Any advice would be appreciated. Thanks in advance!
CodePudding user response:
I would probably do it like this:
#split the result
res = df['Match Result'].astype(str).str.split(":")
#default Home
df['Barcelona_result'] = res.str[0]
#mask with second split if Location not "Home"
df['Barcelona_result'] = df['Barcelona_result'].mask(df['Location'] != "Home", res.str[1])
#convert to int
df['Barcelona_result'] = df['Barcelona_result'].astype(int)
This code assumes that every row has a well-formed result such as '2:0'. If some results are missing or malformed, it may raise an error, most likely at the final conversion to int.
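As for why the loop in the question does not work: as far as I can tell, each assignment such as df.Barcelona_result = df.Left overwrites the entire column rather than a single row, and the break ends the loop after the first iteration, so the whole column ends up copied from whichever side the first row selects. If you already have the int columns ['Left'] and ['Right'], a row-wise selection with numpy.where is one alternative. This is only a minimal sketch; the sample data below is made up for illustration, and only the column names come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Location": ["Home", "Away", "Home", "Away"],
    "Left": [2, 1, 3, 0],
    "Right": [0, 4, 1, 2],
})

# Row by row: take Left where Location is "Home", otherwise take Right.
df["Barcelona_result"] = np.where(df["Location"] == "Home", df["Left"], df["Right"])

print(df["Barcelona_result"].tolist())  # [2, 4, 3, 2]
```

Because the condition is evaluated for the whole column at once, no explicit loop is needed, and the result stays an integer column as long as ['Left'] and ['Right'] are both int.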