I have a dataframe of football matches, like so:
team_id adversary_id round_id xG
262 263 1 0.45
263 262 1 0.34
245 254 1 0.67
254 245 1 0.15
...
How do I get rid of the repeated fixtures and reshape the dataframe into this wider format:
team_id adversary_id round_id xG_team xG_adversary
262 263 1 0.45 0.34
245 254 1 0.67 0.15
...
CodePudding user response:
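Try numpy sort, then merge. Sorting each id pair gives both rows of a fixture the same key; the merge then joins the even-positioned rows with the odd-positioned ones, so it assumes the two rows of every fixture sit next to each other, as in your sample: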
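import numpy as np
# Sort each id pair so both rows of a fixture share the same key (this overwrites the id columns in df)
df[['team_id', 'adversary_id']] = np.sort(df[['team_id', 'adversary_id']].values, axis=1)
out = df.iloc[0::2].merge(df.iloc[1::2], on=['team_id', 'adversary_id', 'round_id'], suffixes=('_team', '_adversary'))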
Out[403]:
team_id adversary_id round_id xG_team xG_adversary
0 262 263 1 0.45 0.34
1 245 254 1 0.67 0.15
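One thing to watch with the sort-based approach: only the ids are sorted, not the xG values, so if the first row of a pair has the higher id, xG_team ends up holding the other team's number. If your real data is not neatly ordered, a self-merge against a mirrored copy may be safer. Below is a minimal sketch, not the only way to do it. The names mirror and wide are just illustrative, the sample frame is copied from the question, and it assumes each pair of teams meets at most once per round. It keeps the lower id as team_id, which matches your sample but is an arbitrary choice.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "team_id": [262, 263, 245, 254],
    "adversary_id": [263, 262, 254, 245],
    "round_id": [1, 1, 1, 1],
    "xG": [0.45, 0.34, 0.67, 0.15],
})

# Mirrored copy: team and adversary swapped, so each row describes
# the same fixture from the opponent's point of view
mirror = df.rename(columns={"team_id": "adversary_id", "adversary_id": "team_id"})

# Join every row with its mirrored counterpart; row order does not matter
wide = df.merge(
    mirror,
    on=["team_id", "adversary_id", "round_id"],
    suffixes=("_team", "_adversary"),
)

# Every fixture now appears twice; keep the copy where team_id is the lower id
out = wide[wide["team_id"] < wide["adversary_id"]].reset_index(drop=True)
print(out)
```

Because the join is on the ids themselves, xG_team always belongs to the team in team_id, regardless of how the rows were ordered in the input.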