Create an adjacency matrix with football players

Time:12-21

I have two dataframes.

The first is called player and contains the names of football players:

import pandas as pd

player = ["David Gonzalez","Agustin Martinez","Jibrail Al-Hindi","Edward Cahill","Simon Becker","Paolo Imperiali","Amir Bahari","Guilherme Souza"]

player = pd.DataFrame(player)

I have another dataframe called football

id scorer
1 David Gonzalez, Edward Cahill
2 Agustin Martinez,Brian McNamara
3 Agustin Martinez, Jibrail Al-Hindi
4 Edward Cahill,Guilherme Souza
5 Paolo Imperiali, Yannick Wagner
6 Simon Becker,Amir Bahari
7 Paolo Imperiali,Yannick Wagner
8 Amir Bahari,Guilherme Souza,David Gonzalez
9 Edward Cahill,Amir Bahari
10 Simon Becker
11 Amir Bahari
12 Paolo Imperiali,Simon Becker
13 Edward Cahill,Guilherme Souza
14 Edward Cahill,Amir Bahari
15 Simon Becker
16 Simon Becker

The second dataframe, football, shows which players scored in which game.

Now I would like to create an adjacency matrix whose rows and columns are all the players from the dataframe player, with 1 if there is a game id where both players scored, and 0 if there is no game in which both scored.

I did this.

np.zeros((player,scorer)

But I think I am on the wrong path, because I want a matrix whose rows and columns are labeled with the names in player and whose entries are 1 or 0.

Answer:

You can split/explode and join the players for a crosstab:

s = football['scorer'].str.split(r',\s*').explode().loc[lambda s: s.isin(player[0])]
df2 = s.rename('row').to_frame().join(s.rename('col'))

out = pd.crosstab(df2['row'], df2['col']).rename_axis(index=None, columns=None)

NB. This gives the number of games in which both players scored (the diagonal is each player's own count of scoring games). If you just want 0/1, add .clip(upper=1).

Output:

                  Agustin Martinez  Amir Bahari  David Gonzalez  Edward Cahill  Guilherme Souza  Jibrail Al-Hindi  Paolo Imperiali  Simon Becker
Agustin Martinez                 2            0               0              0                0                 1                0             0
Amir Bahari                      0            5               1              2                1                 0                0             1
David Gonzalez                   0            1               2              1                1                 0                0             0
Edward Cahill                    0            2               1              5                2                 0                0             0
Guilherme Souza                  0            1               1              2                3                 0                0             0
Jibrail Al-Hindi                 1            0               0              0                0                 1                0             0
Paolo Imperiali                  0            0               0              0                0                 0                3             1
Simon Becker                     0            1               0              0                0                 0                1             5
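For the 0/1 matrix the question asks for, the steps above can be put together as a self-contained sketch. It rebuilds the two dataframes from the question; the names counts, adjacency and no_self are illustrative only, and the reindex step (to keep every listed player even if they never scored) and the zeroed diagonal are optional additions, not part of the answer above.

```python
import numpy as np
import pandas as pd

player = pd.DataFrame(["David Gonzalez", "Agustin Martinez", "Jibrail Al-Hindi",
                       "Edward Cahill", "Simon Becker", "Paolo Imperiali",
                       "Amir Bahari", "Guilherme Souza"])

football = pd.DataFrame({
    "id": range(1, 17),
    "scorer": [
        "David Gonzalez, Edward Cahill", "Agustin Martinez,Brian McNamara",
        "Agustin Martinez, Jibrail Al-Hindi", "Edward Cahill,Guilherme Souza",
        "Paolo Imperiali, Yannick Wagner", "Simon Becker,Amir Bahari",
        "Paolo Imperiali,Yannick Wagner", "Amir Bahari,Guilherme Souza,David Gonzalez",
        "Edward Cahill,Amir Bahari", "Simon Becker", "Amir Bahari",
        "Paolo Imperiali,Simon Becker", "Edward Cahill,Guilherme Souza",
        "Edward Cahill,Amir Bahari", "Simon Becker", "Simon Becker",
    ],
})

# One row per (game, scorer); keep only names that appear in `player`.
s = (football["scorer"].str.split(r",\s*").explode().str.strip()
     .loc[lambda x: x.isin(player[0])])

# Joining the series with itself on the game index yields every
# (scorer, scorer) pair within the same game.
pairs = s.rename("row").to_frame().join(s.rename("col"))

# Count of shared games per pair of players.
counts = pd.crosstab(pairs["row"], pairs["col"]).rename_axis(index=None, columns=None)

# Assumption: include every listed player, in the order of `player`,
# then cap the counts at 1 to get a 0/1 adjacency matrix.
names = player[0].tolist()
adjacency = counts.reindex(index=names, columns=names, fill_value=0).clip(upper=1)

# Optional: zero the diagonal so a player is not adjacent to themselves.
arr = adjacency.to_numpy().copy()
np.fill_diagonal(arr, 0)
no_self = pd.DataFrame(arr, index=names, columns=names)

print(adjacency)
```

Whether the diagonal should be 1 or 0 depends on how the matrix is used afterwards; graph libraries would read a 1 on the diagonal as a self-loop, so no_self may be the safer input there.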