Enrich Pandas dataframe with list data

Time:01-30

Example: I have the following Dataframe:

           0    1   2   3
Teams
Lakers     0   15  10  53
Warriors  92  100  36  45
Celtics   32   66  67  57
Rockets   96   18  15  98
Bucks     66   85   2  69
Nets      16   84  61  83
Clippers  59    8   1  54
Jazz      56   52  33  86
Heat      78   90  88  81
Suns      45   16  29   3
Bulls     52   57  77  18

To recreate that dataframe in your notebook, you can use this code:

import pandas as pd

data = {'Teams': {0: 'Lakers',
  1: 'Warriors',
  2: 'Celtics',
  3: 'Rockets',
  4: 'Bucks',
  5: 'Nets',
  6: 'Clippers',
  7: '76ers',
  8: 'Jazz',
  9: 'Heat',
  10: 'Suns',
  11: 'Bulls'},
 0: {0: 0, 1: 92, 2: 32, 3: 96, 4: 66, 5: 16, 6: 59, 7: 93, 8: 56, 9: 78, 10: 45, 11: 52},
 1: {0: 15, 1: 100, 2: 66, 3: 18, 4: 85, 5: 84, 6: 8, 7: 99, 8: 52, 9: 90, 10: 16, 11: 57},
 2: {0: 10, 1: 36, 2: 67, 3: 15, 4: 2, 5: 61, 6: 1, 7: 54, 8: 33, 9: 88, 10: 29, 11: 77},
 3: {0: 53, 1: 45, 2: 57, 3: 98, 4: 69, 5: 83, 6: 54, 7: 51, 8: 86, 9: 81, 10: 3, 11: 18}}

df = pd.DataFrame.from_dict(data)

And here are the lists for each NBA team, with some random letters:

Lakers = ['U', 'G', 'O', 'Q', 'A']
Warriors = ['X', 'P', 'E', 'S', 'O']
Celtics = ['U', 'T', 'F', 'H', 'Q']
Rockets = ['V', 'C', 'Z', 'T', 'G']
Bucks = ['M', 'P', 'V', 'C', 'O']
Nets = ['V', 'K', 'Q', 'D', 'M']
Clippers = ['U', 'B', 'C', 'Z', 'R']
Jazz = ['I', 'S', 'C', 'L', 'T']
Heat = ['M', 'A', 'A', 'Q', 'F']
Suns = ['Z', 'S', 'F', 'L', 'O']
Bulls = ['W', 'C', 'T', 'P', 'E']

My intention is to write code that creates column 4 and fills it with the last member of the list that corresponds to each team. So in column 4, 'Lakers' should receive 'A', Bucks 'O', Clippers 'R', etc.

code:

n = 0
for _ in range(len(df)):
    for t in team_list:
        if t == df.iloc[n, 0]:
            df['4'] = t[-1]
            n += 1

But it does not seem to work: all the fields in the new column 4 are filled with the same letter.

CodePudding user response:

You can create a dictionary of all the teams and use apply to look up the result for each row. Note that the team 76ers does not have a list, so the lookup needs a default.

transform = ({
"Lakers" : Lakers,
"Warriors" : Warriors,
"Celtics" : Celtics,
"Rockets" : Rockets,
"Bucks" : Bucks,
"Nets": Nets,
"Clippers" : Clippers,
"Jazz": Jazz,
"Heat": Heat,
"Suns": Suns,
"Bulls" :Bulls
})
df[4] = df.Teams.apply(lambda x: transform.get(x,[None])[-1])
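
For anyone who wants to try the idea in isolation, here is a minimal sketch on a cut-down frame. The three-row dataframe is a stand-in I made up from the question's data, not the full table. It shows why the default passed to get is a one-element list: the [-1] indexing then still works for a team that has no entry.

```python
import pandas as pd

# Cut-down stand-in for the question's data: three teams, one without a list.
df = pd.DataFrame({"Teams": ["Lakers", "76ers", "Bucks"]})

transform = {
    "Lakers": ["U", "G", "O", "Q", "A"],
    "Bucks": ["M", "P", "V", "C", "O"],
}

# The fallback is itself a list, so [-1] is valid even for unknown teams.
df[4] = df["Teams"].apply(lambda name: transform.get(name, [None])[-1])
print(df)
```

The 76ers row ends up with a missing value instead of raising a KeyError or an IndexError.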

CodePudding user response:

You can use a mapping dict:

# Look up each team's list among the global variables; fall back to ['-'] if none exists
dmap = {team: globals().get(team, ['-'])[-1] for team in df['Teams']}
df[4] = df['Teams'].map(dmap)
print(df)

# Output
       Teams   0    1   2   3  4
0     Lakers   0   15  10  53  A
1   Warriors  92  100  36  45  O
2    Celtics  32   66  67  57  Q
3    Rockets  96   18  15  98  G
4      Bucks  66   85   2  69  O
5       Nets  16   84  61  83  M
6   Clippers  59    8   1  54  R
7      76ers  93   99  54  51  -  # Not found in global variables
8       Jazz  56   52  33  86  T
9       Heat  78   90  88  81  F
10      Suns  45   16  29   3  O
11     Bulls  52   57  77  18  E
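
If you would rather not depend on globals(), a variant that keeps the lists in one dictionary could look like the sketch below. The names team_letters and last_letter are my own, and the three-row frame is only a stand-in for the real data. Series.map leaves teams that are missing from the dictionary as NaN, and fillna then supplies the placeholder.

```python
import pandas as pd

# Cut-down stand-in for the question's data.
df = pd.DataFrame({"Teams": ["Lakers", "76ers", "Suns"]})

# Hypothetical container: the per-team lists kept in one dict
# instead of separate module-level variables.
team_letters = {
    "Lakers": ["U", "G", "O", "Q", "A"],
    "Suns": ["Z", "S", "F", "L", "O"],
}

# Precompute the last letter once per team.
last_letter = {team: letters[-1] for team, letters in team_letters.items()}

# Teams without an entry become NaN after map; fillna replaces them with '-'.
df[4] = df["Teams"].map(last_letter).fillna("-")
print(df)
```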

CodePudding user response:

I think the issue with the code is that you are overwriting the whole df['4'] column in each iteration instead of updating the specific row. Try this instead:

team_list = [Lakers, Warriors, Celtics, Rockets, Bucks, Nets, Clippers, Jazz, Heat, Suns, Bulls]

for i, team in enumerate(team_list):
    df.at[i, '4'] = team[-1]

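This adds the last letter of each list to the row at the same position, so it only matches the right team when the dataframe rows are in the same order as team_list and every row has a list. In the sample data the 76ers row sits at index 7 and has no list, so from that row onward each team would receive the letter of the next team's list.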
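
To make the overwriting point concrete, here is a small sketch on a made-up three-row frame. The first part shows that assigning a single value to a column fills every row. The second part is a hypothetical variant of the loop that matches rows by team name instead of by position, so it does not rely on row order; named_lists is a name I introduced for the example.

```python
import pandas as pd

# Cut-down stand-in for the question's data.
df = pd.DataFrame({"Teams": ["Lakers", "76ers", "Jazz"]})

# Part 1: a scalar assigned to a column is broadcast to every row.
broadcast = df.copy()
broadcast["4"] = "A"
print(broadcast["4"].tolist())

# Part 2: update only the rows whose team name matches.
named_lists = {
    "Lakers": ["U", "G", "O", "Q", "A"],
    "Jazz": ["I", "S", "C", "L", "T"],
}
df["4"] = None  # start with an empty column
for name, letters in named_lists.items():
    df.loc[df["Teams"] == name, "4"] = letters[-1]
print(df)
```

Teams without a list, such as 76ers here, simply keep the empty value.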
