Home > Mobile >  Empty column contents (dictionaries) to new columns
Empty column contents (dictionaries) to new columns

Time:01-21

I have the following dataframe:

data = {'player1':[{'y': -3.52297115326,
  'x': -8.8917736553,
  'group_name': 'home team',
  'track_id': 2},{'y': -3.48258808681,
  'x': -8.85969284603,
  'group_name': 'home team',
  'track_id': 2},{'y': -3.45125194958,
  'x': -8.83364221028,
  'group_name': 'home team',
  'track_id': 2}],
     'player2':[{'y': -3.52297115326,
  'x': -8.8917736553,
  'group_name': 'home team',
  'track_id': 2},{'y': -3.48258808681,
  'x': -8.85969284603,
  'group_name': 'home team',
  'track_id': 2},{'y': -3.45125194958,
  'x': -8.83364221028,
  'group_name': 'home team',
  'track_id': 2}]}
df = pd.DataFrame(data)

I would like to empty the contents of each column to create new columns to produce an output like this:

data2 = {'y': [-3.52291,-3.482588,-3.451252],'x':[-8.891774,-8.859693,-8.833642],'group_name':['home_team','home_team','home_team'],'track_id':[2,2,2],'player2_y': [-3.52291,-3.482588,-3.451252],'player2_x':[-8.891774,-8.859693,-8.833642],'player2_group_name':['home_team','home_team','home_team'],'player2_track_id':[2,2,2]}

df2 = pd.DataFrame(data2)

y   x   group_name  track_id    player2_y   player2_x   player2_group_name  player2_track_id
0   -3.522910   -8.891774   home_team   2   -3.522910   -8.891774   home_team   2
1   -3.482588   -8.859693   home_team   2   -3.482588   -8.859693   home_team   2
2   -3.451252   -8.833642   home_team   2   -3.451252   -8.833642   home_team   2

I can use:

df = pd.concat([df.drop(['player1'], axis=1), df['player1'].apply(pd.Series)], axis=1)
df = pd.concat([df.drop(['player2'], axis=1), df2['player2'].apply(pd.Series)], axis=1)

But the problem is that with a lot of columns and a lot of data this is very slow and inefficient. Does anyone know of a better, more efficient way to do this

Many thanks

CodePudding user response:

Use concat with DataFrame.add_prefix in list comprehension:

df1 = pd.concat([pd.DataFrame(df[x].tolist()).add_prefix(f'{k}_') for x in df.columns], axis=1)
print (df1)
   player1_y  player1_x player1_group_name  player1_track_id  player2_y  \
0  -3.522971  -8.891774          home team                 2  -3.522971   
1  -3.482588  -8.859693          home team                 2  -3.482588   
2  -3.451252  -8.833642          home team                 2  -3.451252   

   player2_x player2_group_name  player2_track_id  
0  -8.891774          home team                 2  
1  -8.859693          home team                 2  
2  -8.833642          home team                 2  
  • Related