Home > Software engineering >  Pandas - how to save attributes into an array and preserve their values into new columns
Pandas - how to save attributes into an array and preserve their values into new columns

Time:12-09

I have a dataframe that has the attributes "key", "name", and "value". For each "key" group, I want the attributes "name" to be stored in an array. The "value" of each name would then be stored into it's own column named "<name>_value". For example:

  key    name    value
0   A  Patton        2
1   A  Arthur        2
2   B    Will        1
3   B  Patton        1

The dataframe above should be transformed as such:

  key              name    Patton_value    Arthur_value    Will_value
0   A  [Patton, Arthur]               2               2          null
1   B    [Patton, Will]               1            null             1

What I have attempted so far is:

df.groupby('key').name.apply(list).reset_index()

But I don't know how to preserve the values for the "name" and save them as a column.

CodePudding user response:

There might be more concise ways to do it, but this works:

D = df.pivot(columns='name', index='key', values='value').reset_index().rename_axis(None, axis=1)
D['name'] = D.apply(lambda x: df['name'][df['key'] == x['key']].values, axis=1).reset_index(drop=True)
D = D[[D.columns[0], D.columns[-1], *D.columns[1:-1]]]
D.columns = [*D.columns[:2], *D.columns[2:]   '_value']

Output:

>>> D
  key              name  Arthur_value  Patton_value  Will_value
0   A  [Patton, Arthur]           2.0           2.0         NaN
1   B    [Will, Patton]           NaN           1.0         1.0

CodePudding user response:

You can create a name_list column and pivot the dataframe:

df['name_list'] = df['key'].map(df.groupby('key')['name'].agg(tuple)) 
out = df.pivot_table(values='value', index=['key','name_list'], columns='name').add_suffix('_value').reset_index().rename({'name_list':'name'}, axis=1)
out['name'] = out['name'].apply(lambda x: list(x))
out.columns.name = None

Output:

  key              name  Arthur_value  Patton_value  Will_value
0   A  [Patton, Arthur]           2.0           2.0         NaN
1   B    [Will, Patton]           NaN           1.0         1.0
  • Related