I have a dataframe:
df = C1 A1. A2. A3. Type
A 1. 5. 2. AG
A 7. 3. 8. SC
And I want to create:
df = C1 A1_AG A1_SC A2_AG A2_SC
A 1. 7. 5. 3
How can it be done?
CodePudding user response:
You can rather use a melt
and transpose
:
(df.melt('Type')
.assign(col=lambda d: d['Type'] '_' d['variable'])
.set_index('col')[['value']].T
)
Output:
col AG_A1 SC_A1 AG_A2 SC_A2 AG_A3 SC_A3
value 1 7 5 3 2 8
with additional columns(s):
(df.melt(['C1', 'Type'])
.assign(col=lambda d: d['Type'] '_' d['variable'])
.pivot(index=['C1'], columns='col', values='value')
.reset_index()
)
Output:
col C1 AG_A1 AG_A2 AG_A3 SC_A1 SC_A2 SC_A3
0 A 1 5 2 7 3 8
CodePudding user response:
Use DataFrame.set_index
with DataFrame.unstack
:
df = df.set_index(['C1','Type']).unstack()
df.columns = df.columns.map(lambda x: f'{x[0]}_{x[1]}')
df = df.reset_index()
print (df)
C1 A1_AG A1_SC A2_AG A2_SC A3_AG A3_SC
0 A 1.0 7.0 5.0 3.0 2.0 8.0
CodePudding user response:
One convenience option with pivot_wider
from pyjanitor
:
# pip install pyjanitor
import pandas as pd
import janitor
(df
.pivot_wider(index=None, names_from='Type')
.bfill()
.drop(1)
.astype(int)
)
A1_AG A1_SC A2_AG A2_SC A3_AG A3_SC
0 1 7 5 3 2 8
Of course, you can skip the convenience function and use pivot
directly:
result = df.pivot(index=None, columns='Type')
result.columns = result.columns.map('_'.join)
result.bfill().drop(1).astype(int)
A1_AG A1_SC A2_AG A2_SC A3_AG A3_SC
0 1 7 5 3 2 8