Given the example dataframe:
Field Value
0 a A
1 b 1
2 c 3
3 d 2
4 e 3
5 f 1
...
Where the Field column repeats in groups of 6 rows (a->f). How can I unstack the dataframe using pandas?
i.e.
a b c d e f
0 A 1 3 2 3 1
1 ...
.
.
.
I could naively do this by iteration and appending to a new dataframe. Any helpful functions to do this in pandas?
example_df = pd.DataFrame({'Field':['a','b','c','d','e','f'], 'Value':[1,2,3,4,5,6]})
CodePudding user response:
Seems like you just need a pivot_table
>>> df.pivot_table(columns='Field', values='Value', aggfunc='first')
Field a b c d e f
Value A 1 3 2 3 1
CodePudding user response:
If the field values always repeat from a->f
, then you can simply reshape
the Value
column:
c = list('abcdef')
pd.DataFrame(df['Value'].values.reshape(-1, len(c)), columns=c)
Alternative more robust approach with unstack
i = df['Field'].eq('a').cumsum()
df.set_index([i, 'Field'])['Value'].unstack()
a b c d e f
0 1 2 3 4 5 6
CodePudding user response:
Use pivot_table
with aggfunc=list
and then explode
reset_index
:
tmp = df.pivot_table(columns='Field', values='Value', aggfunc=list)
tmp = tmp.explode(tmp.columns.tolist()).reset_index(drop=True).rename_axis(None, axis=1)
Output:
>>> tmp
a b c d e f
0 1 2 3 4 5 6
1 1 2 3 4 5 6
(Above output generated using your dataframe doubled.)
CodePudding user response:
You can create an idx
column to pivot on then drop it:
(example_df.assign(idx=example_df.Field.eq('a').cumsum())
.pivot('idx', 'Field', 'Value').reset_index(drop = True))
Field a b c d e f
0 1 2 3 4 5 6
1 7 8 9 10 11 12
2 13 14 15 16 17 18
Data
example_df = pd.DataFrame({'Field':list('abcdef'*3), 'Value':range(1,19)})