Home > OS >  Unstacking repeated column in Dataframe
Unstacking repeated column in Dataframe

Time:06-07

Given the example dataframe:

   Field Value
0    a     A
1    b     1
2    c     3
3    d     2 
4    e     3
5    f     1
...

Where the Field column repeats in groups of 6 rows (a->f). How can I unstack the dataframe using pandas?

i.e.

  a b c d e f
0 A 1 3 2 3 1
1 ...
.
.
.

I could naively do this by iteration and appending to a new dataframe. Any helpful functions to do this in pandas?

example_df = pd.DataFrame({'Field':['a','b','c','d','e','f'], 'Value':[1,2,3,4,5,6]})

CodePudding user response:

Seems like you just need a pivot_table

>>> df.pivot_table(columns='Field', values='Value', aggfunc='first')

Field  a  b  c  d  e  f
Value  A  1  3  2  3  1

CodePudding user response:

If the field values always repeat from a->f, then you can simply reshape the Value column:

c = list('abcdef')
pd.DataFrame(df['Value'].values.reshape(-1, len(c)), columns=c)

Alternative more robust approach with unstack

i = df['Field'].eq('a').cumsum()
df.set_index([i, 'Field'])['Value'].unstack()

   a  b  c  d  e  f
0  1  2  3  4  5  6

CodePudding user response:

Use pivot_table with aggfunc=list and then explode reset_index:

tmp = df.pivot_table(columns='Field', values='Value', aggfunc=list)
tmp = tmp.explode(tmp.columns.tolist()).reset_index(drop=True).rename_axis(None, axis=1)

Output:

>>> tmp
   a  b  c  d  e  f
0  1  2  3  4  5  6
1  1  2  3  4  5  6

(Above output generated using your dataframe doubled.)

CodePudding user response:

You can create an idx column to pivot on then drop it:

(example_df.assign(idx=example_df.Field.eq('a').cumsum())
 .pivot('idx', 'Field', 'Value').reset_index(drop = True))


Field   a   b   c   d   e   f
0       1   2   3   4   5   6
1       7   8   9  10  11  12
2      13  14  15  16  17  18

Data

example_df = pd.DataFrame({'Field':list('abcdef'*3), 'Value':range(1,19)})
  • Related