Home > Software design >  How to optimize turning a group of wide pandas columns into two long pandas columns
How to optimize turning a group of wide pandas columns into two long pandas columns

Time:12-25

I have a process that takes a dataframe and turns a set of wide pandas columns into two long pandas columns, like so:

original wide:

wide = pd.DataFrame(
    {
        'id':['foo'],
        'a':[1],
        'b':[2],
        'c':[3],
        'x':[4],
        'y':[5],
        'z':[6]
    }
)

wide

    id  a   b   c   x   y   z
0   foo 1   2   3   4   5   6

desired long:

lon = pd.DataFrame(
    {
        'id':['foo','foo','foo','foo','foo','foo'],
        'type':['a','b','c','x','y','z'],
        'val':[1,2,3,4,5,6]
    }
)

lon

    id   type   val
0   foo   a      1
1   foo   b      2
2   foo   c      3
3   foo   x      4
4   foo   y      5
5   foo   z      6

I found out a way to do this by chaining the following pandas assignments

(wide
 .set_index('id')
 .T
 .unstack()
 .reset_index()
 .rename(columns={'level_1':'type',0:'val'})
 )

    id  type   val
0   foo   a     1
1   foo   b     2
2   foo   c     3
3   foo   x     4
4   foo   y     5
5   foo   z     6

But when I scale my data this seems to be posing issues for me. I was just looking for an alternative solution to what I have already accomplished that is perhaps faster/more computationally efficient.

CodePudding user response:

I think you are looking for the pandas melt function.

Assuming your original dataframe is called wide, then:

df = pd.melt(wide, id_vars="id")
df.columns = ['id', 'type', 'val']
print(df)

Output:

    id type  val
0  foo    a    1
1  foo    b    2
2  foo    c    3
3  foo    x    4
4  foo    y    5
5  foo    z    6
  • Related