How to transform multiple dummy variables to string

Time:10-12

So I have a data set like this:

    a_1 a_2 a_3
0   1   1   0
1   0   1   0
2   0   1   1
3   1   0   0

And I want to collapse it into one column so that it looks like this:

    a
0   1|2
1   2  
2   2|3
3   1  

This way I can easily convert it back to dummy variables if I need to later.

So how can I do it? pandas.from_dummies only works for single-choice cases, so maybe there is a fast way to do it.

CodePudding user response:

One option if you have a single level (a):

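(df.mask(df.eq(0))                        # replace 0 with NaN
   .stack()                               # long format; relies on stack() dropping the NaN entries
   .reset_index(-1)['level_1']            # keep the original column names
   .str.extract(r'_(\w+)', expand=False)  # keep the part after the underscore
   .groupby(level=0).agg('|'.join)        # join the labels of each original row
)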

Quick hack alternative:

df.dot('|' + df.columns.str.extract(r'_(\w+)', expand=False)).str[1:]

output:

0    1|2
1      2
2    2|3
3      1
Name: level_1, dtype: object

For more levels:

(df
 .mask(df.eq(0))
 .set_axis(df.columns.str.split('_', expand=True), axis=1)
 .stack([0, 1])
 .reset_index(-1)['level_2']
 .groupby(level=[0, 1]).agg('|'.join)
 .unstack()
)

output:

     a
0  1|2
1    2
2  2|3
3    1
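
To make the multi-prefix case concrete, here is a minimal sketch that avoids stack() and groups the columns by prefix in plain Python instead. The b_1 and b_2 columns and the helper name pack are made up for illustration and are not in the question. Note that in this sketch a row with no 1s in a group becomes an empty string.

```python
import pandas as pd

# Hypothetical input: the question's a_* columns plus invented b_* columns.
df = pd.DataFrame({
    "a_1": [1, 0, 0, 1],
    "a_2": [1, 1, 1, 0],
    "a_3": [0, 0, 1, 0],
    "b_1": [0, 1, 1, 0],
    "b_2": [1, 1, 0, 0],
})

# Map each prefix to its (column name, label) pairs, keeping column order.
groups = {}
for col in df.columns:
    prefix, label = col.split("_", 1)
    groups.setdefault(prefix, []).append((col, label))


def pack(frame, pairs):
    """Join the labels of the columns that hold a 1, row by row."""
    labels = [label for _, label in pairs]
    return frame.apply(
        lambda row: "|".join(l for l, v in zip(labels, row) if v == 1),
        axis=1,
    )


# One packed string column per prefix.
packed = pd.DataFrame({
    prefix: pack(df[[col for col, _ in pairs]], pairs)
    for prefix, pairs in groups.items()
})
print(packed)
```

The row-wise apply is slower than the vectorised chain above on large frames, but it does not depend on how stack() treats missing values.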

CodePudding user response:

Convert each row to a list such as [1, 1, 0] and find the positions that hold a 1, which gives [0, 1]. Since your labels start at 1, I add 1 to each position. Finally, the list is converted to a string by joining its elements with the | character. This assumes the columns are in label order (a_1, a_2, ...).

Code:

df.apply(lambda x: '|'.join([str(i + 1) for i, e in enumerate(list(x)) if e == 1]), axis=1)

Output:

0    1|2
1      2
2    2|3
3      1
dtype: object
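
Since the question mentions going back to dummies later, here is a rough sketch of the round trip using Series.str.get_dummies, which splits each string on a separator. The variable names and the a_ prefix handling are my own assumptions. The reindex step is there because get_dummies only creates columns for labels that actually occur and sorts them as strings.

```python
import pandas as pd

# The data from the question.
df = pd.DataFrame({
    "a_1": [1, 0, 0, 1],
    "a_2": [1, 1, 1, 0],
    "a_3": [0, 0, 1, 0],
})

# Collapse: take the label from the column name rather than the position.
labels = [col.split("_", 1)[1] for col in df.columns]
packed = df.apply(
    lambda row: "|".join(l for l, v in zip(labels, row) if v == 1),
    axis=1,
)

# Expand again: split on "|", restore the prefix, and bring back any
# column that never had a 1 (filled with 0) in the original order.
restored = (
    packed.str.get_dummies(sep="|")
    .add_prefix("a_")
    .reindex(columns=df.columns, fill_value=0)
)
print(restored)
```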