I have a dataset like this:
   a_1  a_2  a_3
0    1    1    0
1    0    1    0
2    0    1    1
3    1    0    0
I want to collapse it into one column so it looks like this:
a
0 1|2
1 2
2 2|3
3 1
This way I can easily convert it back to dummy columns if I need to later.
How can I do this? pandas.from_dummies only works when each row has a single choice, so is there another fast way to do it?
CodePudding user response:
One option if you have a single level (a):
(df.mask(df.eq(0))
.stack()
.reset_index(-1)['level_1']
.str.extract(r'_(\w+)', expand=False)
.groupby(level=0).agg('|'.join)
)
Quick hack alternative:
df.dot('|' + df.columns.str.extract(r'_(\w+)', expand=False)).str[1:]
output:
0 1|2
1 2
2 2|3
3 1
Name: level_1, dtype: object
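To see why the quick hack works, here is a minimal self-contained sketch. It assumes the sample frame from the question (rebuilt by hand here) and relies on the fact that multiplying a string by 1 keeps it and multiplying it by 0 gives an empty string, so the dot product concatenates only the labels of the active columns. The variable names `labels` and `result` are just for illustration.

```python
import pandas as pd

# Sample frame from the question, rebuilt by hand.
df = pd.DataFrame(
    {"a_1": [1, 0, 0, 1], "a_2": [1, 1, 1, 0], "a_3": [0, 0, 1, 0]}
)

# "|1", "|2", "|3": the separator plus the part of each column name after "_".
labels = "|" + df.columns.str.extract(r"_(\w+)", expand=False)

# In the dot product, 1 * "|2" keeps the label and 0 * "|2" yields "",
# so each row ends up with the concatenated labels of its active columns.
result = df.dot(labels).str[1:]  # strip the leading "|"
print(result)
```

This only behaves as intended when the frame holds plain 0/1 integers; a value of 2 would repeat the label twice.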
For more levels:
(df
.mask(df.eq(0))
.set_axis(df.columns.str.split('_', expand=True), axis=1)
.stack([0, 1])
.reset_index(-1)['level_2']
.groupby(level=[0, 1]).agg('|'.join)
.unstack()
)
output:
a
0 1|2
1 2
2 2|3
3 1
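If the stack-based chain is hard to follow, a plainer (and probably slower) alternative is to handle one prefix at a time. This is only a sketch, not the approach used above: the `b_1` and `b_2` columns are made up to show a second level, and column names are assumed to contain exactly one underscore-separated prefix.

```python
import pandas as pd

# Question's data plus two hypothetical "b" columns to show a second prefix.
df = pd.DataFrame(
    {
        "a_1": [1, 0, 0, 1],
        "a_2": [1, 1, 1, 0],
        "a_3": [0, 0, 1, 0],
        "b_1": [1, 0, 1, 0],
        "b_2": [0, 1, 1, 1],
    }
)

# Prefix of every column: "a", "a", "a", "b", "b".
prefixes = df.columns.str.split("_").str[0]


def collapse(row):
    """Join the suffixes of the columns whose value is 1 in this row."""
    return "|".join(col.split("_", 1)[1] for col, val in row.items() if val == 1)


# Build one collapsed column per prefix.
result = pd.DataFrame(
    {
        prefix: df.loc[:, prefixes == prefix].apply(collapse, axis=1)
        for prefix in prefixes.unique()
    }
)
print(result)
```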
CodePudding user response:
Convert each row to a list such as [1, 1, 0] and find the positions that hold a 1, which gives [0, 1] here. Since your column numbering starts at 1, each position is incremented by 1. Finally, the list is converted to a string by joining the elements with the | character.
Code:
df.apply(lambda x: '|'.join([str(i + 1) for i, e in enumerate(list(x)) if e == 1]), axis=1)
Output:
0 1|2
1 2
2 2|3
3 1
dtype: object
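Since the question also mentions going back to dummy columns later, here is a hedged sketch of the reverse step using `Series.str.get_dummies`, which splits each string on a separator. The collapsed series is typed in by hand, and the `a_` prefix is taken from the series name, which is an assumption about how you store the result.

```python
import pandas as pd

# Collapsed column as produced above, typed in by hand for the example.
collapsed = pd.Series(["1|2", "2", "2|3", "1"], name="a")

# Split on "|" into indicator columns "1", "2", "3",
# then restore the original "a_" prefix on the column names.
restored = (
    collapsed.str.get_dummies(sep="|")
    .add_prefix(f"{collapsed.name}_")
    .astype(int)
)
print(restored)
```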