I had a dataframe with 2 categorical features like this:
Then I used get_dummies to one-hot encode these columns:
Now I want to go back from the one-hot vectors to the original columns, i.e. the reverse of get_dummies. Is there any way to do this?
CodePudding user response:
Use DataFrame.melt to unpivot, keep only the rows where the value is 1 with DataFrame.query, then split the variable column and reshape with DataFrame.set_index and Series.unstack:
df = pd.get_dummies(df1.astype(str))
df = df.melt(ignore_index=False).query('value == 1')
df[['a','b']] = df['variable'].str.rsplit('_', n=1, expand=True)
df = df.set_index('a', append=True)['b'].unstack().rename_axis(None, axis=1)
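A minimal end-to-end sketch of the melt approach, in case it helps to see every intermediate step. The sample frame `df1` is an assumption (the question's data is not shown), and a boolean mask stands in for `query` here purely for illustration:

```python
import pandas as pd

# Assumed sample data: two categorical columns (names and values are illustrative).
df1 = pd.DataFrame({
    "EstateTypes": [1, 1, 1, 1, 1],
    "AdverTypes": [1, 2, 2, 3, 2],
})

# One-hot encode; astype(str) makes get_dummies treat the integers as categories.
dummies = pd.get_dummies(df1.astype(str))

# Unpivot to long form, keeping the original row index, then keep only the "hot" cells.
long = dummies.melt(ignore_index=False)
long = long[long["value"] == 1].copy()

# Split e.g. "AdverTypes_2" into the source column name and the category label.
long[["a", "b"]] = long["variable"].str.rsplit("_", n=1, expand=True)

# Reshape back to one column per original feature.
restored = (
    long.set_index("a", append=True)["b"]
    .unstack()
    .rename_axis(None, axis=1)
)
```

Note that `unstack` sorts the new columns alphabetically and the recovered labels are strings, so reorder with `restored[df1.columns]` and cast with `astype(int)` if the original layout and dtype matter.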
Or use DataFrame.stack, filter with Series.loc, convert the MultiIndex to a DataFrame with MultiIndex.to_frame, then split and pivot with DataFrame.pivot:
df = df.stack().loc[lambda x: x.eq(1)].index.to_frame()
df[['a','b']] = df[1].str.rsplit('_', n=1, expand=True)
df = df.pivot(index=0, columns='a', values='b').rename_axis(index=None, columns=None)
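The stack variant can be sketched the same way. The sample frame and the column names `row` and `dummy` are hypothetical, chosen for readability; `pivot` is called with keyword arguments, which recent pandas versions require:

```python
import pandas as pd

# Assumed sample data (illustrative only).
df1 = pd.DataFrame({
    "EstateTypes": [1, 1, 1, 1, 1],
    "AdverTypes": [1, 2, 2, 3, 2],
})
dummies = pd.get_dummies(df1.astype(str))

# Stack into a Series indexed by (row label, dummy column) and keep the cells equal to 1.
stacked = dummies.stack()
pairs = stacked[stacked.eq(1)].index.to_frame(index=False)
pairs.columns = ["row", "dummy"]  # hypothetical names for the two index levels

# Split e.g. "AdverTypes_2" into the feature name and the category label.
pairs[["a", "b"]] = pairs["dummy"].str.rsplit("_", n=1, expand=True)

# One row per original row label, one column per original feature.
restored = (
    pairs.pivot(index="row", columns="a", values="b")
    .rename_axis(index=None, columns=None)
)
```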
CodePudding user response:
Use from_dummies (pandas 1.5+):
df_original = pd.from_dummies(df_dummies, sep='_')
Output:
  EstateTypes AdverTypes
0           1          1
1           1          2
2           1          2
3           1          3
4           1          2
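For completeness, a small round-trip sketch, assuming pandas 1.5 or later and a sample frame reconstructed from the output above. One thing worth knowing is that from_dummies reads the category labels from the column names, so they come back as strings and need a cast if the originals were integers:

```python
import pandas as pd

# Assumed sample data, reconstructed from the output shown above.
df1 = pd.DataFrame({
    "EstateTypes": [1, 1, 1, 1, 1],
    "AdverTypes": [1, 2, 2, 3, 2],
})

# Encode, then decode. sep="_" must match the separator get_dummies used.
df_dummies = pd.get_dummies(df1.astype(str))
df_original = pd.from_dummies(df_dummies, sep="_")

# Labels are parsed from column names, so they are strings; cast back to integers.
df_restored = df_original.astype(int)
```

If a category label itself contains the separator, the split becomes ambiguous, so a different `prefix_sep` in get_dummies (and the matching `sep` here) may be the safer choice.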