Home > database >  Pandas DataFrame stack multiple column values into single column without changing one of the columns
Pandas DataFrame stack multiple column values into single column without changing one of the columns

Time:10-20

I'm trying to reshape a data frame in a certain way.

This is the data frame I have,

col_1 col_2
One Roxanne
Two Ghina
Roxanne Leila
Ghina George
Three Rock
Four Rock

I'd like to reshape the dataframe such that it looks like this:

col_3
Roxanne
Ghina
Leila
George
Rock
Rock
One
Two
Three
Four

How would I go about doing this? (without changing col_2 values)

CodePudding user response:

For expected ouput join both columns with removed matched values from col_1 by col_2:

df = pd.concat([df.col_2, df.loc[~df.col_1.isin(df.col_2), 'col_1']]).to_frame('col_3')
print (df)
     col_3
0  Roxanne
1    Ghina
2    Leila
3   George
4     Rock
5     Rock
0      One
1      Two
4    Three
5     Four

CodePudding user response:

You can use:

out = (df[['col_2', 'col_1']].melt(value_name='col_3')[['col_3']]
       .drop_duplicates(ignore_index=True)
      )

Or:

out = (df.loc[:, ::-1].unstack().reset_index(drop=True).to_frame(name='col_3')
       .drop_duplicates(ignore_index=True)
      )

output:

     col_3
0  Roxanne
1    Ghina
2    Leila
3   George
4     Rock
5      One
6      Two
7    Three
8     Four
  • Related