Reformatting a dataframe to replace repeating similar rows with a new column

Time:08-15

Input.

Name  Phrase number  Words said
John  Phrase 1       Hi!
John  Phrase 2       How are you?
John  Phrase 3       Is everything okay?
Brad  Phrase 1       Hello!
Brad  Phrase 2       I am good!
Brad  Phrase 3       How are you?

Desired output.

Name  Phrase 1  Phrase 2      Phrase 3
John  Hi!       How are you?  Is everything okay?
Brad  Hello!    I am good!    How are you?

How would you solve this with Pandas?

CodePudding user response:

You can use pivot, but you need a few other methods to clean up the index and column names so that the result exactly matches the desired output:

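# One row per Name, one column per Phrase number
df = (df.pivot(index='Name', columns='Phrase number')
      .droplevel(0, axis=1)       # drop the 'Words said' column level
      .reset_index()              # turn Name back into a regular column
      .rename_axis('', axis=1))   # blank out the 'Phrase number' columns label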
df
Out[1]: 
   Name Phrase 1      Phrase 2             Phrase 3
0  Brad   Hello!    I am good!         How are you?
1  John      Hi!  How are you?  Is everything okay?
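
For anyone who wants to try this from scratch, here is a minimal self-contained sketch. It rebuilds the input frame from the question and passes values='Words said' to pivot, so there is no extra column level to drop. Note that pivot sorts the index, which is why Brad comes before John in the output above; the reindex step is an optional addition (not part of the answer above) that restores the order in which the names first appear. The names wide and order are only illustrative.

```python
import pandas as pd

# Input data copied from the question
df = pd.DataFrame({
    "Name": ["John", "John", "John", "Brad", "Brad", "Brad"],
    "Phrase number": ["Phrase 1", "Phrase 2", "Phrase 3"] * 2,
    "Words said": [
        "Hi!", "How are you?", "Is everything okay?",
        "Hello!", "I am good!", "How are you?",
    ],
})

# Order in which each name first appears (John, then Brad)
order = pd.Index(df["Name"].unique(), name="Name")

wide = (
    df.pivot(index="Name", columns="Phrase number", values="Words said")
      .reindex(order)              # optional: undo the alphabetical sort
      .reset_index()               # turn Name back into a regular column
      .rename_axis(None, axis=1)   # remove the 'Phrase number' columns label
)

print(wide)
```

Keep in mind that pivot raises an error if a (Name, Phrase number) pair occurs more than once; in that case pivot_table with a suitable aggfunc may be a better fit.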