Reformatting a dataframe to replace repeating similar rows with a new column

Input.

Name	Phrase number	Words said
John	Phrase 1	Hi!
John	Phrase 2	How are you?
John	Phrase 3	Is everything okay?
Brad	Phrase 1	Hello!
Brad	Phrase 2	I am good!
Brad	Phrase 3	How are you?

Desired output.

Name	Phrase 1	Phrase 2	Phrase 3
John	Hi!	How are you?	Is everything okay?
Brad	Hello!	I am good!	How are you?

How would you solve this with Pandas?

CodePudding user response:

You can use pivot, but you need a few other methods to clean up the index and column names so the result matches the desired output exactly:

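df = (df.pivot(index='Name', columns='Phrase number')  # one row per Name, one column per phrase
      .droplevel(0, axis=1)       # drop the outer 'Words said' column level
      .reset_index()              # turn Name back into a regular column
      .rename_axis('', axis=1))   # blank out the leftover 'Phrase number' label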
df
Out[1]: 
   Name Phrase 1      Phrase 2             Phrase 3
0  Brad   Hello!    I am good!         How are you?
1  John      Hi!  How are you?  Is everything okay?
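
Note that the result above is sorted by Name (Brad before John), while the desired output lists John first. Here is a self-contained sketch that rebuilds the question's table and keeps the original row order. The variable name wide and the reindex step are my own additions, not part of the answer above, and it assumes each Name/Phrase number pair appears only once (pivot raises a ValueError on duplicates):

```python
import pandas as pd

# Rebuild the input table from the question
df = pd.DataFrame({
    "Name": ["John"] * 3 + ["Brad"] * 3,
    "Phrase number": ["Phrase 1", "Phrase 2", "Phrase 3"] * 2,
    "Words said": ["Hi!", "How are you?", "Is everything okay?",
                   "Hello!", "I am good!", "How are you?"],
})

# Names in order of first appearance (John, Brad)
name_order = pd.Index(df["Name"].unique(), name="Name")

wide = (
    # Passing values= avoids the extra column level, so no droplevel is needed
    df.pivot(index="Name", columns="Phrase number", values="Words said")
      .reindex(name_order)        # pivot sorts the index; restore the original order
      .reset_index()              # turn Name back into a regular column
      .rename_axis(None, axis=1)  # remove the leftover "Phrase number" label
)
print(wide)
```

If the data can contain repeated Name/Phrase number pairs, pivot_table with an explicit aggfunc is probably the better fit, since you then have to decide how the duplicates should be combined.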