Home > front end >  Error when manipulating dataframe with columns of type string with Pandas Pivot Table
Error when manipulating dataframe with columns of type string with Pandas Pivot Table

Time:07-15

I have the dataframe:

enter image description here

And I would like to obtain using Pivot Table or an alternative function this result:

enter image description here

I am trying to transform the rows of the Custom Field column into Columns, with the Pivot Table function of Pandas, and I get an error:

import pandas as pd

data = {
"Custom Field": ["CF1", "CF2", "CF3"],
"id": ["RSA", "RSB", "RSC"],
"Name": ["Wilson", "Junior", "Otavio"]
}

### create the dataframe ###
df = pd.DataFrame(data)

print(df)

df2 = df.pivot_table(columns=['Custom Field'], index=['Name'])
print(df2)

enter image description here

I suspect it is because I am working with Strings.

Any suggestions?

Thanks in advance.

CodePudding user response:

You need pivot, not pivot_table. The latter does aggregation on possibly repeating values whereas the former is just a rearrangement of the values and fails for duplicate values.

df.pivot(columns=['Custom Field'], index=['Name'])

Update as per comment: if there are multiple values per cell, you need to use privot_table and specify an appropriate aggregate function, e.g. concatenate the string values. You can also specify a fill value for empty cells (instead of NaN):

df = pd.DataFrame({"Custom Field": ["CF1", "CF2", "CF3", "CF1"],
                   "id": ["RSA", "RSB", "RSC", "RSD"],
                   "Name": ["Wilson", "Junior", "Otavio", "Wilson"]})

df.pivot_table(columns=['Custom Field'], index=['Name'], aggfunc=','.join, fill_value='-')

                    id          
Custom Field       CF1  CF2  CF3
Name                            
Junior               -  RSB    -
Otavio               -    -  RSC
Wilson         RSA,RSD    -    -
  • Related