Home > database >  Pivot column in Pandas
Pivot column in Pandas

Time:04-19

I have 4 columns in a dataframe, they are: id external_id value description

Exemple table:

ID external_id value description
1 65 Pendent Account Onboarding All iN
1 65 Lais Gestor
2 93 Maria Account Onboarding All iN
3 454 Renan Gestor
4 535 Osiris Modelo de negocio
5 999 Togo Account Onboarding All iN
6 25 João Gestor
7 85 Lima Account CS SM
8 22 Teixeira Account Onboarding SM

I need to turn the descripton lines into columns and the value would be its respective value.

In general, I need to create a filter that creates the columns below, when it matches description:

Account Onboarding All iN Account Onboarding SM Account CS All iN" Account CS SM Account Operacional All iN Gestor Natureza Modelo de negocio Vertical

Each description above needs to become a column and if it exists the record will come from the value column.

Expected output:

id external_id Account Onboarding All iN Account Onboarding SM Account CS All iN Account CS SM Account Operacional All iN Gestor Natureza Modelo de negocio Vertical
1 65 Pendent None None None None Lais None None None
2 93 Maria None None None None None None None None
3 454 None None None None None Renan None None None
4 535 None None None None None None None Osiris None
5 999 Togo None None None None None None None None
6 25 None None None None None João None None None
7 85 None None None Lima None None None None None
8 22 None Teixeira None None None None None None None

One note is that if the ID is the same, the results should be on the same line.

How can I do this?

CodePudding user response:

With empty strings

dummies = pd.get_dummies(df.description)
df.drop(['value', 'description'], axis=1).join(dummies.mul(df.value, axis=0))

   ID  external_id Account CS SM Account Onboarding All iN Account Onboarding SM Gestor Modelo de negocio
0   1           65                                 Pendent                                               
1   1           65                                                                 Lais                  
2   2           93                                   Maria                                               
3   3          454                                                                Renan                  
4   4          535                                                                                 Osiris
5   5          999                                    Togo                                               
6   6           25                                                                 João                  
7   7           85          Lima                                                                         
8   8           22                                                      Teixeira                         

With nan

dummies = pd.get_dummies(df.description).eq(1)
tojoin = dummies.mask(dummies, df.value, axis=0).where(dummies)
df.drop(['value', 'description'], axis=1).join(tojoin)

   ID  external_id Account CS SM Account Onboarding All iN Account Onboarding SM Gestor Modelo de negocio
0   1           65           NaN                   Pendent                   NaN    NaN               NaN
1   1           65           NaN                       NaN                   NaN   Lais               NaN
2   2           93           NaN                     Maria                   NaN    NaN               NaN
3   3          454           NaN                       NaN                   NaN  Renan               NaN
4   4          535           NaN                       NaN                   NaN    NaN            Osiris
5   5          999           NaN                      Togo                   NaN    NaN               NaN
6   6           25           NaN                       NaN                   NaN   João               NaN
7   7           85          Lima                       NaN                   NaN    NaN               NaN
8   8           22           NaN                       NaN              Teixeira    NaN               NaN

A little more surgical

j, c = pd.factorize(df.description)
n, m = df.shape[0], c.shape[0]
i = np.arange(n)
a = np.full((n, m), None)
a[i, j] = df.value
tojoin = pd.DataFrame(a, df.index, c)
df.drop(['value', 'description'], axis=1).join(tojoin)

   ID  external_id Account Onboarding All iN Gestor Modelo de negocio Account CS SM Account Onboarding SM
0   1           65                   Pendent   None              None          None                  None
1   1           65                      None   Lais              None          None                  None
2   2           93                     Maria   None              None          None                  None
3   3          454                      None  Renan              None          None                  None
4   4          535                      None   None            Osiris          None                  None
5   5          999                      Togo   None              None          None                  None
6   6           25                      None   João              None          None                  None
7   7           85                      None   None              None          Lima                  None
8   8           22                      None   None              None          None              Teixeira

CodePudding user response:

you can use pd.pivot() for it:

res = (df.pivot(index='external_id',columns='description',values='value')
       .reset_index()
       .rename_axis(columns=None, index='id')
      )

print(res)

    external_id Account CS SM Account Onboarding All iN Account Onboarding SM  \
id                                                                              
0            22           NaN                       NaN              Teixeira   
1            25           NaN                       NaN                   NaN   
2            65           NaN                   Pendent                   NaN   
3            85          Lima                       NaN                   NaN   
4            93           NaN                     Maria                   NaN   
5           454           NaN                       NaN                   NaN   
6           535           NaN                       NaN                   NaN   
7           999           NaN                      Togo                   NaN   

   Gestor Modelo de negocio  
id                           
0     NaN               NaN  
1    João               NaN  
2    Lais               NaN  
3     NaN               NaN  
4     NaN               NaN  
5   Renan               NaN  
6     NaN            Osiris  
7     NaN               NaN  

  • Related