Pandas Dataframe: Split a single column into multiple columns

Time:11-11

I'm trying to split a column Class into multiple columns and change column names based on that.

   ID    Name Class
0  12    John     A
1  13    Mark     A
2  14    Tony     B
3  15  Marcus     C
4  16   Phill     D
5  17    Jack     A
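
For anyone who wants to reproduce this, the frame above can probably be built as follows. This is a minimal sketch that assumes ID holds plain integers and the other two columns hold strings; the question does not state the dtypes.

```python
import pandas as pd

# Sample data copied from the table in the question
# (dtypes are an assumption: integers for ID, strings for Name and Class)
df = pd.DataFrame({
    "ID": [12, 13, 14, 15, 16, 17],
    "Name": ["John", "Mark", "Tony", "Marcus", "Phill", "Jack"],
    "Class": ["A", "A", "B", "C", "D", "A"],
})

print(df)
```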

final df

    ID    Name    Class     A         B       C       D  

0    12    John      A       A
1    13    Mark      A       A
2    14    Tony      B                 B
3    15    Marcus    C                         C
4    16    Phill     D                                D
5    17    Jack      A       A

CodePudding user response:

A potentially slow way of doing this would be to define a function and apply it row by row, once for each unique value in the original column.

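# return the value if the row's entry in `column` matches it, otherwise None
def new_column_val(row, value, column):
    if row[column] == value:
        return value
    return None

# create one new column per unique class
# (axis=1 passes each row to the function; the column is named "Class", not "class")
for class_name in df["Class"].unique():
    df[class_name] = df.apply(new_column_val, args=(class_name, "Class"), axis=1)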
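
If the row-wise apply turns out to be too slow, the same loop can likely be written with Series.where, which works on the whole column at once. This swaps in a different technique from the one above. It is only a sketch, and the sample frame is rebuilt from the question's table with assumed dtypes.

```python
import pandas as pd

# Sample data from the question (dtypes assumed)
df = pd.DataFrame({
    "ID": [12, 13, 14, 15, 16, 17],
    "Name": ["John", "Mark", "Tony", "Marcus", "Phill", "Jack"],
    "Class": ["A", "A", "B", "C", "D", "A"],
})

# For each class, keep the label where it matches and leave NaN elsewhere
for class_name in df["Class"].unique():
    df[class_name] = df["Class"].where(df["Class"] == class_name)

print(df)
```

Non-matching cells come out as NaN rather than None, which is usually what you want for later filtering with isna() or notna().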

CodePudding user response:

You can use get_dummies:

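import numpy as np
import pandas as pd

# one 0/1 indicator column per class (dtype=int avoids boolean output on recent pandas)
mask = pd.get_dummies(df["Class"], dtype=int)
# replace each 1 with the column's own label and each 0 with NaN
for col in mask.columns:
    mask[col] = mask[col].map({1: col, 0: np.nan})

final = df.join(mask)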
final
    ID    Name    Class     A         B       C       D  

0    12    John      A       A
1    13    Mark      A       A
2    14    Tony      B                 B
3    15    Marcus    C                         C
4    16    Phill     D                                D
5    17    Jack      A       A
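
Note that the table above shows blanks, whereas pandas would normally print NaN for the missing cells. If blank cells are what you actually want, one option is to map the zeros to empty strings instead. This is a sketch under the assumption that empty strings are acceptable in those columns; the sample frame is rebuilt from the question.

```python
import pandas as pd

# Sample data from the question (dtypes assumed)
df = pd.DataFrame({
    "ID": [12, 13, 14, 15, 16, 17],
    "Name": ["John", "Mark", "Tony", "Marcus", "Phill", "Jack"],
    "Class": ["A", "A", "B", "C", "D", "A"],
})

# One 0/1 indicator column per class
mask = pd.get_dummies(df["Class"], dtype=int)

# Replace each 1 with the column label and each 0 with an empty string
for col in mask.columns:
    mask[col] = mask[col].map({1: col, 0: ""})

final = df.join(mask)
print(final)
```

Keep in mind that empty strings are not treated as missing values, so isna() will not flag them.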

CodePudding user response:

import numpy as np
uniq_class = df['Class'].unique().tolist()
# build a diagonal matrix with each unique class on the diagonal (empty strings elsewhere)
D = np.diag(uniq_class).tolist()
# pair each class with its row of the diagonal matrix
temp = dict(zip(uniq_class, D))
# look up each row's class and spread the matching list across the new columns
df[uniq_class] = df['Class'].map(temp).tolist()
df

Output:

   ID    Name Class  A  B  C  D
0  12    John     A  A         
1  13    Mark     A  A         
2  14    Tony     B     B      
3  15  Marcus     C        C   
4  16   Phill     D           D
5  17    Jack     A  A    