Home > Blockchain >  Distributing the values of one column to multiple columns in Pandas
Distributing the values of one column to multiple columns in Pandas

Time:06-12

I have the following DataFrame:

df = pd.DataFrame([['A',1,10],
                   ['A',1,20],
                   ['C',2,15],
                   ['B',3,20]], columns=['Val1','Val2','Val3'])
colList = [f'{col}_{i}' for col in ['A', 'B', 'C'] for i in range(5)]
data = np.zeros(shape=(len(df), len(colList)))
df = pd.concat([df, pd.DataFrame(data, columns=colList)], axis=1)
df['REF'] = df['Val1']   '_'  df['Val2'].astype(str)


  Val1  Val2  Val3  A_0  A_1  A_2  A_3  ...  B_4  C_0  C_1  C_2  C_3  C_4  REF
0    A     1    10  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0  0.0  0.0  A_1
1    A     1    20  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0  0.0  0.0  A_1
2    C     2    15  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0  0.0  0.0  C_2
3    B     3    20  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0  0.0  0.0  B_3

Column Val3 contains values which should be assigned to the corresponding columns under the REF columns. This can be done by using the following lambda function:

def func(x):
    x[x['REF']] = x['Val3']
    return x

df = df.apply(lambda x: func(x), axis=1)
print(df)


  Val1  Val2  Val3  A_0   A_1  A_2  A_3  ...  B_4  C_0  C_1   C_2  C_3  C_4  REF
0    A     1    10  0.0  10.0  0.0  0.0  ...  0.0  0.0  0.0   0.0  0.0  0.0  A_1
1    A     1    20  0.0  20.0  0.0  0.0  ...  0.0  0.0  0.0   0.0  0.0  0.0  A_1
2    C     2    15  0.0   0.0  0.0  0.0  ...  0.0  0.0  0.0  15.0  0.0  0.0  C_2
3    B     3    20  0.0   0.0  0.0  0.0  ...  0.0  0.0  0.0   0.0  0.0  0.0  B_3

Is there any way I can vectorize this process without using a Lambda function in order to speed up the process?

CodePudding user response:

Pivot and update

df.update(df.pivot(columns='REF', values='Val3'))

>>> df

  Val1  Val2  Val3  A_0   A_1  A_2  A_3  A_4  B_0  B_1  B_2   B_3  B_4  C_0  C_1   C_2  C_3  C_4  REF
0    A     1    10  0.0  10.0  0.0  0.0  0.0  0.0  0.0  0.0   0.0  0.0  0.0  0.0   0.0  0.0  0.0  A_1
1    A     1    20  0.0  20.0  0.0  0.0  0.0  0.0  0.0  0.0   0.0  0.0  0.0  0.0   0.0  0.0  0.0  A_1
2    C     2    15  0.0   0.0  0.0  0.0  0.0  0.0  0.0  0.0   0.0  0.0  0.0  0.0  15.0  0.0  0.0  C_2
3    B     3    20  0.0   0.0  0.0  0.0  0.0  0.0  0.0  0.0  20.0  0.0  0.0  0.0   0.0  0.0  0.0  B_3
  • Related