Make some transformation to a column and then set it as index

Time:10-19

Let's say I have the pandas DataFrame below:

import pandas as pd
dat = pd.DataFrame({'A' : [1,2,3,4], 'B' : [3,4,5,6]})
dat['A1'] = dat['A'].astype(str) + '_Something'
dat.set_index('A1')

While this works, I would like to achieve the following:

  1. Instead of having the line dat['A1'] = dat['A'].astype(str) + '_Something', can I transform column A on the fly and pass the transformed values directly to dat.set_index? My real transformation function is fairly complex, so I am looking for a general approach.
  2. After setting the index, can I remove A1, which now sits as the header (name) of the index?

Any pointers would be very helpful.

CodePudding user response:

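You can pass a NumPy array to DataFrame.set_index. So, just chain Series.to_numpy after the transformation, and either set the inplace parameter to True inside set_index or assign the result back to dat. Because a plain array carries no name, the resulting index has no header, which also takes care of your second point.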

dat.set_index(
    (dat['A'].astype(str) + '_Something')  # transformation
    .to_numpy(), 
    inplace=True)

print(dat)

             A  B
1_Something  1  3
2_Something  2  4
3_Something  3  5
4_Something  4  6
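
To see why the to_numpy step matters for the index header, here is a small sketch comparing the variants. The variable names transformed, named, unnamed and renamed are only for illustration. Passing the Series itself keeps its name ('A') as the index name, while passing the array leaves the index unnamed; rename_axis(None) is another way to drop the name afterwards.

```python
import pandas as pd

dat = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [3, 4, 5, 6]})

# The transformed values as a Series; it still carries the name 'A'.
transformed = dat['A'].astype(str) + '_Something'

# Passing the Series: the index inherits the Series name.
named = dat.set_index(transformed)
print(named.index.name)    # A

# Passing a NumPy array: there is no name to inherit.
unnamed = dat.set_index(transformed.to_numpy())
print(unnamed.index.name)  # None

# Alternative: keep the Series and clear the name afterwards.
renamed = dat.set_index(transformed).rename_axis(None)
print(renamed.index.name)  # None
```

None of these calls uses inplace=True, so dat itself is left unchanged and each result is a new DataFrame.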

Generalized to an arbitrary function, that would look something like:

def f(x):
    y = f'{x}_Something'
    return y
    
dat.set_index(dat['A'].apply(f).to_numpy(), inplace=True)
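
If you would rather not modify dat in place, the same idea can be wrapped in a small helper that returns a new DataFrame. This is only a sketch: with_transformed_index is a hypothetical name, not a pandas function, and f stands in for your real transformation.

```python
import pandas as pd

def f(x):
    # Stand-in for the real, more complex transformation.
    return f'{x}_Something'

def with_transformed_index(df, column, func):
    """Hypothetical helper: return a copy of df indexed by func applied to df[column]."""
    # to_numpy() drops the Series name, so the new index has no header.
    new_index = df[column].map(func).to_numpy()
    return df.set_index(new_index)

dat = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [3, 4, 5, 6]})
result = with_transformed_index(dat, 'A', f)
print(result)
```

Here result has the transformed values as its index and still contains columns A and B, while dat keeps its original default index.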