How to apply StandardScaler to a single column?

I need to apply StandardScaler of sklearn to a single column col1 of a DataFrame:

df:

col1  col2  col3
1     0     A
1     10    C
2     1     A
3     20    B

This is how I did it:

from sklearn.preprocessing import StandardScaler

def listOfLists(lst):
    return [[el] for el in lst]

def flatten(t):
    return [item for sublist in t for item in sublist]

scaler = StandardScaler()

df['col1'] = flatten(scaler.fit_transform(listOfLists(df['col1'].to_numpy().tolist())))

However, when I apply inverse_transform, it does not give me the initial values of col1. Instead, it returns the normalised values:

scaler.inverse_transform(flatten(scaler.fit_transform(listOfLists(df['col1'].to_numpy().tolist()))))

or:

scaler.inverse_transform(df['col1'])

CodePudding user response:

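The likely cause is that fit_transform refits the scaler every time it is called: calling it again on the already-scaled column fits the scaler to data with mean 0 and standard deviation 1, so inverse_transform then leaves the values unchanged. Instead, fit the scaler once, directly on the column. The scaler expects a 2D array, so select the column as a single-column DataFrame with df[['col1']]: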

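>>> scaler = StandardScaler()
>>> arr = scaler.fit_transform(df[['col1']]).flatten()
>>> arr
array([-0.90453403, -0.90453403,  0.30151134,  1.50755672])

>>> # reshape back to 2D (one column) before inverting, then flatten again
>>> scaler.inverse_transform(arr.reshape(-1, 1)).flatten()
array([1., 1., 2., 3.])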
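
For completeness, here is a self-contained sketch of the full round trip on the DataFrame from the question. It is only an illustration under a few assumptions: the column name col1_scaled and the variables original, restored, refit and not_restored are hypothetical names introduced here, and the scaled values are stored in a new column so the original data stays available for comparison. The second half reproduces the refit behaviour, where a scaler fitted on already-scaled data cannot recover the original values.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Rebuild the example DataFrame from the question.
df = pd.DataFrame({
    "col1": [1, 1, 2, 3],
    "col2": [0, 10, 1, 20],
    "col3": ["A", "C", "A", "B"],
})
original = df["col1"].to_numpy(dtype=float)

# Fit once on the original values. df[["col1"]] is a 2D, single-column DataFrame.
scaler = StandardScaler()
df["col1_scaled"] = scaler.fit_transform(df[["col1"]]).ravel()

# Invert with the SAME fitted scaler; the input is reshaped to one 2D column.
scaled_2d = df["col1_scaled"].to_numpy().reshape(-1, 1)
restored = scaler.inverse_transform(scaled_2d).ravel()

# Pitfall: a scaler fitted on already-scaled data learns mean ~0 and std ~1,
# so its inverse_transform hands the scaled values back essentially unchanged.
refit = StandardScaler()
refit.fit(scaled_2d)
not_restored = refit.inverse_transform(scaled_2d).ravel()

print(restored)
print(not_restored)
```

If you want to overwrite col1 in place as in the question, the same idea applies: keep the fitted scaler object around and call only inverse_transform on it later, never fit_transform again.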