I need to apply sklearn's StandardScaler to a single column, col1, of a DataFrame df:
col1 col2 col3
1 0 A
1 10 C
2 1 A
3 20 B
This is how I did it:
from sklearn.preprocessing import StandardScaler
def listOfLists(lst):
    return [[el] for el in lst]
def flatten(t):
    return [item for sublist in t for item in sublist]
scaler = StandardScaler()
df['col1'] = flatten(scaler.fit_transform(listOfLists(df['col1'].to_numpy().tolist())))
However, when I apply inverse_transform, it does not give me the initial values of col1. Instead, it returns the normalised values:
scaler.inverse_transform(flatten(scaler.fit_transform(listOfLists(df['col1'].to_numpy().tolist()))))
or:
scaler.inverse_transform(df['col1'])
CodePudding user response:
You could fit the scaler directly on the column. Since the scaler expects a 2D array, select the column as a DataFrame with df[['col1']]:
>>> scaler = StandardScaler()
>>> arr = scaler.fit_transform(df[['col1']]).flatten()
>>> arr
array([-0.90453403, -0.90453403,  0.30151134,  1.50755672])
>>> scaler.inverse_transform(arr)
array([1., 1., 2., 3.])
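A likely reason the attempt in the question returned the normalised values: df['col1'] had already been overwritten with scaled data, and fit_transform was then called on it again. That refits the scaler to data whose mean is about 0 and whose standard deviation is about 1, so inverse_transform hands back almost the same numbers. The sketch below is only an illustration of that, assuming the sample data from the question; the column name col1_scaled and the second scaler refit are hypothetical names added for the demo. It fits once, keeps the fitted scaler, and passes a 2D array to inverse_transform.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Sample data from the question.
df = pd.DataFrame({
    "col1": [1, 1, 2, 3],
    "col2": [0, 10, 1, 20],
    "col3": ["A", "C", "A", "B"],
})

scaler = StandardScaler()

# Fit once on the original values; df[["col1"]] is 2D with shape (4, 1).
scaled = scaler.fit_transform(df[["col1"]])

# Store the result in a separate (hypothetical) column so the original
# values are not overwritten.
df["col1_scaled"] = scaled.ravel()

# Reuse the same fitted scaler for the inverse; no second fit_transform.
restored = scaler.inverse_transform(df[["col1_scaled"]].to_numpy()).ravel()

# For contrast: a scaler refit on already-scaled data learns mean ~0 and
# std ~1, so its inverse returns the scaled values, not the originals.
refit = StandardScaler()
refit_scaled = refit.fit_transform(df[["col1_scaled"]].to_numpy())
round_trip = refit.inverse_transform(refit_scaled).ravel()

print(restored)
print(round_trip)
```

Here restored holds the original col1 values, while round_trip matches col1_scaled, which is the behaviour described in the question.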