How to transform a DataFrame column to zero mean and unit standard deviation

Time:05-28

I am trying to rescale some columns so that they have a mean of zero and a standard deviation (SD) of one, but I am not sure how to do that.

E.g. given the following dataframe, how do you create a new column with mean 0 and sd 1?

import pandas as pd
df = pd.DataFrame([8.2, 18, 15, 9], columns=['temp'])

Here is what I have tried with StandardScaler:

import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame([[8.2,57],[18,60],[15,45],[9,30]], columns=['temp','rh'])
print(df)
scaler = StandardScaler(copy=False, with_mean=True, with_std=True)
scaler.fit(df)
print(f"Means: {scaler.mean_}")
df2 = scaler.transform(df)
print(f"Transformed Data Frame:\n{df2}")
m = np.mean(df2, axis=0)
s = np.std(df2, axis=0)
print(f"Column  means:\n{m}")
print(f"Column  SD:\n{s}")

But the results do not look like a mean of zero and an SD of one to me:

   temp  rh
0   8.2  57
1  18.0  60
2  15.0  45
3   9.0  30
Means: [12.55 48.  ]
Transformed Data Frame:
[[-1.06105451  0.76200076]
 [ 1.32936715  1.01600102]
 [ 0.59760542 -0.25400025]
 [-0.86591805 -1.52400152]]
Column  means:
[-2.49800181e-16  0.00000000e+00]
Column  SD:
[1. 1.]

CodePudding user response:

from sklearn.preprocessing import StandardScaler
df1 = StandardScaler().fit_transform(df)

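This will do the trick. Note that fit_transform returns a NumPy array rather than a DataFrame. Your original code was in fact already working: a column mean of -2.49800181e-16 is zero up to floating-point rounding, and the printed SDs are exactly 1.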
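If you specifically want a new column in the DataFrame, a pandas-only version is one possible alternative. This is a minimal sketch and not part of the original answer: the column name temp_z is made up for illustration, and ddof=0 is passed because pandas defaults to the sample standard deviation (ddof=1), whereas StandardScaler divides by the population standard deviation.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([8.2, 18, 15, 9], columns=["temp"])

# z-score: subtract the column mean, divide by the population SD (ddof=0),
# which is the convention StandardScaler uses.
# "temp_z" is a hypothetical column name chosen for this example.
df["temp_z"] = (df["temp"] - df["temp"].mean()) / df["temp"].std(ddof=0)

print(df)

# Compare against 0 and 1 with a tolerance instead of ==, because
# floating-point rounding leaves residues on the order of 1e-16.
mean_is_zero = np.isclose(df["temp_z"].mean(), 0.0)
sd_is_one = np.isclose(df["temp_z"].std(ddof=0), 1.0)
print(mean_is_zero, sd_is_one)
```

The temp_z values should agree with the first column of the StandardScaler output shown in the question. If you use the default df.std() instead, the resulting column will have a sample SD of 1 rather than a population SD of 1, so the numbers will differ slightly from scikit-learn's.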
