ValueError: X has 2 features, but MinMaxScaler is expecting 1 features as input

I have NumPy arrays X and y, originally built from a Pandas DataFrame, as follows:

>> X
array([[ 2.86556780e-03,  1.87100798e-01],
   [ 2.56781670e-04,  2.45417491e-01],
   [ 2.35497137e-03,  1.76615342e-01],
   ...,
   [ 2.30078468e-03, -4.16726811e-60],
   [ 5.66213972e-03, -2.98597808e-60],
   [ 4.39503905e-03, -2.13954678e-60]])

>> y
array([19.08666992, 19.09239006, 19.08938026, ..., 45.21157634,
   45.19350761, 45.13230675])

I split them into training and test sets as follows:

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

Before scaling the data, I reshape my labels as follows:

y_train = y_train.reshape((-1, 1))
y_test = y_test.reshape((-1, 1))

Using sklearn's MinMaxScaler, I then call fit_transform on my training data as follows:

from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
y_train = scaler.fit_transform(y_train)

I then try to transform my test data using MinMaxScaler as follows:

X_test = scaler.transform(X_test)
y_test = scaler.transform(y_test)

But the test data is not transformed; I get the following error instead:

----> 1 X_test = scaler.transform(X_test)

ValueError: X has 2 features, but MinMaxScaler is expecting 1 features as input.

Can anyone tell me what I am doing wrong here?

CodePudding user response:

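This happens because the same scaler object is fitted twice. The second call, scaler.fit_transform(y_train), refits it on y_train, which has a single feature, and discards what it learned from X_train. X_test has 2 features, so scaler.transform(X_test) fails.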
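
To see the refit for yourself, here is a minimal sketch. The small arrays are made up for illustration and only stand in for the X_train and y_train from the question. It assumes scikit-learn 0.24 or later, where fitted estimators expose n_features_in_.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up stand-ins for the question's X_train (2 columns) and y_train (1 column).
X_train = np.array([[0.1, 1.0], [0.2, 2.0], [0.3, 3.0]])
y_train = np.array([10.0, 20.0, 30.0]).reshape(-1, 1)

scaler = MinMaxScaler()

scaler.fit_transform(X_train)
features_after_X = scaler.n_features_in_  # scaler was fitted on 2 columns

scaler.fit_transform(y_train)
features_after_y = scaler.n_features_in_  # refit replaced the X statistics

print(features_after_X, features_after_y)  # 2 1
```

After the second fit_transform the scaler only knows about one column, which is why a 2-column X_test is rejected.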

You have to define different scaler objects for X and y:

scaler_X = MinMaxScaler()
scaler_Y = MinMaxScaler()
X_train = scaler_X.fit_transform(X_train)
y_train = scaler_Y.fit_transform(y_train)
X_test = scaler_X.transform(X_test)
y_test = scaler_Y.transform(y_test)
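
A side benefit of keeping a separate scaler for y is that you can map predictions back to the original units later. A rough sketch, with made-up numbers and a hypothetical y_pred_scaled array standing in for whatever your model outputs:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up labels; the fitted range is 19.0 to 45.0.
y_train = np.array([19.0, 32.0, 45.0]).reshape(-1, 1)

scaler_Y = MinMaxScaler()
y_train_scaled = scaler_Y.fit_transform(y_train)

# Hypothetical model predictions, expressed in the scaled [0, 1] space.
y_pred_scaled = np.array([[0.0], [0.5], [1.0]])

# inverse_transform undoes the scaling using the min and max seen during fit.
y_pred = scaler_Y.inverse_transform(y_pred_scaled)
print(y_pred.ravel())
```

This only works if scaler_Y has not been refitted on anything else in the meantime.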

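Another way to do the same job is to keep a single scaler but change the order: fit it to X_train and transform X_test, then refit it to y_train and transform y_test. Note that after the refit the scaler only holds the y statistics, so it can no longer transform or inverse-transform X: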

scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
y_train = scaler.fit_transform(y_train)
y_test = scaler.transform(y_test)
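
If you want to convince yourself that the two approaches give the same scaled values, something like the following should do it. The arrays are tiny made-up examples, not the data from the question.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up data: 2 feature columns for X, 1 column for y.
X_train = np.array([[0.0, 10.0], [1.0, 20.0], [2.0, 30.0]])
X_test = np.array([[0.5, 15.0]])
y_train = np.array([[1.0], [3.0], [5.0]])
y_test = np.array([[2.0]])

# Approach 1: one scaler per array.
scaler_X = MinMaxScaler().fit(X_train)
scaler_Y = MinMaxScaler().fit(y_train)
X_test_a = scaler_X.transform(X_test)
y_test_a = scaler_Y.transform(y_test)

# Approach 2: one scaler, reused. X_test must be transformed
# before the scaler is refitted on y_train.
scaler = MinMaxScaler()
scaler.fit(X_train)
X_test_b = scaler.transform(X_test)
scaler.fit(y_train)
y_test_b = scaler.transform(y_test)

same = np.allclose(X_test_a, X_test_b) and np.allclose(y_test_a, y_test_b)
print(same)  # True
```

The results match, but only the two-scaler version lets you go back and scale more X data afterwards.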