I have a pickled KNN model that was trained on data scaled with StandardScaler:
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
When I load the model and pass new values through StandardScaler().transform(), it gives me an error:
sklearn.exceptions.NotFittedError: This StandardScaler instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.
I'm trying to load the values from a dictionary:
dic = {'a':1, 'b':32323, 'c':12}
sc = StandardScaler()
load = pickle.load(open('KNN.mod', 'rb'))
load.predict(sc.transform([[dic['a'], dic['b'], dic['c']]]))
As far as I understand from the error, I have to fit sc on the new data, but if I do that I get wrong predictions. I'm not sure whether I'm overfitting or something. Random forest and decision tree work fine on this data without sc; logistic regression is semi OK.
CodePudding user response:
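A new StandardScaler() has no stored mean or variance, which is why transform() raises NotFittedError. Fitting it on the new values does not help either, because it then learns different statistics from the ones used during training, so the model receives inputs on a different scale. You need to train and pickle the entire machine learning pipeline together, so the fitted scaler is saved along with the model. This can be done with the Pipeline tool from sklearn. In your case it will look like: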
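import pickle
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.neighbors import KNeighborsClassifier
# NearestNeighbors has no predict(); use KNeighborsClassifier (or KNeighborsRegressor for regression)
pipeline = Pipeline([('scaler', StandardScaler()), ('knn', KNeighborsClassifier())])
pipeline.fit(X_train, y_train)  # X_train must be the raw, unscaled data
# save the ml pipeline
with open('KNN_pipeline.pkl', 'wb') as f:
    pickle.dump(pipeline, f)
# load the ml pipeline and do prediction
with open('KNN_pipeline.pkl', 'rb') as f:
    pipeline = pickle.load(f)
pipeline.predict(X_test)  # pass raw values; the pipeline scales them itself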
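To check the round trip end to end, here is a minimal sketch. It assumes a classification task and uses synthetic data in place of your real training set; the data, the `restored` and `refit_on_row` names, and the in-memory pickle.dumps/pickle.loads are illustrative assumptions, not part of your code. It also shows one likely reason refitting on the new values gave wrong predictions: a scaler fitted on a single row maps that row to all zeros.

```python
import pickle

import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real training data (assumption): three features
# on very different scales, like 'a', 'b' and 'c' in the question.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=[1.0, 30000.0, 10.0], scale=[1.0, 5000.0, 3.0], size=(200, 3))
y_train = (X_train[:, 1] > 30000.0).astype(int)

# Fit the scaler and the model together on the raw, unscaled data.
pipeline = Pipeline([('scaler', StandardScaler()), ('knn', KNeighborsClassifier())])
pipeline.fit(X_train, y_train)

# Round-trip through pickle in memory; writing to a file works the same way.
restored = pickle.loads(pickle.dumps(pipeline))

# The fitted scaler statistics survive pickling.
scaler = restored.named_steps['scaler']
print(scaler.mean_, scaler.scale_)

# New raw values go straight into the pipeline, with no separate scaler.
dic = {'a': 1, 'b': 32323, 'c': 12}
row = [[dic['a'], dic['b'], dic['c']]]
prediction = restored.predict(row)
print(prediction)

# For contrast: a fresh scaler fitted on the single new row uses that row as
# its mean, so the row is transformed to all zeros regardless of its values.
refit_on_row = StandardScaler().fit(row).transform(row)
print(refit_on_row)
```

If the scaler and model have to stay separate for some reason, pickling the already fitted `sc` next to the model and calling `sc.transform()` on new data should have the same effect, but the pipeline keeps the two from drifting apart.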