I have some old sklearn models that I can't retrain. They were pickled a long time ago with unknown library versions. I can load them with Python 3.6 and NumPy 1.14, but when I move to Python 3.8 with NumPy 1.18, loading them segfaults.
I tried re-dumping them with protocol 4 from Python 3.6, but it didn't help.
Saving:
import pickle
with open('model.pkl', 'wb') as fid:
    pickle.dump(model, fid, protocol=4)
Loading:
with open('model.pkl', 'rb') as fid:
    model = pickle.load(fid)
Is there anything I can do in such a situation?
CodePudding user response:
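What worked for me (very task-specific, but it may help someone): in the old environment, extract the training data from the fitted model; in the new environment, refit a fresh model on that data.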
Old dependencies:
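import pickle
import joblib
# encoding="latin1" is needed if the pickle was written under Python 2
model = pickle.load(open('model.pkl', "rb"), encoding="latin1")
# get_arrays()[0] is the training data stored in the fitted tree
joblib.dump(model.tree_.get_arrays()[0], "training_data.pkl")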
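Alongside the training data, it may be worth saving the old model's hyperparameters instead of retyping them by hand. The sketch below is one possible way to do that with plain JSON, which does not depend on the pickle or NumPy version. The helper names `save_params` and `load_params` and the file name `kde_params.json` are made up for illustration; `get_params()` is the standard scikit-learn estimator method. It assumes every parameter value is JSON-serializable, which holds for the values used in this answer.

```python
import json


def save_params(params, path):
    """Old environment: write estimator hyperparameters as JSON."""
    with open(path, "w") as f:
        json.dump(params, f, indent=2, sort_keys=True)


def load_params(path):
    """New environment: read the hyperparameters back as a dict."""
    with open(path) as f:
        return json.load(f)


# Hypothetical usage, old environment:
#   save_params(model.get_params(), "kde_params.json")
# Hypothetical usage, new environment:
#   kde = KernelDensity(**load_params("kde_params.json")).fit(data)
```

If a parameter was renamed or removed between the two scikit-learn versions, the constructor call in the new environment will raise a `TypeError`, and the offending key has to be adjusted by hand.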
Newer dependencies:
import pickle
import joblib
from sklearn.neighbors import KernelDensity
data = joblib.load("training_data.pkl")
# Refit a fresh estimator on the recovered training data
kde = KernelDensity(
    algorithm="auto",
    atol=0,
    bandwidth=0.5,
    breadth_first=True,
    kernel="gaussian",
    leaf_size=40,
    metric="euclidean",
    metric_params=None,
    rtol=0,
).fit(data)
with open("new_model.pkl", "wb") as f:
    pickle.dump(kde, f)
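Since the new model is refit rather than migrated, it seems prudent to check that it scores points the same way as the old one. A minimal sketch, assuming you can still run the old environment: store a few reference scores there as JSON, then compare them in the new environment. The helper names, the file name `reference_scores.json`, and the tolerances are illustrative choices, not part of the original answer.

```python
import json
import math


def save_reference_scores(scores, path):
    """Old environment: store scores as plain JSON floats."""
    with open(path, "w") as f:
        json.dump([float(s) for s in scores], f)


def scores_match(new_scores, path, rel_tol=1e-6, abs_tol=1e-9):
    """New environment: compare fresh scores with the stored ones."""
    with open(path) as f:
        reference = json.load(f)
    new_scores = [float(s) for s in new_scores]
    if len(reference) != len(new_scores):
        return False
    return all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, b in zip(reference, new_scores)
    )


# Hypothetical usage, old environment (data = model.tree_.get_arrays()[0]):
#   save_reference_scores(model.score_samples(data[:100]),
#                         "reference_scores.json")
# Hypothetical usage, new environment:
#   assert scores_match(kde.score_samples(data[:100]),
#                       "reference_scores.json")
```

Small floating-point differences between library versions are expected, so the comparison uses a tolerance rather than exact equality.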