AttributeError: 'RandomForestClassifier' object has no attribute 'estimators_' w

Time:06-22

I am trying to store several estimators in a pandas DataFrame, and I keep running into this error:

AttributeError: 'RandomForestClassifier' object has no attribute 'estimators_'

Initially, I thought this was because pandas was trying to copy the estimator to several rows. However, I was able to replicate the error with the following code:

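import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Minimal reproduction: both values are passed as bare scalars
pd.DataFrame({
    "foo": "bar",
    "model": RandomForestClassifier()
})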

I also tried saving the estimator class in a dictionary and instantiating it in the DataFrame, as seen below:

d = {"rf": RandomForestClassifier}
pd.DataFrame({
    "foo": "bar",
    "model": d["rf"](random_state=100)
})

Yet I still get the same error. If there is a solution for a single entry, I should be able to scale that up. Does anyone have any ideas?

CodePudding user response:

The problem is that pandas tries to explode the values of your dictionary into values for multiple rows, and to do that it checks the len of each one. RandomForestClassifier defines a __len__ method that returns the number of fitted estimators (i.e. len(estimators_)). An unfitted estimator has no estimators_ attribute yet, so that len call raises the AttributeError you are seeing.
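To see this behaviour without pandas involved, here is a minimal sketch, assuming scikit-learn is installed. The tiny training data and n_estimators=5 are made up for illustration only.

```python
from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier(n_estimators=5, random_state=100)

# Before fitting, estimators_ does not exist, so len() fails.
try:
    len(clf)
    unfitted_error = None
except AttributeError as exc:
    unfitted_error = str(exc)

print(unfitted_error)

# After fitting, len() reports the number of trees in the forest.
clf.fit([[0], [1], [0], [1]], [0, 1, 0, 1])
fitted_len = len(clf)
print(fitted_len)  # 5
```

The error raised by the bare len() call is the same one that surfaces from the DataFrame constructor.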

In your one-row case, you can just wrap everything as singleton lists:

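import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Each value is a one-element list, so pandas takes len() of the list,
# not of the estimator inside it.
pd.DataFrame({
    "foo": ["bar"],
    "model": [RandomForestClassifier()],
})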
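Since the question asks about scaling this up, the same idea should extend to several estimators by passing one list entry per row. This is only a sketch, assuming pandas and scikit-learn are installed. The "name" column and the choice of ExtraTreesClassifier as a second model are illustrative, not from the question.

```python
import pandas as pd
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier

# Hypothetical registry of unfitted estimators, keyed by a short name.
models = {
    "rf": RandomForestClassifier(random_state=100),
    "et": ExtraTreesClassifier(random_state=100),
}

# Lists of equal length give one row per estimator; pandas only takes
# len() of the lists, never of the estimators themselves.
df = pd.DataFrame({
    "name": list(models.keys()),
    "model": list(models.values()),
})

# Each cell holds the estimator object itself.
first_model = df.loc[0, "model"]
```

The "model" column ends up with object dtype, so each estimator can be pulled out of its row and fitted later.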

CodePudding user response:

This is really bizarre; it has to have something to do with the way a pandas DataFrame is instantiated. As a workaround, I don't get the same error when using pd.Series instead, which you could then turn into a DataFrame:

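ser = pd.Series({
    "foo": "bar",
    "model": RandomForestClassifier(),
})
# Note: this gives a single column indexed by "foo" and "model";
# transpose with df.T if you need a one-row layout instead.
df = pd.DataFrame(ser)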