Get OOB score within a pipeline for Random Forest

Time:01-05

For a machine learning project, I was wondering: is it possible to use RandomForestRegressor inside a pipeline?

Specifically, I need to get the OOB score from a RandomForestRegressor, but my data requires a lot of preprocessing first.

I tried several things, and this is the closest so far:

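from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline

# Creation of the pipeline (preprocessor is defined elsewhere)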

rand_piped = Pipeline([
    ('preprocessor', preprocessor),
    ('model', RandomForestRegressor(max_depth=3, random_state=0, oob_score=True))
    ])

# Fitting our model

rand_piped.fit(df_X_train, df_Y_train.values.ravel())

# Getting our metrics and predictions 

oob_score = rand_piped.oob_score_

I suspect the problem is that I don't fully understand how this is supposed to work, so feel free to correct me. The last line raises this error:

Traceback (most recent call last):
  File "/home/user/my_rf.py", line 15, in <module>
    oob_score = rand_piped.oob_score_
AttributeError: 'Pipeline' object has no attribute 'oob_score_'

CodePudding user response:

oob_score_ is set on the fitted RandomForestRegressor, not on the Pipeline that wraps it. Pipelines are subscriptable, so you can look up oob_score_ on the model step:

>>> rand_piped["model"].oob_score_
0.9297212997034854
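
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The synthetic data and the StandardScaler standing in for the preprocessor are assumptions made for illustration only; they are not the asker's actual data or preprocessing. It shows three ways to reach the fitted forest inside the pipeline, all of which should give the same OOB score.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression data (illustrative assumption, not the asker's data)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] + X[:, 1] + rng.normal(scale=0.1, size=200)

# StandardScaler is a stand-in for the real preprocessor
pipe = Pipeline([
    ("preprocessor", StandardScaler()),
    ("model", RandomForestRegressor(max_depth=3, random_state=0, oob_score=True)),
])
pipe.fit(X, y)

# Three equivalent ways to reach the fitted forest step
score_by_key = pipe["model"].oob_score_               # subscript by step name
score_by_name = pipe.named_steps["model"].oob_score_  # named_steps lookup
score_by_pos = pipe[-1].oob_score_                    # last step by position

print(score_by_key)
```

Subscripting by step name is probably the most readable choice when the step names are fixed; pipe[-1] is handy when the final estimator's name may change.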