I'm trying to plot a decision tree but I get this error:
'Pipeline' object has no attribute 'tree_'
First I build my model from a preprocessor (the data has int and object columns):
preprocessor = ColumnTransformer([
('one-hot-encoder', categorical_preprocessor, categorical_columns),
('standard_scaler', numerical_preprocessor, numerical_columns)])
model3 = make_pipeline(preprocessor, DecisionTreeClassifier())
Then I fit the model and generate the predictions:
model3 = model3.fit(data_train, target_train)
y_pred3 = model3.predict(data_test)
After that I try to plot the tree:
tree.plot_tree(model3)
but I get the error:
AttributeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_22012/3111274197.py in <module>
----> 1 tree.plot_tree(model3)
~\anaconda3\lib\site-packages\sklearn\tree\_export.py in plot_tree(decision_tree, max_depth, feature_names, class_names, label, filled, impurity, node_ids, proportion, rounded, precision, ax, fontsize)
193 fontsize=fontsize,
194 )
--> 195 return exporter.export(decision_tree, ax=ax)
196
197
~\anaconda3\lib\site-packages\sklearn\tree\_export.py in export(self, decision_tree, ax)
654 ax.clear()
655 ax.set_axis_off()
--> 656 my_tree = self._make_tree(0, decision_tree.tree_, decision_tree.criterion)
657 draw_tree = buchheim(my_tree)
658
AttributeError: 'Pipeline' object has no attribute 'tree_'
How can I plot my tree? Or is this impossible because I use a pipeline?
Answer:
As stated in the comments, plot_tree expects a fitted tree estimator, not the Pipeline that wraps it. You need to access the DecisionTreeClassifier instance inside your pipeline, which you can do as follows:
tree.plot_tree(model3.named_steps['decisiontreeclassifier'])
named_steps is a property of Pipeline that gives access to the pipeline's steps by name, and 'decisiontreeclassifier' is the step name that make_pipeline generates automatically. From the make_pipeline documentation:
This is a shorthand for the Pipeline constructor; it does not require, and does not permit, naming the estimators. Instead, their names will be set to the lowercase of their types automatically.