I am building a random forest model, and the goal is to find out which features are most important. I have a bar graph showing the feature importances from Random Forest's built-in feature importance. Is there a way to filter out the features whose bars are small relative to the larger ones, and how would I implement this? I want to do this because the plot below is too cluttered:
Input code:
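import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestRegressor
# Fit a shallow random forest (X_train is a DataFrame, y_train the target)
rf = RandomForestRegressor(n_estimators=100, max_depth=3)
rf.fit(X_train, y_train)
# Sort ascending so the most important feature is drawn at the top of the chart
sorted_idx = rf.feature_importances_.argsort()
plt.figure(figsize=(8, 30))
plt.barh(X_train.columns[sorted_idx], rf.feature_importances_[sorted_idx])
plt.xlabel("Random Forest Feature Importance")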
CodePudding user response:
You can do this by filtering the sorted_idx variable:
sorted_idx = rf.feature_importances_.argsort()[-5:]  # argsort is ascending, so the last 5 indices are the 5 largest importances
Replace 5 with however many features you want to plot.
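If you would rather drop bars that are small relative to the largest one instead of fixing a count, something like the following might work. It is only a sketch: select_important, top_n and min_ratio are hypothetical names I made up, and the demo importances are invented numbers, not output from your model.

```python
import numpy as np

def select_important(importances, top_n=None, min_ratio=None):
    """Return indices of the features to keep, in ascending order of importance.

    top_n and min_ratio are hypothetical parameters for this sketch:
    min_ratio keeps features whose importance is at least min_ratio times
    the largest importance; top_n keeps at most the top_n largest.
    """
    importances = np.asarray(importances, dtype=float)
    idx = np.argsort(importances)  # ascending, same order as in the question
    if min_ratio is not None:
        cutoff = min_ratio * importances.max()
        idx = idx[importances[idx] >= cutoff]
    if top_n is not None:
        # Slice from the end because the largest values come last
        idx = idx[max(len(idx) - top_n, 0):]
    return idx

# Invented importances, for illustration only
demo = np.array([0.02, 0.40, 0.01, 0.30, 0.27])
kept = select_important(demo, min_ratio=0.5)
```

With your model you would pass rf.feature_importances_ instead of demo and then plot only the kept indices, e.g. plt.barh(X_train.columns[kept], rf.feature_importances_[kept]). You will probably also want a smaller figsize once there are fewer bars.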