Why does the plot of test vs. predicted y for my regression model look like this?

Time:11-22

I am working on a regression model (Decision Tree) on multidimensional data with 16 features. The model's r2_score is 0.97, but the plot of y test against y predict looks completely wrong, and the x ranges do not match.

[Image: plot of y test and y predict]

Could you please tell me what the problem is?

I have also tried fitting the model on a single feature to check the x range in the diagram, but that obviously just lowers the score, and the diagram still looks odd.

CodePudding user response:

Matplotlib's plot function draws a single line by connecting the points in the order in which they are passed to it. You are seeing a mess because the points are not sorted along the x-axis.

In a regression model, you have a function f(x) -> R, where f is your decision tree and x lies in a 16-dimensional space. A 16-dimensional x has no natural ordering, so you cannot lay it out along the x-axis.

Instead, you can plot the ground truth and the predicted value for each sample index as a scatter plot:
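To see the effect in isolation, here is a minimal sketch on made-up one-dimensional data (the arrays x and y below are invented for illustration and are not from your model). The left panel connects the points in array order, while the right panel sorts them by x first with np.argsort:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical 1D data, invented purely for illustration.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)   # x values in random (unsorted) order
y = np.sin(x)

order = np.argsort(x)             # indices that sort x in ascending order

fig, (ax_raw, ax_sorted) = plt.subplots(1, 2, figsize=(12, 4))
ax_raw.plot(x, y)                     # points joined in array order: zigzag mess
ax_raw.set_title("unsorted x")
ax_sorted.plot(x[order], y[order])    # points joined left to right: clean curve
ax_sorted.set_title("x sorted with np.argsort")
```

Call plt.show() afterwards to display the figure. Sorting like this only helps when there is a single x to sort by, which is why it does not carry over to the 16-feature case.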

import numpy as np
import matplotlib.pyplot as plt

# Here, I'm assuming y_test holds the ground truth and y_DT_5 the tree's predictions,
# and that each is either a 1D array or a column vector.
# If necessary, change the argument of np.arange accordingly to get the number of values
idxs = np.arange(len(y_DT_5))

plt.figure(figsize=(16,4))
plt.scatter(x=idxs, y=y_test, marker='x', label='y_test')  # Plot each ground truth value as a separate point
plt.scatter(x=idxs, y=y_DT_5, marker='.', label='y_DT_5')  # Plot each predicted value as a separate point
plt.legend()
plt.show()

If your model works, the two points plotted at each index should be close to each other along the y-axis.
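Another common way to check the same thing is to plot predicted against true values directly, with a y = x reference line. This is only a sketch: the y_test and y_pred arrays below are randomly generated stand-ins (an assumption for illustration), so replace them with your own y_test and y_DT_5.

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in arrays, invented for illustration only.
# Replace them with your real ground truth and predictions.
rng = np.random.default_rng(0)
y_test = rng.uniform(0, 100, size=200)
y_pred = y_test + rng.normal(0, 5, size=200)

fig, ax = plt.subplots(figsize=(5, 5))
ax.scatter(y_test, y_pred, marker='.')  # one point per test sample

# Reference line y = x: a perfect model would put every point on it.
lo = min(y_test.min(), y_pred.min())
hi = max(y_test.max(), y_pred.max())
ax.plot([lo, hi], [lo, hi], color='k', linestyle='--')

ax.set_xlabel("true value")
ax.set_ylabel("predicted value")
```

Call plt.show() to display it. With a good fit, the points should hug the dashed diagonal, and this view does not depend on the order of the samples or on the number of features.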
