Contour plot for multi linear regression model

Time:11-08

I have to obtain contour plots to get a range of optimum values using the following variables:

X axis = SiO2/Al2O3
Y axis = Precursor/Aggregate
Z axis = Compressive Strength

My code is the following:

import numpy as np
import matplotlib as mlt
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

dataset = pd.read_csv('Data.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

regressor = LinearRegression()
regressor.fit(X_train, y_train)
y_predict = regressor.predict(X_test)

feature_x = X_test[:, 1]
feature_y = X_test[:, 3]

[X, Y] = np.meshgrid(feature_x, feature_y)  
Z = y_predict    

fig, ax = plt.subplots()
ax.contourf(X, Y, Z)
ax.set_title('Filled Contour Plot')
ax.set_xlabel('SiO2/Al2O3')
ax.set_ylabel('Precursor/Aggregate')
plt.show()

But it raises this error:

TypeError: Input z must be 2D, not 1D

I think I'm making a mistake in the Z axis input.

The data is available here: [image of the data, not reproduced]

Answer:

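Your code fails because contourf needs Z as a 2D array with one value per grid node, while y_predict holds only one value per test row. You need to build a grid over the predictor values and predict on that grid. First, read in your data and fit the model: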

dataset = pd.read_csv('Data.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

regressor = LinearRegression()
regressor.fit(X_train, y_train)

Then you need to create a grid for the features you are interested in:

feature_x = np.linspace(X_test[:, 1].min(),X_test[:, 1].max(),100)
feature_y = np.linspace(X_test[:, 3].min(),X_test[:, 3].max(),100)

Meshgrid:

dim1, dim2 = np.meshgrid(feature_x, feature_y)
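As a quick check of why the original attempt failed, the sketch below uses made-up feature values (not the real data) to show the shapes involved. meshgrid returns 2D arrays, so Z needs one value per grid node in the same 2D shape, which a flat vector of predictions only gets after a reshape.

```python
import numpy as np

# Hypothetical stand-ins for two feature ranges (not the real Data.csv values)
feature_x = np.linspace(1.0, 3.0, 5)   # 5 points along the x axis
feature_y = np.linspace(0.2, 0.8, 4)   # 4 points along the y axis

dim1, dim2 = np.meshgrid(feature_x, feature_y)

# Both grids have shape (len(feature_y), len(feature_x))
print(dim1.shape, dim2.shape)

# contourf needs one Z value per grid node
n_nodes = dim1.size
print(n_nodes)

# A flat vector with one entry per node becomes 2D again via reshape
flat = np.arange(n_nodes, dtype=float)
Z = flat.reshape(dim1.shape)
print(Z.shape)
```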

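Your model has 6 other predictors whose values you also need to supply in order to predict. One way is to hold these other variables at their mean and then slot the mesh grid into the two columns being plotted:

# Start every grid row at the column means, then overwrite the two plotted columns
mesh_df = np.tile(X_test.mean(axis=0), (dim1.size, 1))
mesh_df[:,1] = dim1.ravel()
mesh_df[:,3] = dim2.ravel()  # column 3, the same column feature_y was built from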
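To make the "hold the rest at their mean" step concrete, here is a small sketch on an invented 4-column array. The values and the names `X_demo`, `col_x` and `col_y` are assumptions for illustration only. Every row of the grid matrix starts as the column means, and only the two plotted columns are overwritten, each with the grid built from that same column.

```python
import numpy as np

# Invented data: 3 rows, 4 features (not the real dataset)
X_demo = np.array([
    [1.0, 2.0, 10.0, 0.2],
    [3.0, 4.0, 20.0, 0.4],
    [5.0, 6.0, 30.0, 0.6],
])

col_x, col_y = 1, 3  # the two columns to plot

gx = np.linspace(X_demo[:, col_x].min(), X_demo[:, col_x].max(), 3)
gy = np.linspace(X_demo[:, col_y].min(), X_demo[:, col_y].max(), 2)
dim1, dim2 = np.meshgrid(gx, gy)

# One row per grid node, every column initially at its mean
grid = np.tile(X_demo.mean(axis=0), (dim1.size, 1))

# Overwrite only the plotted columns with the grid coordinates
grid[:, col_x] = dim1.ravel()
grid[:, col_y] = dim2.ravel()

print(grid)
```

Columns 0 and 2 stay constant at their means, while columns 1 and 3 sweep through every combination of grid values.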

Now predict, reshape, and plot:

Z = regressor.predict(mesh_df).reshape(dim1.shape)  

fig, ax = plt.subplots()

ax.contourf(dim1, dim2, Z)  
ax.set_title('Filled Contour Plot')
ax.set_xlabel('SiO2/Al2O3')
ax.set_ylabel('Precursor/Aggregate')
plt.show()
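If you want to repeat this for other pairs of features, the steps can be wrapped in one helper. This is only a sketch: the function name `grid_predictions` and the demo data are hypothetical, and the demo uses a hand-written linear predictor so that it runs without Data.csv.

```python
import numpy as np

def grid_predictions(predict, X, col_x, col_y, n=100):
    """Return (dim1, dim2, Z) for a contour plot over two columns of X.

    predict is any callable that maps an (m, n_features) array to m
    predictions, such as the predict method of a fitted regressor.
    All columns other than col_x and col_y are held at their mean.
    """
    gx = np.linspace(X[:, col_x].min(), X[:, col_x].max(), n)
    gy = np.linspace(X[:, col_y].min(), X[:, col_y].max(), n)
    dim1, dim2 = np.meshgrid(gx, gy)

    grid = np.tile(X.mean(axis=0), (dim1.size, 1))
    grid[:, col_x] = dim1.ravel()
    grid[:, col_y] = dim2.ravel()

    Z = np.asarray(predict(grid)).reshape(dim1.shape)
    return dim1, dim2, Z

# Demo on synthetic data with a hypothetical linear predictor
rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 1.0, size=(50, 5))
weights = np.array([0.5, 2.0, -1.0, 3.0, 0.1])

def demo_predict(A):
    return A @ weights + 4.0

dim1, dim2, Z = grid_predictions(demo_predict, X_demo, col_x=1, col_y=3, n=25)
print(Z.shape)
```

With the fitted model from this answer, the call would be grid_predictions(regressor.predict, X_test, 1, 3), and the three returned arrays can be passed straight to ax.contourf.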

The plot shows evenly spaced, parallel bands because you are using a linear regression: the predicted value increases or decreases linearly with each variable.

[Image: the resulting filled contour plot]
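Since the goal was a range of optimum values, it is worth spelling out what a linear fit implies. Over two features with everything else fixed, the prediction is a plane, so the largest predicted value always sits at a corner of the plotted rectangle. The sketch below illustrates this with invented coefficients (`b0`, `b_x` and `b_y` are assumptions, not values fitted to the real data).

```python
import numpy as np

# Hypothetical intercept and slopes for the two plotted features
b0, b_x, b_y = 10.0, 4.0, -6.0

gx = np.linspace(1.0, 3.0, 50)
gy = np.linspace(0.2, 0.8, 50)
dim1, dim2 = np.meshgrid(gx, gy)

# Linear model with all other predictors folded into the intercept
Z = b0 + b_x * dim1 + b_y * dim2

# Locate the grid node with the largest predicted value
row, col = np.unravel_index(np.argmax(Z), Z.shape)
print(row, col)
print(dim1[row, col], dim2[row, col])
```

With a positive slope in x and a negative slope in y, the maximum lands at the largest x and the smallest y. A plot like this therefore shows the direction in which the prediction rises rather than an interior optimum.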
