ValueError: Expected a 2D array, but received a 1D array instead:

Time:08-31

I got this error while practicing with a simple linear regression model; I assume there is an issue with my data set. Here is the error:

ValueError: Expected 2D array, got 1D array instead: array=[1140. 1635. 1755. 1354. 1978. 1696. 1212. 2736. 1055. 2839. 2325. 1688. 2733. 2332. 2159. 2133.].

Here is my Dataset

Here is the code:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from sklearn import linear_model
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
df = pd.read_csv('C:/Users/AgroMech/Desktop/ASDS/data.csv')
df.shape
print(df.duplicated())
df.isnull().any()
df.isnull().sum()
df.dropna(inplace = True)
x=df["Area"]
y=df["Price"]
df.describe()
reg = linear_model.LinearRegression()
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=4)
x_train.head()
reg=LinearRegression() 
reg.fit(x_train,y_train)
LinearRegression(copy_x=True, fit_intercept=True, n_jobs=1, normalize=False)
reg.coef_
reg.predict(x_test)
np.mean((reg.predict(x_test) - y_test)**2)

CodePudding user response:

The easiest way to turn your x variable from 1D into 2D is to select the column with double brackets, which returns a DataFrame instead of a Series:

x = df[["Area"]]
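To see why the double brackets help, here is a minimal sketch comparing the two selections. The small DataFrame is made up for illustration, since the data.csv from the question is not available; only the column names are taken from the question.

```python
import pandas as pd

# Made-up stand-in for data.csv (the real file is not shown in the question).
df = pd.DataFrame({
    "Area": [1140.0, 1635.0, 1755.0, 1354.0],
    "Price": [100.0, 150.0, 160.0, 120.0],
})

series_x = df["Area"]    # single brackets: pandas Series, 1D
frame_x = df[["Area"]]   # double brackets: pandas DataFrame, 2D

print(series_x.shape)  # (4,)
print(frame_x.shape)   # (4, 1)
```

The second shape, (n_samples, 1), is the layout reg.fit() expects for a single feature.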

CodePudding user response:

As the error suggests when executing reg.fit(x_train, y_train):

ValueError: Expected 2D array, got 1D array instead:
array=[1140. 1635. 1755. 1354. 1978. 1696. 1212. 2736. 1055. 2839. 2325. 1688.
 2733. 2332. 2159. 2133.].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

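This means x does not have the shape reg.fit() expects: scikit-learn wants the features as a 2D array of shape (n_samples, n_features), even when there is only one feature. You can reshape the arrays explicitly (only x strictly needs it; y may stay 1D):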

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=4)

x_train = x_train.values.reshape(-1,1)
x_test = x_test.values.reshape(-1,1)
y_train = y_train.values.reshape(-1,1)
y_test = y_test.values.reshape(-1,1)
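For reference, the two reshape calls named in the error message differ as follows. This is a small NumPy-only sketch; the values are the first four from the array in the error message.

```python
import numpy as np

# First four values from the array printed in the error message.
a = np.array([1140.0, 1635.0, 1755.0, 1354.0])

col = a.reshape(-1, 1)  # one feature, many samples: a column
row = a.reshape(1, -1)  # one sample, many features: a row

print(a.shape)    # (4,)
print(col.shape)  # (4, 1)
print(row.shape)  # (1, 4)
```

For a single "Area" feature with one row per house, the column form reshape(-1, 1) is the one that applies.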

or you can reshape your original x and y values:

x = df[['Area']]
y = df[['Price']]

Also note that the LinearRegression parameter is spelled copy_X (capital X), not copy_x.
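Putting the pieces together, a minimal end-to-end sketch might look like this. The data is synthetic (Price is assumed to be exactly 0.1 * Area + 50, purely for illustration), because the original data.csv is not available; the split settings are the ones from the question.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: the areas come from the error message, the prices are made up.
area = np.array([1140.0, 1635.0, 1755.0, 1354.0, 1978.0,
                 1696.0, 1212.0, 2736.0, 1055.0, 2839.0])
df = pd.DataFrame({"Area": area, "Price": 0.1 * area + 50.0})

x = df[["Area"]]  # 2D feature matrix, shape (n_samples, 1)
y = df["Price"]   # a 1D target is accepted by fit()

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=4
)

# Note the capital X in copy_X.
reg = LinearRegression(copy_X=True, fit_intercept=True)
reg.fit(x_train, y_train)

pred = reg.predict(x_test)            # 1D array, one prediction per test row
mse = np.mean((pred - y_test) ** 2)   # mean squared error on the test split
print(reg.coef_, reg.intercept_, mse)
```

Keeping y as a Series means predict() returns a 1D array as well, so the error computation lines up element by element.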
