I want to visualise the data with a pairplot after applying StandardScaler, but my code produces the following error:
raise TypeError(msg)
TypeError: cannot concatenate object of type '<class 'numpy.ndarray'>'; only Series and DataFrame objs are valid
Full code:
from matplotlib import pyplot as plt
from sklearn.model_selection import train_test_split
import seaborn as sns
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
iris = sns.load_dataset('iris')
X = iris.drop(columns='species')
y = iris['species']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)
X_train=StandardScaler().fit_transform(X_train)
sns.pairplot(data=pd.concat([X_train, y_train], axis=1), hue=y_train.name)
plt.show()
CodePudding user response:
After StandardScaler's fit_transform, your X_train (which was a pd.DataFrame before) has become a numpy.ndarray. That is why you cannot concat X_train and y_train: X_train is now a NumPy array, while y_train is still a pandas Series.
pd.concat only accepts Series and DataFrame objects, so convert X_train back to a DataFrame before concatenating, using this code:
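X_train = StandardScaler().fit_transform(X_train)  # returns a numpy.ndarray
# Rebuild the DataFrame. Reuse y_train's index (the row labels X_train had
# before scaling); otherwise pd.concat aligns on a fresh 0..n-1 index and
# fills the mismatched rows with NaN.
X_train = pd.DataFrame(X_train, columns=X.columns, index=y_train.index)
sns.pairplot(data=pd.concat([X_train, y_train], axis=1), hue=y_train.name)
plt.show()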
It will work.
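One detail worth checking when you rebuild the DataFrame is the index: pd.concat with axis=1 aligns rows by index label, and train_test_split returns shuffled labels. The sketch below is a minimal illustration of that, assuming a tiny hand-made frame (the values and index labels are made up, not taken from iris) so it runs without seaborn or a download.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for a shuffled training split; the non-sequential
# index mimics the row labels that train_test_split leaves on X_train/y_train.
X_train = pd.DataFrame(
    {"sepal_length": [5.1, 6.2, 4.7, 7.0], "sepal_width": [3.5, 2.9, 3.2, 3.2]},
    index=[22, 5, 17, 9],
)
y_train = pd.Series(
    ["setosa", "versicolor", "setosa", "virginica"],
    index=X_train.index,
    name="species",
)

scaled = StandardScaler().fit_transform(X_train)  # a numpy.ndarray

# Without index=, the rebuilt frame is labelled 0..3, which shares no labels
# with y_train, so concat produces the union of both indexes padded with NaN.
misaligned = pd.concat(
    [pd.DataFrame(scaled, columns=X_train.columns), y_train], axis=1
)

# Reusing the original index keeps every scaled row next to its own label.
X_scaled = pd.DataFrame(scaled, columns=X_train.columns, index=X_train.index)
aligned = pd.concat([X_scaled, y_train], axis=1)

print(misaligned.shape)  # (8, 3)
print(aligned.shape)     # (4, 3)
```

If you are on scikit-learn 1.2 or later, StandardScaler().set_output(transform="pandas") should return a DataFrame that keeps the input's columns and index, which avoids the manual rebuild.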