I'm working on a classification problem with the iris dataset. I can create a pairplot on the raw dataset, which looks like this when hue='species'.
But how can I use hue after splitting the dataset into X_train and y_train, given that the species class column has been separated from the features?
X = DATA.drop(['class'], axis='columns')
y = DATA['class'].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)

# cpystadscl, wthmenstadscl and withstdscl are set elsewhere (not shown here)
gbl_pl = []
gbl_pl.append(('standard_scaler_gb',
               StandardScaler(copy=cpystadscl, with_mean=wthmenstadscl, with_std=withstdscl)))
gblpq = Pipeline(gbl_pl)
scaled_df = gblpq.fit_transform(X_train, y_train)
sns.pairplot(data=scaled_df)
plt.show()
Output: (image not included)
Expectation: something like the first plot, but built from the split dataset, excluding the test data.
CodePudding user response:
You could concatenate y_train as a column to X_train and pass that column's name as hue.
from matplotlib import pyplot as plt
from sklearn.model_selection import train_test_split
import seaborn as sns
import pandas as pd
import numpy as np
iris = sns.load_dataset('iris')
X = iris.drop(columns='species')
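y = iris['species']  # keep y as a Series (not .values) so it retains its name and index
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)
# pd.concat aligns X_train and y_train on their shared index; y_train.name is 'species'
sns.pairplot(data=pd.concat([X_train, y_train], axis=1), hue=y_train.name)
plt.show()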