How to plot a pairplot with hue after splitting the dataset

Time:10-18

I'm working on a classification problem with the iris dataset. I can create a pairplot of the raw dataset, which looks like this with hue='species':

[Image: pairplot of the raw iris dataset, colored by species]

But how can I use hue after splitting the dataset into X_train and y_train, since the species class is then separated from the features?

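from matplotlib import pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
import seaborn as sns

# DATA is the iris DataFrame; its label column is named 'class'
X = DATA.drop(['class'], axis='columns')
y = DATA['class'].values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)

# cpystadscl, wthmenstadscl and withstdscl are settings not shown in the question
gbl_pl = []
gbl_pl.append(('standard_scaler_gb',
               StandardScaler(copy=cpystadscl, with_mean=wthmenstadscl, with_std=withstdscl)))

gblpq = Pipeline(gbl_pl)

scaled_df = gblpq.fit_transform(X_train, y_train)

# No hue here: the labels are in y_train, not in scaled_df
sns.pairplot(data=scaled_df)
plt.show()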

Output:

[Image: pairplot of the scaled training data, without hue]

Expectation (something like this, using the split dataset and excluding the test data):

[Image: pairplot colored by species]

CodePudding user response:

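You could concatenate y_train as a column to X_train. When X and y are pandas objects, both keep their original row index after train_test_split, so pd.concat with axis=1 lines each row up with its label.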

from matplotlib import pyplot as plt
from sklearn.model_selection import train_test_split
import seaborn as sns
import pandas as pd

iris = sns.load_dataset('iris')
X = iris.drop(columns='species')
y = iris['species']  # kept as a Series so it carries a name and an index

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)

# Join features and labels column-wise, then color by the label column
sns.pairplot(data=pd.concat([X_train, y_train], axis=1), hue=y_train.name)
plt.show()

[Image: sns.pairplot of concatenated X_train and y_train]
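
The question's setup differs from the snippet above in two ways: y is taken with `.values`, so y_train is a NumPy array with no index or name, and the scaler pipeline returns a NumPy array, so the column names are lost. A minimal sketch of one way to handle both is given here. The helper name `scaled_train_frame` and its `label_name` parameter are hypothetical, the scaler uses default settings rather than the question's variables, and the data comes from scikit-learn's bundled copy of iris (whose label column is called "target") instead of the asker's DATA.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def scaled_train_frame(X_train, y_train, label_name="class"):
    """Scale X_train and return a DataFrame with the labels re-attached.

    Hypothetical helper for illustration; not part of any library.
    """
    pipeline = Pipeline([("standard_scaler_gb", StandardScaler())])
    scaled = pipeline.fit_transform(X_train)

    # Rebuild a DataFrame with the original column names and row index.
    scaled_df = pd.DataFrame(scaled, columns=X_train.columns, index=X_train.index)

    # np.asarray assigns the labels by position, so this works whether
    # y_train is a NumPy array (as with `.values`) or a pandas Series.
    scaled_df[label_name] = np.asarray(y_train)
    return scaled_df


if __name__ == "__main__":
    import matplotlib.pyplot as plt
    import seaborn as sns

    # Assumption: scikit-learn's bundled iris stands in for the asker's DATA.
    data = load_iris(as_frame=True).frame
    X = data.drop(columns="target")
    y = data["target"].values

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.20, random_state=42
    )

    train_df = scaled_train_frame(X_train, y_train, label_name="class")
    sns.pairplot(data=train_df, hue="class")
    plt.show()
```

The labels are attached by position rather than by index, which is safe here because train_test_split returns X_train and y_train in the same row order. In scikit-learn 1.2 or later, calling `set_output(transform="pandas")` on the pipeline is another way to get a DataFrame back from `fit_transform`.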
