Visualizing train and test data from train_test_split with seaborn

Time:12-22

I have a dataset with 9583 rows, which I split with train_test_split. I want to visualize my train and test data using a bar plot like this example:

[Image: example barplot]

import pandas as pd

df = pd.read_excel("Data/data_clean_spacy_for_implementation.xlsx")

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, stratify=df["label"], random_state=42)

from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer = TfidfVectorizer()

X_train = vectorizer.fit_transform(X_train)
X_test = vectorizer.transform(X_test)
X_array = X_train.toarray()

print(X_train.shape)  # output: (7666, 12222)
print(X_test.shape)   # output: (1917, 12222)

How can I do that?

[Image: my data, train_test_label_count]
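
One possible approach, sketched under assumptions: the example image is not available here, so this assumes the goal is a grouped bar plot of label counts per split, as the name train_test_label_count suggests. The helper names `label_counts` and `plot_label_counts` are hypothetical, and the small `y_train` and `y_test` series are made-up stand-ins for the ones returned by train_test_split in the question.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt


def label_counts(y_train, y_test):
    """Return a long-format table with one row per (split, label) pair."""
    frames = []
    for split_name, labels in (("train", y_train), ("test", y_test)):
        # Count how often each label occurs in this split.
        counts = pd.Series(labels).value_counts().sort_index()
        frames.append(pd.DataFrame({
            "split": split_name,
            "label": counts.index,
            "count": counts.values,
        }))
    return pd.concat(frames, ignore_index=True)


def plot_label_counts(counts):
    """Draw a grouped bar plot: one group per split, one bar per label."""
    ax = sns.barplot(data=counts, x="split", y="count", hue="label")
    ax.set_xlabel("Data split")
    ax.set_ylabel("Number of rows")
    ax.set_title("Label counts in train and test data")
    return ax


# Made-up stand-in labels (assumption); in the question these would be
# the y_train and y_test returned by train_test_split.
y_train = pd.Series(["pos"] * 6 + ["neg"] * 2)
y_test = pd.Series(["pos"] * 3 + ["neg"] * 1)

counts = label_counts(y_train, y_test)
ax = plot_label_counts(counts)
plt.show()
```

The labels are counted from y_train and y_test rather than from X_train and X_test, because after TfidfVectorizer the X matrices hold only TF-IDF features. Since the split uses stratify=df["label"], the label proportions in the two groups of bars should look nearly identical, with the test bars roughly a quarter the height of the train bars.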
