I am running the following code and get the error shown below. Please see my data in this link.
The code:
import pandas as pd
df=pd.read_csv('Ecommerce Customers')
df.head()
X=[['Avg. Session Length', 'Time on App','Time on Website']]
y=['Yearly Amount Spent']
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split( X, y, test_size=0.25, random_state=101)
output:
ValueError: With n_samples=1, test_size=0.25 and train_size=None, the resulting train set will be empty. Adjust any of the aforementioned parameters.
I've read on the forum that I should extend the data in this case, but I have no idea how to do it.
CodePudding user response:
You need to index into df when creating X and y:
X=df[['Avg. Session Length', 'Time on App','Time on Website']]
y=df['Yearly Amount Spent']
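This way X is a DataFrame and y is a Series. Otherwise they are just lists containing column names: X is a list with a single element (the inner list of names), which is why train_test_split reports n_samples=1.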
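To see where n_samples=1 comes from, here is a small check in plain Python, using the same X and y as in the question (no DataFrame involved):

```python
# Without df, X is a plain Python list that holds one inner list of names.
X = [['Avg. Session Length', 'Time on App', 'Time on Website']]
y = ['Yearly Amount Spent']

# The outer length is what gets counted as the number of samples.
print(len(X))     # 1: a single "row" made of three strings
print(len(X[0]))  # 3: the three column names inside that row
print(len(y))     # 1
```

With only one sample, a 25% test split leaves nothing for the train set, hence the ValueError.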
CodePudding user response:
To fix this error, you should select the columns you want as features from the dataframe, and then assign them to X.
For example:
X = df[['Avg. Session Length', 'Time on App','Time on Website']]
y = df['Yearly Amount Spent']
This should give you the correct format for the inputs to train_test_split.
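A minimal end-to-end sketch, assuming a small made-up DataFrame in place of the 'Ecommerce Customers' file (the eight rows of values are invented purely for illustration, only the column names come from the question):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the real CSV; the numbers are made up.
df = pd.DataFrame({
    'Avg. Session Length': [34.5, 31.9, 33.0, 34.3, 33.3, 33.9, 32.7, 33.5],
    'Time on App': [12.7, 11.1, 11.3, 13.7, 12.8, 12.0, 11.8, 12.4],
    'Time on Website': [39.6, 37.3, 37.1, 36.7, 37.5, 34.5, 36.9, 38.2],
    'Yearly Amount Spent': [587.9, 392.2, 487.5, 581.9, 599.4, 637.1, 521.6, 549.9],
})

# Select the columns from df so X is a DataFrame and y is a Series.
X = df[['Avg. Session Length', 'Time on App', 'Time on Website']]
y = df['Yearly Amount Spent']

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=101
)

# 8 rows with test_size=0.25 gives 6 training rows and 2 test rows.
print(X_train.shape, X_test.shape)  # (6, 3) (2, 3)
```

With the real file, replace the hand-built DataFrame with the pd.read_csv call from the question.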