I tried searching for the answer, but I don't understand why it is expecting 4 values, so I can't find the solution. I am trying to write a script that iterates over several models and then plots their performance on the iris dataset for each of the feature scalings I applied. Currently, this is the segment of the code where I am getting the error.
Code:
models = {
    "Logistic Regression": LogisticRegression(),
    "Decision Tree": DecisionTreeClassifier(max_leaf_nodes=3),
    "Random Forest": RandomForestClassifier(max_depth=3),
    "svm_model": SVC(kernel='linear')
}

def evaluate_model(model, dataset):
    x_train, x_test, y_train, y_test = data
    model.fit(x_train, y_train)
    pred = model.predict(x_test)
    return accuracy_score(pred, y_test)

for model_name, model in models.items():
    model_score = evaluate_model(model, dataset)
    #dataset_scores[model_name] = model_score
# model_scores_for_datasets[dataset_name] = dataset_scores
Error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-531-35544d14a669> in <module>
7
8 for model_name, model in models.items():
----> 9 model_score = evaluate_model(model, dataset)
10 #dataset_scores[model_name] = model_score
11
<ipython-input-521-f26300591060> in evaluate_model(model, dataset)
13
14 def evaluate_model(model, dataset):
---> 15 x_train, x_test, y_train, y_test = data
16 model.fit(x_train, y_train)
17 pred = model.predict(x_test)
ValueError: not enough values to unpack (expected 4, got 1)
CodePudding user response:
From your question, it is not clear what the type or shape of your data variable is. Did you split your data so that you actually get train and test splits back?
For example, let's say you have some paired data:
x = np.arange(1, 25).reshape(12, 2)
y = np.array([0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0])
You can split it into train and test sets with train_test_split from sklearn:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y)
But your data variable holds only one item, so Python cannot unpack it into the four variables x_train, x_test, y_train and y_test. You can reproduce the same error with:
x_train, x_test, y_train, y_test = [1]
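To see the unpacking rule in isolation, here is a minimal sketch in plain Python. The helper name try_unpack is hypothetical and only serves the illustration; it returns the error message instead of raising, so both cases can be compared side by side.

```python
def try_unpack(data):
    """Return the four unpacked values, or the ValueError message on failure."""
    try:
        # Unpacking needs exactly four items on the right-hand side.
        x_train, x_test, y_train, y_test = data
    except ValueError as exc:
        return str(exc)
    return x_train, x_test, y_train, y_test


bad = try_unpack([1])                            # only one item: fails
good = try_unpack(([1, 2], [3], [0, 1], [1]))    # four items: works

print(bad)   # not enough values to unpack (expected 4, got 1)
print(good)
```

Any iterable with exactly four items unpacks cleanly; anything shorter raises the error from the question.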
Also, your evaluate_model function never uses its dataset argument; it unpacks a global variable named data instead. Replace that line so the function works as intended:
def evaluate_model(model, dataset):
    x_train, x_test, y_train, y_test = dataset
    model.fit(x_train, y_train)
    pred = model.predict(x_test)
    return accuracy_score(pred, y_test)
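Putting the pieces together, the loop from the question could look roughly like the sketch below. It assumes that dataset is simply the 4-item result of train_test_split on the raw iris data; the test_size, random_state and max_iter values are illustrative choices, not taken from the question.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def evaluate_model(model, dataset):
    # dataset must hold exactly four items, in this order
    x_train, x_test, y_train, y_test = dataset
    model.fit(x_train, y_train)
    pred = model.predict(x_test)
    return accuracy_score(y_test, pred)


X, y = load_iris(return_X_y=True)
# train_test_split returns a list of four arrays, which is what gets unpacked
dataset = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),  # max_iter raised (assumption) to avoid convergence warnings
    "Decision Tree": DecisionTreeClassifier(max_leaf_nodes=3),
    "Random Forest": RandomForestClassifier(max_depth=3, random_state=0),
    "svm_model": SVC(kernel="linear"),
}

dataset_scores = {}
for model_name, model in models.items():
    dataset_scores[model_name] = evaluate_model(model, dataset)

print(dataset_scores)
```

If you keep one such 4-item split per feature scaling, the same function can be reused for each of them.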
CodePudding user response:
The ValueError is raised because you are trying to unpack more values than the data variable contains. In your case, the error says that data holds only one value, but you are trying to unpack four. Make sure data contains exactly four values before unpacking it.
It's not possible to answer exactly because we don't know what's inside the data variable. One likely source of the four values is the way sklearn splits the data:
from sklearn import model_selection
x_train, x_valid, y_train, y_valid = model_selection.train_test_split(X, y, test_size=0.2, random_state=0)
Edit1: adding a code snippet that you can reference
# imports
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris_data = datasets.load_iris()  # get the iris data directly from sklearn

# put the dataset into a pandas DataFrame using the feature names as columns
df_iris = pd.DataFrame(iris_data['data'], columns=iris_data['feature_names'])
# rename the columns so they don't include the '(cm)'
df_iris.columns = ['sepal length', 'sepal width', 'petal length', 'petal width']
# add 2 columns: one with the target and another with the target names
df_iris['target'] = iris_data['target']
df_iris['class'] = iris_data.target_names[iris_data.target]

dummy = pd.get_dummies(df_iris, columns=["target"])  # one-hot encoded copy of the target

X = df_iris[['sepal length', 'sepal width']].to_numpy()  # select only these columns and convert to a numpy array
Y = df_iris.target.to_numpy()  # target as a numpy array

# split the data into train and test
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, random_state=42)

# normalize the dataset
# create and fit the scaler object on the training data
# do the training
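The last three comments are left as placeholders above. One way to fill them in is sketched below, assuming StandardScaler as the scaler and LogisticRegression as the model; both are illustrative choices, since the answer names neither. The scaler is fitted on the training data only and then applied to both splits.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, Y = load_iris(return_X_y=True)
X = X[:, :2]  # first two columns: sepal length and sepal width

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, train_size=0.8, random_state=42
)

# Fit the scaler on the training data only, then transform both splits
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train on the scaled features and score on the scaled test set
model = LogisticRegression(max_iter=1000)
model.fit(X_train_scaled, Y_train)
score = accuracy_score(Y_test, model.predict(X_test_scaled))
print(score)

# A 4-item tuple like this is what evaluate_model expects to unpack
dataset = (X_train_scaled, X_test_scaled, Y_train, Y_test)
```

Fitting the scaler on the training split alone keeps information from the test set out of the preprocessing step.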