I am fairly new to Python and am working in a Jupyter Notebook in which I am supposed to classify the MNIST dataset using a DecisionTreeClassifier. The dataset has already been split into features and target variables in separate files. When I read those in and work with them, I get no output, even though the cell runs without errors. Restarting the kernel did not solve the issue. Other, simpler operations do produce output. Is it perhaps due to the size of the dataset?
Here's the code:
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from joblib import dump
"""
Placeholder for comments:
"""
def mnistDTC():
    df = pd.read_csv('./data/mnist_target.csv', index_col=0)
    target = pd.read_csv('data/mnist_target.csv', index_col=0)
    tree_clf = DecisionTreeClassifier()
    df_train, df_test, target_train, target_test = train_test_split(df, target, test_size=0.2, random_state=0)
    tree_clf.fit(df_train, target_train)
    predictions = tree_clf.predict(df_test)
    print(predictions[:10])
Thanks in advance!
CodePudding user response:
You defined the function mnistDTC, but did not call it, which is why there is no output.
To call the function, put the following line just after the definition, or in a new cell:
mnistDTC()
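As a minimal sketch of the difference between defining and calling, the snippet below uses a hypothetical stand-in function, mnist_dtc_demo, with made-up values instead of real model predictions. Running only the def prints nothing; the output appears once the function is called.

```python
def mnist_dtc_demo():
    """Hypothetical stand-in for mnistDTC(); the body runs only when called."""
    # Made-up values for illustration, not real classifier output.
    predictions = [7, 2, 1, 0, 4]
    print(predictions)
    return predictions


# The definition above produces no output on its own.
# This call is what actually executes the body and prints the list.
result = mnist_dtc_demo()
```

Separately, note that both read_csv calls in the question load mnist_target.csv, so the features and the target would be identical. The first call presumably should point to the features file, whose name is not given in the question.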