scikit-learn neural net beginner - results not what I expect


I have a simple example for which I am attempting to perform a classification using the MLPClassifier.

from sklearn.neural_network import MLPClassifier

# What are the features in our X data?
#  0. do .X files exist?
#  1. do .Y files exist?
#  2. does a Z.Z file exist?
# values are 0 for false and 1 for true

training_x = (
    [0,1,0],  # pure .Y files, no Z.Z
    [1,0,1],  # .X files and Z.Z
    [1,0,0],  # .X w/o Z.Z
)
training_y = ('.Y, no .X, no Z.Z', '.X + Z.Z', '.X w/o Z.Z')
clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                    hidden_layer_sizes=(len(training_x)+1, len(training_x)+1),
                    random_state=1)
# training
clf.fit(training_x, training_y)
# predictions
for i in (0,1):
    for j in (0,1):
        for k in (0,1):
            results = list(clf.predict_proba([[i, j, k]])[0])
            # seems they are reversed:
            results.reverse()
            discrete_results = None
            for index in range(len(training_x)):
                if results[index] > 0.999:
                    if discrete_results is not None:
                        print('hold on a minute')
                    discrete_results = training_y[index]
            print(f'{i},{j},{k} ==> {results}, discrete={discrete_results}')

As I test it with all possible (discrete) inputs, I would expect the predictions for the input cases [0,1,0], [1,0,1], and [1,0,0] to closely match my three training_y cases, while the results for the other input cases would be undefined and not of interest. However, those three input cases are not matched at all, unless I reverse the proba results, in which case the [0,1,0] input does match and the other two are swapped. Here is the output with the reverse included:

0,0,0 ==> [1.1527971240749179e-19, 0.0029561479916546647, 0.9970438520083453], discrete=None
0,0,1 ==> [0.9999549772644907, 3.686866933257315e-08, 4.498586684013346e-05], discrete=.Y, no .X, no Z.Z
0,1,0 ==> [0.9999549772644907, 3.686866933257315e-08, 4.498586684013346e-05], discrete=.Y, no .X, no Z.Z
0,1,1 ==> [0.9999549772644907, 3.686866933257315e-08, 4.498586684013346e-05], discrete=.Y, no .X, no Z.Z
1,0,0 ==> [4.971668615064256e-68, 0.9999999980156198, 1.9843802638506693e-09], discrete=.X + Z.Z
1,0,1 ==> [1.3622448606166547e-05, 3.911037287197552e-05, 0.9999472671785217], discrete=.X w/o Z.Z
1,1,0 ==> [3.09415772026147e-33, 0.934313523906787, 0.06568647609321301], discrete=None
1,1,1 ==> [0.9999549772644907, 3.686866933257315e-08, 4.498586684013346e-05], discrete=.Y, no .X, no Z.Z

I have, no doubt, made a silly beginner's error! Help with finding it would be appreciated.

CodePudding user response:

The probabilities from predict_proba are not "reversed"; their columns are ordered to match the classifier's classes_ attribute, which for string labels is sorted alphabetically. You can check that order yourself after fitting. And instead of discretizing at a 0.999 threshold yourself, consider calling predict, which takes the class with the largest probability and, more importantly, translates it back to the text of the class label internally.
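A minimal sketch of both points, reusing the fitted clf from the question; the dict built from classes_ is just there to make the per-class probabilities readable:

# The columns of predict_proba follow clf.classes_, which
# scikit-learn sorts alphabetically for string labels.
print(clf.classes_)

for i in (0, 1):
    for j in (0, 1):
        for k in (0, 1):
            x = [[i, j, k]]
            probs = clf.predict_proba(x)[0]
            # predict() picks the highest-probability class and
            # returns the original label text.
            label = clf.predict(x)[0]
            print(f'{i},{j},{k} ==> {dict(zip(clf.classes_, probs))}, predicted={label}')

With that, the [0,1,0], [1,0,1], and [1,0,0] rows line up with their training labels without any manual reversing.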
