I have trained my model using machine learning and want to check with original value. Is i am doing it right? As whenever I change the numbers in 'value' getting the same result.
X_train, X_test, y_train, y_test = train_test_split(normalize(df4), y, test_size=0.2, random_state=0)
rfc1=RandomForestClassifier( random_state=0, max_features='auto', n_estimators= 90, max_depth=8, criterion='gini' )
rfc1.fit(X_train, y_train)
value=[2.60,1.0,3.0,19.0,1.0,1.0,1.0,3.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,4.0,1.0]
print(rfc1.predict([value]))
y_pred=rfc1.predict(X_test)
print(metrics.classification_report(y_test,y_pred))
df_cm = pd.DataFrame(confusion_matrix(y_test,y_pred), index = [i for i in ['Avg Marks','Good Marks','Bad Marks']],
columns = [i for i in ['Avg Marks','Good Marks','Bad Marks']])
plt.figure(figsize = (4,2))
sn.heatmap(df_cm, annot=True, fmt='g')
Model Accuracy is good but still getting always "Hight chances of Good Marks"
['High Chances of Good Marks']
precision recall f1-score support
Average Marks 0.80 0.83 0.82 59
High Chances of Bad Marks 0.81 0.72 0.76 18
High Chances of Good Marks 0.86 0.86 0.86 50
accuracy 0.83 127
macro avg 0.83 0.80 0.81 127
weighted avg 0.83 0.83 0.83 127
CodePudding user response:
Got the half solution to my problem. I was normalizing the model before fit.
But the original data (value) is not normalize.
Below code works but but now my data which i am using for fit is not normalize. Any solution?
X_train, X_test, y_train, y_test = train_test_split(df4, y, test_size=0.2, random_state=0)
rfc1=RandomForestClassifier( random_state=0, max_features='auto', n_estimators= 90, max_depth=8, criterion='gini' )
rfc1.fit(X_train, y_train)
value=[2.60,1.0,3.0,19.0,1.0,1.0,1.0,3.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,4.0,1.0]
print(rfc1.predict([value]))
y_pred=rfc1.predict(X_test)
print(metrics.classification_report(y_test,y_pred))
df_cm = pd.DataFrame(confusion_matrix(y_test,y_pred), index = [i for i in ['Avg Marks','Good Marks','Bad Marks']],
columns = [i for i in ['Avg Marks','Good Marks','Bad Marks']])
plt.figure(figsize = (4,2))
sn.heatmap(df_cm, annot=True, fmt='g')
CodePudding user response:
First of all, you need to define yourself what you want. Normalize data or not.
I recommend you Normalize it.
If you normalize input data, you need to train your data with normalized values; and, you need to evaluated predictions with also normalized data.
You always have to reproduce use the same structure of input data in predictions that you used in the training
Take a look to this link enter link description here to refer the normalization.