Here is a sample of my data:
import pandas as pd
data = {'tweet': ['saya suka makanan ini sangat enak', 'rasa kuahnya kurang enak, terlalu asin', 'favorit saya nih, ayam gorengnya enak banget', 'nasi bakar di toko ini enak banget!'],
'actual_class': ["Positive", "Negative", "Positive", "Positive"], 'predicted_class': ["Positive", "Positive", "Negative", "Positive"]}
df = pd.DataFrame(data)
I want to count the True Positive, False Positive, True Negative, and False Negative values between the actual_class and predicted_class columns in my dataframe without using scikit-learn. I tried to code it, but I can't find an efficient way.
CodePudding user response:
You can use the value_counts function from pandas:
df['required column'].value_counts()
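Note that value_counts on a single column only gives that column's class distribution, not the four confusion counts. One possible way to adapt it, sketched here under the assumption that pandas 1.1 or later is installed (where DataFrame.value_counts is available), is to count each (actual, predicted) pair. The name pair_counts is only illustrative.

```python
import pandas as pd

data = {'actual_class': ["Positive", "Negative", "Positive", "Positive"],
        'predicted_class': ["Positive", "Positive", "Negative", "Positive"]}
df = pd.DataFrame(data)

# Count every (actual, predicted) combination that occurs in the data.
# Assumes pandas >= 1.1 for DataFrame.value_counts.
pair_counts = df[['actual_class', 'predicted_class']].value_counts().to_dict()

# Combinations that never occur are absent from the dict, so default to 0.
tp = pair_counts.get(('Positive', 'Positive'), 0)
fp = pair_counts.get(('Negative', 'Positive'), 0)
tn = pair_counts.get(('Negative', 'Negative'), 0)
fn = pair_counts.get(('Positive', 'Negative'), 0)
print(tp, fp, tn, fn)
```

Here "Positive" is treated as the positive class; swap the labels if your convention differs.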
CodePudding user response:
If you cannot use scikit-learn, but can use pandas, you might like pandas.crosstab:
import pandas as pd
data = {'actual_class': ["Positive", "Negative", "Positive", "Positive"], 'predicted_class': ["Positive", "Positive", "Negative", "Positive"]}
df = pd.DataFrame(data)
print(pd.crosstab(df.actual_class, df.predicted_class))
That is, you get the same counts you would with from sklearn.metrics import confusion_matrix; print(confusion_matrix(df.actual_class, df.predicted_class)):
predicted_class  Negative  Positive
actual_class
Negative                0         1
Positive                1         2
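To go from that table to the four individual counts, something like the following sketch may help. It assumes "Positive" is the positive class, and the reindex step is an addition of mine so that a class missing from either column still appears as a row or column of zeros.

```python
import pandas as pd

data = {'actual_class': ["Positive", "Negative", "Positive", "Positive"],
        'predicted_class': ["Positive", "Positive", "Negative", "Positive"]}
df = pd.DataFrame(data)

labels = ["Negative", "Positive"]

# Rows are actual classes, columns are predicted classes.
# reindex guarantees both labels exist even if one never occurs.
cm = (pd.crosstab(df.actual_class, df.predicted_class)
        .reindex(index=labels, columns=labels, fill_value=0))

tp = int(cm.loc["Positive", "Positive"])  # actual Positive, predicted Positive
fn = int(cm.loc["Positive", "Negative"])  # actual Positive, predicted Negative
fp = int(cm.loc["Negative", "Positive"])  # actual Negative, predicted Positive
tn = int(cm.loc["Negative", "Negative"])  # actual Negative, predicted Negative

print(f"TP={tp} FP={fp} TN={tn} FN={fn}")  # TP=2 FP=1 TN=0 FN=1
```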