How to filter a pandas DataFrame?

Time:05-05

Example code:

x7 = ['Spammer','Suspicious','Normal','Micro Influencer','Influencer']
rasio_real_spammer = df[(df['Rasio Followers/Followings'] < 0.5) & (df['fake'] == 0)].count()

temp = df[(df['Rasio Followers/Followings'] > 0.5) & (df['Rasio Followers/Followings'] < 1.0)].count()
rasio_real_suspicious = temp & (df['fake'] == 0).count()

temp2=df[(df['Rasio Followers/Followings'] >= 1.0) & (df['Rasio Followers/Followings'] < 2.0)].count()
rasio_real_normal = temp2 & (df['fake'] == 0).count()

temp3=df[(df['Rasio Followers/Followings'] >= 2.0) & (df['Rasio Followers/Followings'] < 10.0)].count()
rasio_real_micro = temp3 & (df['fake'] == 0).count()

rasio_real_influencer = df[(df['Rasio Followers/Followings'] >= 10.0 ) & (df['fake'] == 0)].count()

plt.bar(x7[0], rasio_real_spammer , color='red',label='Spammer')
plt.bar(x7[1], rasio_real_suspicious, color='yellow',label='Suspicious')
plt.bar(x7[2], rasio_real_normal, color='blue',label='Normal')
plt.bar(x7[3], rasio_real_micro, color='green',label='Micro Influencer')
plt.bar(x7[4], rasio_real_influencer, color='gray',label='Influencer')


plt.title("Distribution Rasio on Real Class")
plt.legend()
plt.show()

When I check the results manually, rasio_real_spammer and rasio_real_influencer are correct, but the other results are not. Maybe there is an error in how I filter the classes. Any solutions?

CodePudding user response:
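
The middle classes go wrong because `temp` is already a count, not a filtered DataFrame, and `(df['fake'] == 0).count()` counts every row of the boolean Series, so `temp & ...` is a bitwise AND of two integers. A minimal sketch of that behaviour, using a small made-up DataFrame (the data values and the names `ratio`, `n_rows`, `broken`, `mask`, and `correct` are assumptions for illustration):

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Rasio Followers/Followings": [0.2, 0.7, 0.8, 0.9, 1.5, 3.0, 12.0],
    "fake": [0, 0, 1, 0, 0, 1, 0],
})
ratio = df["Rasio Followers/Followings"]

# .count() on a filtered DataFrame returns one non-null count per column.
temp = df[(ratio > 0.5) & (ratio < 1.0)].count()

# .count() on a boolean Series counts its non-null entries, i.e. every row,
# regardless of whether the entry is True or False.
n_rows = (df["fake"] == 0).count()

# So this is a bitwise AND between integers, not a row filter.
broken = temp & n_rows

# Combining every condition in one boolean mask and summing it counts
# only the rows that satisfy all of them.
mask = (ratio > 0.5) & (ratio < 1.0) & (df["fake"] == 0)
correct = int(mask.sum())

print(broken)
print(correct)
```

With this sample, three rows fall in the ratio range but only two of them have `fake == 0`, which the single mask picks up and the `temp & ...` expression does not. Keeping all the conditions in one filter avoids the problem: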

import matplotlib.pyplot as plt

# df is the DataFrame from the question.
x7 = ['Spammer','Suspicious','Normal','Micro Influencer','Influencer']
ratio = df['Rasio Followers/Followings']
real = df['fake'] == 0

# Put every condition in one boolean mask; .sum() counts the rows where it is True.
# (DataFrame.count() would return one count per column instead of a single number.)
rasio_real_spammer = ((ratio < 0.5) & real).sum()

# Note: a ratio of exactly 0.5 falls in no class here; use >= 0.5 if it should be Suspicious.
rasio_real_suspicious = ((ratio > 0.5) & (ratio < 1.0) & real).sum()

rasio_real_normal = ((ratio >= 1.0) & (ratio < 2.0) & real).sum()

rasio_real_micro = ((ratio >= 2.0) & (ratio < 10.0) & real).sum()

rasio_real_influencer = ((ratio >= 10.0) & real).sum()

plt.bar(x7[0], rasio_real_spammer, color='red', label='Spammer')
plt.bar(x7[1], rasio_real_suspicious, color='yellow', label='Suspicious')
plt.bar(x7[2], rasio_real_normal, color='blue', label='Normal')
plt.bar(x7[3], rasio_real_micro, color='green', label='Micro Influencer')
plt.bar(x7[4], rasio_real_influencer, color='gray', label='Influencer')

plt.title("Distribution Rasio on Real Class")
plt.legend()
plt.show()
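
As an alternative to writing one filter per class, the ratio column could be binned in a single step with `pd.cut`. This is only a sketch under assumptions that are not in the question: the sample data is made up, ratios are assumed to be non-negative, and a ratio of exactly 0.5 is assigned to "Suspicious" (the filters above leave it unclassified).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Rasio Followers/Followings": [0.2, 0.5, 0.7, 0.9, 1.5, 3.0, 12.0],
    "fake": [0, 0, 0, 1, 0, 0, 0],
})

labels = ["Spammer", "Suspicious", "Normal", "Micro Influencer", "Influencer"]

# right=False makes every bin closed on the left and open on the right:
# [0, 0.5), [0.5, 1), [1, 2), [2, 10), [10, inf)
bins = [0, 0.5, 1.0, 2.0, 10.0, np.inf]

# Keep only the real accounts, then assign each one to a class.
real = df[df["fake"] == 0]
classes = pd.cut(
    real["Rasio Followers/Followings"], bins=bins, labels=labels, right=False
)

# Count the accounts per class, in the same order as the labels.
counts = classes.value_counts().reindex(labels)
print(counts)
```

The resulting `counts` Series can be passed to a single `plt.bar(labels, counts.to_numpy(), color=[...])` call with a list of five colors, instead of one `plt.bar` call per class.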