Let's say we have a DataFrame like this:
import pandas as pd,numpy as np
df = pd.DataFrame({'id': ['1','2','3','4','5','6','7','8','9','10'],
'name': ['Arnold','Barney','Clark','Devi','Erik','Fiona','Genie','Harry','Isabella','Jane'],
'score': [89,92,70,np.nan,78,np.nan,80,94,np.nan,100]})
I computed the percentage of missing values in the score column:
df[['score']].isnull().mean().mul(100).round(2).astype(str) + '%'
But I think it would be better to visualize the percentage. How can I make a pie chart of the missing values in the score column only?
CodePudding user response:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Define the DataFrame
df = pd.DataFrame({'id': ['1','2','3','4','5','6','7','8','9','10'],
'name': ['Arnold','Barney','Clark','Devi','Erik','Fiona','Genie','Harry','Isabella','Jane'],
'score': [89,92,70,np.nan,78,np.nan,80,94,np.nan,100]})
# Calculate the percentage of missing values
missing_pct = df[['score']].isnull().mean().mul(100).round(2).values[0]
# Create a pie chart
plt.pie([missing_pct, 100 - missing_pct], labels=['Missing', 'Not missing'], autopct='%1.1f%%')
# Add a title
plt.title('Percentage of missing values in score column')
# Display the chart
plt.show()
This creates a pie chart with two slices, the share of missing values in the score column and the share of non-missing values. You can customize the chart further (colors, fonts, and so on) using the options available in Matplotlib.
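As a rough sketch of that kind of customization, the variant below passes raw counts to the pie chart instead of percentages (Matplotlib normalizes the wedge sizes itself) and sets slice colors and a small offset for the "Missing" slice. The helper name missing_pie and the color choices are assumptions made for illustration; they are not part of the original answer.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def missing_pie(series, ax=None):
    """Draw a pie chart of missing vs. present values and return the counts used.

    Hypothetical helper for illustration only.
    """
    n_missing = int(series.isna().sum())
    counts = [n_missing, int(series.size - n_missing)]
    if ax is None:
        _, ax = plt.subplots()
    ax.pie(
        counts,                            # raw counts; pie() normalizes them
        labels=['Missing', 'Not missing'],
        colors=['tomato', 'lightgray'],    # arbitrary example colors
        explode=(0.1, 0),                  # pull the "Missing" slice outward
        autopct='%1.1f%%',
    )
    ax.set_title(f'Missing values in {series.name}')
    return counts


scores = pd.Series([89, 92, 70, np.nan, 78, np.nan, 80, 94, np.nan, 100],
                   name='score')
counts = missing_pie(scores)
```

Call plt.show() afterwards to display the figure. With the data from the question, you could equally pass df['score'] to the helper.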