I know there are packages for visualizing missing values, like missingno. How can I visualize missing-value patterns without additional packages, using only Pandas and Matplotlib? I expect something like the following image, where missing data is white:
CodePudding user response:
You can get what you need using Matplotlib:
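import pandas as pd
import matplotlib.pyplot as plt  # required for the plt calls below

plt.rcParams["figure.figsize"] = (20, 10)
df = pd.read_excel("C:/Users/Marco.Bressanelli/Desktop/titanic.xlsx")
# isnull() is True for missing cells; the 'hot' colormap draws True as white and False as black
plt.imshow(df.isnull(), cmap='hot', aspect='auto')
plt.show()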
Note: I used a subset of the Titanic data from Kaggle.
Result:
Rows start at index 0 at the top and columns follow the DataFrame's column order, so this heatmap immediately shows how (and where) missing values are distributed: white cells are missing, black cells are present.
I know, it's not so fancy right now. Matplotlib takes more work to turn this raw graphic into something nicer.
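As a rough idea of what that extra work could look like, here is a minimal sketch that labels the x-axis with the column names so each stripe can be identified. The helper name plot_missing and the toy DataFrame are made up for illustration, and I swapped in the 'gray' colormap, which also draws missing cells white:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_missing(df, ax=None):
    """Draw a missing-value map: white cells are missing, black cells are present."""
    if ax is None:
        _, ax = plt.subplots(figsize=(10, 5))
    # isnull() gives a boolean frame; cast to int so imshow gets plain 0/1 data
    mask = df.isnull().to_numpy().astype(int)
    # 'gray' maps 0 (present) to black and 1 (missing) to white
    ax.imshow(mask, cmap="gray", aspect="auto", interpolation="none", vmin=0, vmax=1)
    # Put the column names on the x-axis, one tick per column
    ax.set_xticks(range(len(df.columns)))
    ax.set_xticklabels(df.columns, rotation=90)
    ax.set_ylabel("row index")
    return ax


# Hypothetical toy data, only for illustration
demo = pd.DataFrame({
    "age": [22.0, np.nan, 26.0, np.nan],
    "cabin": [None, "C85", None, None],
    "fare": [7.25, 71.28, 7.92, 8.05],
})
ax = plot_missing(demo)

if __name__ == "__main__":
    plt.show()
```

You would call plot_missing(df) on your own DataFrame instead of the toy one. With many columns the rotated labels can get crowded, so a wider figsize may help.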
But if you want something better and fast, I really suggest seaborn. Seaborn is a Python data visualization library based on Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.
import seaborn as sns
sns.heatmap(df.isnull(), cbar=False)
plt.show()