How to visualize missing values patterns in Pandas

Time:10-13

I know there are packages for visualizing missing values, like missingno. How can I visualize missing-value patterns without additional packages, using only Pandas and Matplotlib? I expect something like the following image, where missing data is white:

enter image description here

CodePudding user response:

You can get what you need using Matplotlib:

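import pandas as pd
import matplotlib.pyplot as plt

plt.rcParams["figure.figsize"] = (20, 10)
df = pd.read_excel("C:/Users/Marco.Bressanelli/Desktop/titanic.xlsx")
# df.isnull() is True where a value is missing; with cmap='hot', True is drawn white and False black
plt.imshow(df.isnull(), cmap='hot', aspect='auto')
plt.show()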

Note: I used a subset of the Titanic data from Kaggle.

Result:

Each row of the image is a row of the DataFrame, starting from index 0 at the top, and each column of the image is a DataFrame column, so this heatmap immediately shows how (and where) missing values are distributed.

enter image description here

I know, it's not so fancy right now. Matplotlib takes more work to turn this raw graphic into something nicer.
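As a rough sketch of what that extra work might look like, assuming you mainly want the column names on the x-axis and white cells for missing data, something along these lines could be a starting point. The helper name plot_missing and the small sample DataFrame are made up for illustration and are not part of the original data:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_missing(df, ax=None):
    """Draw a missing-value map of df: missing cells white, present cells black."""
    if ax is None:
        _, ax = plt.subplots(figsize=(10, 6))
    # 1 where a value is missing, 0 where it is present
    mask = df.isnull().to_numpy().astype(int)
    # The "gray" colormap maps 0 to black and 1 to white;
    # interpolation="none" keeps the cell edges sharp.
    ax.imshow(mask, cmap="gray", vmin=0, vmax=1,
              aspect="auto", interpolation="none")
    # Label each image column with the matching DataFrame column name.
    ax.set_xticks(range(df.shape[1]))
    ax.set_xticklabels(df.columns, rotation=90)
    ax.set_ylabel("row position")
    ax.set_title("Missing values (white)")
    return ax


# Hypothetical sample data, only for illustration.
sample = pd.DataFrame({
    "age": [22.0, np.nan, 26.0, 35.0],
    "cabin": [None, "C85", None, "C123"],
    "fare": [7.25, 71.28, 7.92, np.nan],
})
ax = plot_missing(sample)
```

Call plt.show() afterwards to display the figure. Passing your own DataFrame instead of sample should work the same way, and returning the Axes leaves room for further tweaks such as a different title or figure size.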

But if you want something nicer with less effort, I really suggest seaborn.

Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.

import matplotlib.pyplot as plt
import seaborn as sns

# df is the DataFrame loaded above; cbar=False hides the color bar
sns.heatmap(df.isnull(), cbar=False)
plt.show()

enter image description here
