Got myself in a pickle.
I'm creating a basic table in Matplotlib (via Pandas, but that's not the issue). What I'm trying to accomplish is to create a table where the first column, which will be string values, remains white...but columns 2,3,4,5,6 are floating/integers and will be colored by a custom normalized colormap.
I've started with the basics and created the 'colored' table via the code below. At this point it only plots the columns with numeric values.
What I ultimately need to do is plot this with an additional column, say before column 'A' or after column 'F', which holds string values, e.g. ['MBIAS', 'RMSE', 'BAGSS', 'MBIAS', 'MBIAS'].
However if I try to apply the cellColours method in the code below to a table that mixes lists of strings and float/integers, it obviously fails.
Is there a method to apply a cellColours scheme to only certain cells, or row/columns? Can I loop through, applying the custom colormap to specific cells?
Any help or tips would be appreciated!
Code:
import numpy as np
import matplotlib
from matplotlib import cm
import matplotlib.pyplot as plt
from pandas import *
#Create sample data in pandas dataframe
idx = Index(np.arange(1,6))
df = DataFrame(abs(2*np.random.randn(5, 5)), index=idx, columns=['A', 'B', 'C', 'D', 'E'])
model = ['conusarw', 'conusarw', 'conusarw', 'nam04', 'emhrrr']
df['Model'] = model
df1 = df[['A','B','C','D','E']]
test = df1.round({'A':2,'B':2,'C':2,'D':2,'E':2})
print(test)
vals = test.values
print(vals)
#Creates a normalization (mapping values to 0-1) based on a user-provided range and center of the distribution.
norm = matplotlib.colors.TwoSlopeNorm(vmin=0,vcenter=1,vmax=10)
#Merges colormap to the normalized data based on customized normalization pattern from above.
colours = plt.cm.coolwarm(norm(vals))
#Create figure in Matplotlib in which to plot table.
fig = plt.figure(figsize=(15,8))
ax = fig.add_subplot(111, frameon=False, xticks=[], yticks=[])
#Plot table, using pandas dataframe information and data.
#Customized lists of data and names can also be provided.
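#colLabels must match the number of plotted columns, so use df1 (A-E) rather than df (which also holds 'Model').
the_table=plt.table(cellText=vals, rowLabels=model, colLabels=df1.columns,
                    loc='center', cellColours=colours)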
plt.savefig('test_table.png')
CodePudding user response:
Instead of the fast vectorized call colours = plt.cm.coolwarm(norm(vals)), you can just use regular Python loops with if-tests. The code below loops through the individual rows, then through the individual elements, and tests whether they are numeric. A similar loop prepares the rounded values. Speed is not really a problem unless you have thousands of elements.
(The code uses import pandas as pd, as from pandas import * isn't recommended.)
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba, TwoSlopeNorm
import pandas as pd
import numpy as np
# Create sample data in pandas dataframe
idx = pd.Index(np.arange(1, 6))
df = pd.DataFrame(abs(2 * np.random.randn(5, 5)), index=idx, columns=['A', 'B', 'C', 'D', 'E'])
df['Model'] = ['conusarw', 'conusarw', 'conusarw', 'nam04', 'emhrrr']
cmap = plt.cm.coolwarm
norm = TwoSlopeNorm(vmin=0, vcenter=1, vmax=10)
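# Strings get a white cell; numbers get a colour from the normalized colormap
colours = [['white' if not np.issubdtype(type(val), np.number) else cmap(norm(val)) for val in row]
           for row in df.values]
# Strings are kept as-is; numbers are rounded to two decimals for display
vals = [[val if not np.issubdtype(type(val), np.number) else np.round(val, 2) for val in row]
        for row in df.values]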
fig = plt.figure(figsize=(15, 8))
ax = fig.add_subplot(111, frameon=False, xticks=[], yticks=[])
the_table = plt.table(cellText=vals, rowLabels=df['Model'].to_list(), colLabels=df.columns,
loc='center', cellColours=colours)
plt.show()
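On the question of whether specific cells can be coloured in a loop: as an alternative sketch (not part of the code above), the table can be drawn without cellColours and the numeric cells recoloured afterwards through the table's cell objects. The sample frame, the 'Stat' column name and the output filename are made up for illustration, and the Agg backend is only selected so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.colors import TwoSlopeNorm
import numpy as np
import pandas as pd

# Made-up sample data: one string column followed by three numeric columns
rng = np.random.default_rng(0)
df = pd.DataFrame(np.abs(2 * rng.standard_normal((5, 3))).round(2), columns=['A', 'B', 'C'])
df.insert(0, 'Stat', ['MBIAS', 'RMSE', 'BAGSS', 'MBIAS', 'MBIAS'])

cmap = plt.cm.coolwarm
norm = TwoSlopeNorm(vmin=0, vcenter=1, vmax=10)

fig, ax = plt.subplots(figsize=(8, 3))
ax.axis('off')
# Draw the table without cellColours, so every cell starts out white
the_table = ax.table(cellText=df.values, colLabels=df.columns, loc='center')

# Positions of the numeric columns within the frame
numeric_cols = [j for j, col in enumerate(df.columns)
                if pd.api.types.is_numeric_dtype(df[col])]

# Table row 0 holds the column labels, so data row i sits at table row i + 1
for i in range(len(df)):
    for j in numeric_cols:
        the_table[i + 1, j].set_facecolor(cmap(norm(df.iat[i, j])))

fig.savefig('test_table_cells.png')  # hypothetical output filename
```

Cells that the loop never touches keep their default white face, so the string column needs no special handling.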
PS: If speed is a concern, the following code is a bit trickier. It uses:
- setting the "bad color" of a colormap
- pd.to_numeric(..., errors='coerce') to convert all strings to NaNs
- as pd.to_numeric() only works for 1D arrays, ravel() and reshape() are used
- using the same arrays, np.where can do the rounding
cmap = plt.cm.coolwarm.copy()
cmap.set_bad('white')
norm = TwoSlopeNorm(vmin=0, vcenter=1, vmax=10)
values = pd.to_numeric(df.values.ravel(), errors='coerce').reshape(df.shape)
colours = cmap(norm(values))
vals = np.where(np.isnan(values), df.values, np.round(values, 2))
fig = plt.figure(figsize=(15, 8))
ax = fig.add_subplot(111, frameon=False, xticks=[], yticks=[])
the_table = plt.table(cellText=vals, rowLabels=df['Model'].to_list(), colLabels=df.columns,
                      loc='center', cellColours=colours)
plt.show()
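To see what the vectorized variant produces, here is a small self-contained sketch on a hand-made frame. The data and the 'Stat' column name are invented for illustration, and the Agg backend is only an assumption so it runs without a display. It prints the coerced array and checks that the string column ends up white.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.colors import TwoSlopeNorm
import numpy as np
import pandas as pd

# Made-up 2x3 frame: one string column and two numeric columns
df = pd.DataFrame({'Stat': ['MBIAS', 'RMSE'], 'A': [0.5, 3.0], 'B': [1.0, 9.0]})

# Copy the colormap so the shared coolwarm instance is not modified
cmap = plt.cm.coolwarm.copy()
cmap.set_bad('white')  # NaN entries are drawn in this colour
norm = TwoSlopeNorm(vmin=0, vcenter=1, vmax=10)

# Strings cannot be parsed as numbers, so errors='coerce' turns them into NaN
values = pd.to_numeric(df.values.ravel(), errors='coerce').reshape(df.shape)
colours = cmap(norm(values))  # RGBA array with one colour per cell

print(values)          # the first column is NaN, the rest are the original floats
print(colours.shape)   # (2, 3, 4): rows, columns, RGBA
print(colours[:, 0])   # RGBA of the string column, i.e. the "bad" colour
```

The resulting colours array can be passed as cellColours exactly as in the snippet above.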