Geopandas legend not showing bins for which there are zero observations in dataset

Time:02-23

I'm trying to classify the number of bird species found in US states. The values fall between 250 and 750, so I'm dividing them up into 10 bins of 50 (using MapClassify's UserDefined classifier). Here's the code that generates the plot, which, besides the legend, is coming through fine:

import numpy as np
import mapclassify as mc

colormap = 'viridis_r'  # assumed: the same colormap is used for all three maps

ud_10 = mc.UserDefined(gdf['NumSpecies'], bins=np.arange(300, 800, 50), lowest=250)
gdf['cl'] = ud_10.yb # This creates a column that displays the bin number for each observation

vmin, vmax = gdf['cl'].agg(['min', 'max'])
gdf.drop(['AK', 'HI']).plot('cl', ax=continental_ax, legend=True, categorical=True,
                            cmap=colormap, legend_kwds={'loc': 'lower right'})
gdf.loc[['AK']].plot(column='cl', ax=alaska_ax, cmap=colormap, vmax=vmax, vmin=vmin)
gdf.loc[['HI']].plot(column='cl', ax=hawaii_ax, cmap=colormap, vmax=vmax, vmin=vmin)

And the plot: [image omitted]

So, see how the legend is missing the numbers 0 and 7? Those are the ones absent from the data in the "continental" plot call above (Hawaii, plotted separately, is in bin 0, and there's no data at ALL that fall in bin 7). It seems, then, that the geopandas legend does not take into account any bins for which observations = 0. Do you know of any way I can remedy this?

Thank you so, so much for any help you can provide!

CodePudding user response:

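In addition to Matthew's answer, another workaround I've found is to ditch the mapclassify package altogether and use pd.cut() to categorize your data instead. With this method the legend displayed the bins that had no observations. The likely reason is that pd.cut() returns a categorical column whose categories include every label you pass in, whether or not any row falls in that bin, and the legend appears to be built from those categories.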

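import numpy as np
import pandas as pd

bins = np.arange(250, 800, 50)  # 11 edges (250, 300, ..., 750), i.e. 10 intervals
labels = np.arange(0, 10)       # one label per interval, 0 through 9

gdf['cl'] = pd.cut(gdf['NumSpecies'], bins=bins, labels=labels)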

And then you can simply call gdf.plot('cl') along with whatever remaining args/kwargs you want.
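To see why this works, here is a minimal sketch that runs without geopandas. The species counts are made up for illustration and are not the asker's data; the point is only that the categorical returned by pd.cut() keeps all ten categories even when some bins are empty.

```python
import numpy as np
import pandas as pd

# Hypothetical species counts; several bins, including 0 and 7, stay empty.
counts = pd.Series([310, 455, 480, 520, 590, 705], name="NumSpecies")

bins = np.arange(250, 800, 50)  # 11 edges -> 10 right-closed intervals
labels = np.arange(0, 10)       # bin numbers 0-9

cl = pd.cut(counts, bins=bins, labels=labels)

# The categories list every bin label, not just the ones that occur.
print(list(cl.cat.categories))

# Counts per bin, with zeros reported for the empty bins.
print(cl.value_counts(sort=False))
```

Note that pd.cut() uses right-closed intervals by default, so a value of exactly 250 would fall outside the first bin and become NaN unless you pass include_lowest=True.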

CodePudding user response:

You can use patches to assign custom values to the legend.

import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import geopandas as gpd
from shapely.geometry import LineString

coords = [LineString([(0,1), (1,2), (2,3), (3,4)]),
          LineString([(4,5), (5,6), (6,7), (7,8)]),
          LineString([(7,1), (6,2), (5,4), (4,3)])]

gdf = gpd.GeoDataFrame(geometry=coords)
gdf.plot(color=['red', 'orange', 'blue'])

# keys are the colors you want, values are the legend labels
patch_dict = {'red': '0', 'orange': '1', 'blue': '2'}

patch_list = []
for k, v in patch_dict.items():
    patch_list.append(mpatches.Patch(color=k, label=v))

plt.legend(handles=patch_list, loc='upper left')
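Applied to the question, the same idea could be sketched as below: one patch per bin, including the empty ones, with colors sampled from viridis_r. The bin edges come from the question; the label format is my own choice. The colors will only line up with the map if the map is drawn with the same colormap scaled over bins 0 to 9, which is an assumption here.

```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

edges = np.arange(250, 800, 50)  # bin edges from the question: 250, 300, ..., 750
n_bins = len(edges) - 1          # 10 bins, numbered 0-9

# Resample the colormap to one discrete color per bin.
cmap = plt.get_cmap('viridis_r', n_bins)

# One legend patch per bin, whether or not any state falls in it.
patch_list = [
    mpatches.Patch(color=cmap(i), label=f'{i}: {edges[i]}-{edges[i + 1]}')
    for i in range(n_bins)
]

legend_labels = [p.get_label() for p in patch_list]
print(legend_labels)
```

You would then pass the list to the axes that holds the legend, for example continental_ax.legend(handles=patch_list, loc='lower right'), instead of using legend=True in the plot call.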
