How to plot the distribution of a third variable in a 2d histogram?

Time:02-24

Imagine you have a data set in three dimensions, x, y and z, and you want to show their relation. You could do this for example using a scatter plot in x and y and adding information about z with the help of a colormap:

[Image: scatter plot in x and y, colored by z]

But such a plot can be hard to read or even misleading, so I would like to use a 2D histogram in x and y instead and weight each data point by its z value:

[Image: 2D histograms in x and y weighted by z; the right panel shows the attempted normalization]

However, as can be seen in the plot above, the magnitude of the bin values can now be much higher than the maximum of z. That makes sense, of course, since each bin value is usually the sum of several z values.

So weighting by the z value is not enough; I also need to "normalize" each bin value by the number of data points within it. But as can be seen in the right plot above, for some reason this doesn't seem to work: the range of color values remains unchanged.
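To make the intended normalization concrete, here is a minimal sketch of dividing the z-weighted bin sums by the per-bin counts, which gives the mean of z in each bin. It is not the original reproduction code: the random data, the seed, the bin count and the variable names are all assumptions chosen for illustration.

```python
import numpy as np

# Illustrative data (assumption): x, y normally distributed, z uniform in [0, 1).
rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
y = rng.normal(size=10_000)
z = rng.uniform(0.0, 1.0, size=10_000)

bins = 20  # assumed number of bins per axis

# Sum of z in each bin (histogram weighted by z).
sums, xedges, yedges = np.histogram2d(x, y, bins=bins, weights=z)

# Number of points in each bin, using the same bin edges.
counts, _, _ = np.histogram2d(x, y, bins=[xedges, yedges])

# Per-bin mean of z; empty bins are left as NaN instead of dividing by zero.
mean_z = np.full_like(sums, np.nan)
np.divide(sums, counts, out=mean_z, where=counts > 0)

# The bin sums can exceed max(z), while the per-bin means cannot.
print("max of z:        ", z.max())
print("max bin sum:     ", sums.max())
print("max bin mean:    ", np.nanmax(mean_z))

# To plot with matplotlib, note that histogram2d puts x along the first axis,
# so the array is transposed: plt.pcolormesh(xedges, yedges, mean_z.T)
```

With this approach the color range is bounded by the range of z itself. If SciPy is available, scipy.stats.binned_statistic_2d with statistic='mean' should give the same per-bin means in a single call.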

What am I doing wrong and is there a better approach to do this?

Code for reproduction (loosely based on enter image description here
