Show Mean in Scatterplot

Time:09-28

I have a dataset with three columns. The first holds the actual data, the second holds the mean of the first, and the third is a category. I want to show both the data and the mean in a scatterplot. However, because the mean is repeated in every row, it is plotted N times rather than once, which makes the visualization hard to control (for example, the alpha values stack). Is there a way to plot the mean only once?

Here is a toy example where the orange dots (indicating the mean) are plotted 14 times on top of each other.

import matplotlib.pyplot as plt
import pandas as pd
  

df = pd.DataFrame({
    'Numbers': [10,20,30,40,50,60,11,59,12,58,13,57,25,45],
    'Mean':    [35,35,35,35,35,35,35,35,35,35,35,35,35,35],
    'y':       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ]    
    })

fig, ax = plt.subplots(1)
ax.scatter(y="y", x= "Numbers", data=df, s = 200, alpha = .5, label = "Mean response")
ax.scatter(y="y", x= "Mean", data=df, s = 200, alpha = .15, label = "Average mean response")
legend = ax.legend(loc="lower right", framealpha = .4)


Answer:

Just plot one value:

fig, ax = plt.subplots(1)
ax.scatter(y="y", x="Numbers", data=df, s=200,
           alpha=.5, label="Mean response")

ax.scatter(df['Numbers'].mean(), df['y'].mean(),  # plot one value, not 14
           s=200, alpha=.15,
           label="Average mean response")

legend = ax.legend(loc="lower right", framealpha = .4)

output:

scatter plot with mean
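
Since the DataFrame in the question already carries a precomputed Mean column, another option is to collapse the repeated rows before plotting instead of recomputing the mean. This is a minimal sketch, assuming the Mean value is constant within each y category; the variable name `means` is only illustrative and not part of the original answer.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    'Numbers': [10, 20, 30, 40, 50, 60, 11, 59, 12, 58, 13, 57, 25, 45],
    'Mean':    [35] * 14,
    'y':       [1] * 14,
})

# Keep one row per distinct (Mean, y) pair instead of 14 identical rows
means = df[['Mean', 'y']].drop_duplicates()

fig, ax = plt.subplots(1)
ax.scatter(x='Numbers', y='y', data=df, s=200, alpha=.5,
           label="Mean response")
ax.scatter(x='Mean', y='y', data=means, s=200, alpha=.15,
           label="Average mean response")
legend = ax.legend(loc="lower right", framealpha=.4)
```

Because `means` has a single row here, the mean marker is drawn once and its alpha behaves as specified.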

If you have several groups:

ax.scatter(x='Numbers', y='y',
           data=df.groupby('y', as_index=False)['Numbers'].mean(),
           s=200, alpha=.15, label="Average mean response")

output:

scatter plot mean several groups
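
To see the grouped version end to end, here is a small self-contained sketch. The two-category DataFrame `df2` and the name `group_means` are hypothetical and chosen only for illustration; the toy data in the question has a single category.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data with two categories in 'y'
df2 = pd.DataFrame({
    'Numbers': [10, 20, 30, 40, 50, 60],
    'y':       [1, 1, 1, 2, 2, 2],
})

# One row per category: column 'y' plus the per-group mean of 'Numbers'
group_means = df2.groupby('y', as_index=False)['Numbers'].mean()

fig, ax = plt.subplots(1)
ax.scatter(x='Numbers', y='y', data=df2, s=200, alpha=.5,
           label="Mean response")
ax.scatter(x='Numbers', y='y', data=group_means, s=200, alpha=.15,
           label="Average mean response")
legend = ax.legend(loc="lower right", framealpha=.4)
```

Each category then gets exactly one mean marker, placed on the same horizontal line as that category's data points.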
