I created a scatter plot that uses data from two lists, x = [] and y = []. In a second step, I added a linear regression line for the two lists using the following code:
(m, b) = np.polyfit(x, y, 1)
Y_Polyval = np.polyval([m, b], x)
plt.plot(x, Y_Polyval, linewidth=3, c="black")
The result of that is a standard scatterplot as shown below.
Now I would like to add a 95% confidence interval around the black regression line using plt.fill_between. I know there are many topics on this and I have read through many of them, but I cannot solve the problem of adapting that code to my particular data and regression line.
Adding
CI = 1.96 * np.std(y) / np.mean(y)
plt.fill_between(y, (y-CI), (y+CI), color='blue', alpha=0.1)
to my code results in the output below.
The bluish confidence band drawn by plt.fill_between appears on the left side of the image rather than around the regression line. What I would like is for the confidence band to be drawn around the black regression line. The full code is shown below:
import numpy as np
import matplotlib.pyplot as plt
# Scatter plot
x = [0.472202, 0.685151, 0.287613, 0.546364, 0.518002, 0.675128, 0.462418, 0.61817, 0.692822, 0.23433,
0.194009, 0.720232, 0.597321, 0.625955, 0.660571, 0.737754, 0.436876, 0.689937, 0.483067, 0.646723,
0.699367, 0.384102, 0.561493]
y = [0.131113, 0.123865, 0.150355, 0.138914, 0.140417, 0.119358, 0.130019, 0.129782, 0.113508, 0.13434,
0.15162, 0.125768, 0.128473, 0.128056, 0.114403, 0.142878, 0.139192, 0.118033, 0.132616, 0.133043,
0.133973, 0.146611, 0.129792]
(m, b) = np.polyfit(x, y, 1)
Y_Polyval = np.polyval([m, b], x)
plt.plot(x, Y_Polyval, linewidth=3, c="black")
CI = 1.96 * np.std(y) / np.mean(y)
plt.fill_between(y, (y-CI), (y+CI), color='blue', alpha=0.1)
plt.scatter(x, y, s=250, linewidths=2, zorder=2)
plt.show()
CodePudding user response:
You should pass x as the first argument and build the band around the predicted values Y_Polyval instead of the true values y, and sort the (x, y) pairs before filling the area:
plt.fill_between(x, (Y_Polyval-CI), (Y_Polyval+CI), color='blue', alpha=0.1)
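As a quick check of why the original band ended up on the left of the plot, it may help to compare the value ranges of the two lists. This is only a small diagnostic sketch using the data from the question; the names x_range and y_range are made up for illustration.

```python
x = [0.472202, 0.685151, 0.287613, 0.546364, 0.518002, 0.675128, 0.462418, 0.61817, 0.692822, 0.23433,
     0.194009, 0.720232, 0.597321, 0.625955, 0.660571, 0.737754, 0.436876, 0.689937, 0.483067, 0.646723,
     0.699367, 0.384102, 0.561493]
y = [0.131113, 0.123865, 0.150355, 0.138914, 0.140417, 0.119358, 0.130019, 0.129782, 0.113508, 0.13434,
     0.15162, 0.125768, 0.128473, 0.128056, 0.114403, 0.142878, 0.139192, 0.118033, 0.132616, 0.133043,
     0.133973, 0.146611, 0.129792]

# Hypothetical helper names, not part of the original code
x_range = (min(x), max(x))
y_range = (min(y), max(y))

print("x spans", x_range)
print("y spans", y_range)

# Every y value is smaller than every x value, so a band whose
# horizontal coordinate is y sits to the left of all plotted points.
print("band left of all points:", y_range[1] < x_range[0])
```

Since the first argument of plt.fill_between is the horizontal coordinate, passing y there places the band at y's values on the x-axis, which is what the screenshot in the question shows.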
Full Example
import numpy as np
import matplotlib.pyplot as plt
# Scatter plot
x = [0.472202, 0.685151, 0.287613, 0.546364, 0.518002, 0.675128, 0.462418, 0.61817, 0.692822, 0.23433,
0.194009, 0.720232, 0.597321, 0.625955, 0.660571, 0.737754, 0.436876, 0.689937, 0.483067, 0.646723,
0.699367, 0.384102, 0.561493]
y = [0.131113, 0.123865, 0.150355, 0.138914, 0.140417, 0.119358, 0.130019, 0.129782, 0.113508, 0.13434,
0.15162, 0.125768, 0.128473, 0.128056, 0.114403, 0.142878, 0.139192, 0.118033, 0.132616, 0.133043,
0.133973, 0.146611, 0.129792]
# Sort the pairs by x so the band is filled from left to right
coords = sorted(zip(x, y), key=lambda p: p[0])
x, y = zip(*coords)
(m, b) = np.polyfit(x, y, 1)
Y_Polyval = np.polyval([m, b], x)
# Constant half-width of the band, taken from the question's formula
CI = 1.96 * np.std(y) / np.mean(y)
plt.plot(x, Y_Polyval, linewidth=3, c="black")
plt.scatter(x, y, s=250, linewidths=2, zorder=2)
plt.fill_between(x, (Y_Polyval-CI), (Y_Polyval+CI), color='blue', alpha=0.1)
plt.show()
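Note that the CI value used here is a single constant, so the band has the same width everywhere. If what is wanted is the usual confidence band for the fitted mean of a simple linear regression, one possible sketch is the following. It is an alternative to the formula in the question, not part of the original answer; the function name regression_band and the parameter crit are made up for illustration. It assumes the textbook half-width crit * s * sqrt(1/n + (x - mean(x))^2 / Sxx), where s is the residual standard error, and it uses 1.96 as a normal approximation. For small samples a t quantile with n - 2 degrees of freedom (about 2.08 for the 23 points here, or scipy.stats.t.ppf(0.975, n - 2) if SciPy is available) would be a better choice.

```python
import numpy as np


def regression_band(x, y, crit=1.96):
    """Return sorted x, the fitted line, and lower/upper band limits.

    crit is the critical value; 1.96 is a normal approximation,
    a t quantile with n - 2 degrees of freedom is more accurate
    for small samples.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Sort by x so the band can be filled from left to right
    order = np.argsort(x)
    x, y = x[order], y[order]

    n = x.size
    m, b = np.polyfit(x, y, 1)
    fit = m * x + b

    # Residual standard error with n - 2 degrees of freedom
    s = np.sqrt(np.sum((y - fit) ** 2) / (n - 2))

    # Sum of squared deviations of x from its mean
    sxx = np.sum((x - x.mean()) ** 2)

    # Half-width of the band for the mean response at each x
    half = crit * s * np.sqrt(1.0 / n + (x - x.mean()) ** 2 / sxx)
    return x, fit, fit - half, fit + half
```

With the lists from the question, calling xs, fit, lower, upper = regression_band(x, y) and then plt.plot(xs, fit, c="black") followed by plt.fill_between(xs, lower, upper, color='blue', alpha=0.1) should draw a band that is narrowest near the mean of x and widens towards both ends.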