I need to draw 500 samples of N = 50 each and fit a simple linear regression model, with X1 as the IV and y as the DV, to each sample. Then I have to plot the fitted lines from all of these samples on top of one earlier sample data frame, which I've called sample.dat.
This is my code so far:
p <- ggplot(sample.dat, mapping = aes(x = X1, y = y)) +
  geom_point(sample.dat, mapping = aes(x = X1, y = y)) +
  #geom_line(data = lm.fit.dat, aes(y = true.y.fit), color = "blue")
  theme_bw()
for(i in 1:500){
  df = linear.dat[sample(1:nrow(linear.dat), size = 50),]
  g = p + geom_smooth(method = lm, data=df, color="red", size=0.5, alpha = 0) +
    geom_line(data = lm.fit.dat, aes(y = true.y.fit), color = "blue")
}
plot(g)
And this is my output:
As you can see, I only get one red line, when I want 500 red lines, one for each of the 500 samples.
Answer:
The big problem is that in the loop you're adding the new line to p, but g isn't accumulating the lines. Each iteration overwrites g with p plus a single new line, so at the end of the loop g will only have the most recently generated line. To accumulate the lines you have to do g <- g + ..., that is, you have to add the current line to all of the previous lines. Something like this should work:
g <- ggplot(sample.dat, mapping = aes(x = X1, y = y)) +
  geom_point() +
  theme_bw()
for(i in 1:500){
  df = linear.dat[sample(1:nrow(linear.dat), size = 50),]
  g = g +
    geom_smooth(method = lm, data=df, color="red", size=0.5, alpha = 0)
}
plot(g)
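If adding 500 separate layers turns out to be slow, one alternative (a sketch, not part of the approach above) is to fit the 500 models first, keep only their intercepts and slopes, and draw them all with a single geom_abline() layer. The question does not show linear.dat or sample.dat, so the data below is simulated purely as a stand-in. The seed, the true coefficients, the fits data frame and the alpha value are all assumptions made for illustration.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Stand-in data (assumption): replace with the real linear.dat and sample.dat
linear.dat <- data.frame(X1 = rnorm(1000))
linear.dat$y <- 2 + 3 * linear.dat$X1 + rnorm(1000)
sample.dat <- linear.dat[sample(nrow(linear.dat), size = 50), ]

# Fit one regression per resample of 50 rows; keep intercept and slope only
coefs <- t(replicate(500, {
  df <- linear.dat[sample(nrow(linear.dat), size = 50), ]
  coef(lm(y ~ X1, data = df))
}))
fits <- data.frame(intercept = coefs[, 1], slope = coefs[, 2])

# One layer draws all 500 fitted lines; points go on top so they stay visible
g <- ggplot(sample.dat, aes(x = X1, y = y)) +
  geom_abline(data = fits, aes(intercept = intercept, slope = slope),
              color = "red", alpha = 0.2) +
  geom_point() +
  theme_bw()
print(g)
```

Note that geom_abline() extends each line across the whole panel, whereas geom_smooth() stops at the x-range of its own sample, so the two plots will not look identical at the edges.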