I have grouped Area values, for each of which I can compute and plot regressions:
set.seed(123)
df <- data.frame(
Group = c(rep("A",8), rep("B",10), rep("C",7)),
Area = c(1,3,2,4,3,5,7,9, rnorm(10), sample(7)),
x = c(1:8,1:10,1:7)
)
library(ggplot2)
ggplot(df,
       aes(x = x, y = Area, group = factor(Group))) +
  geom_smooth(method = "lm", se = FALSE)
But what I'm looking for is how to compute and plot what could be called a 'grand' regression across all Area groups. Is this possible, and if so, how?
EDIT:
My guess is that it's not enough to simply drop the group variable and run a single model over all Area and all x values. That would treat the different groups as irrelevant. In fact, each group represents a distribution in its own right: consider each group as collecting the values of an independent event. What I need is a model that incorporates the distinction between the groups/events while at the same time summarizing over them.
CodePudding user response:
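Use two geom_smooth() layers and set the grouping aesthetic inside the layer that needs it, rather than in the top-level ggplot() call. The second layer has no grouping, so it is fitted to all points at once: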
set.seed(123)
df <- data.frame(
Group = c(rep("A",8), rep("B",10), rep("C",7)),
Area = c(1,3,2,4,3,5,7,9, rnorm(10), sample(7)),
x = c(1:8,1:10,1:7)
)
library(ggplot2)
ggplot(df, aes(x = x, y = Area)) +
  geom_smooth(aes(group = factor(Group)), method = "lm", se = FALSE) +
  geom_smooth()
#> `geom_smooth()` using formula 'y ~ x'
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
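As the second message shows, the ungrouped geom_smooth() falls back to a loess smoother, so the overall curve is not a linear regression. If a straight 'grand' line is wanted, one option is to pass method = "lm" to that layer as well. The sketch below does this and also fits the same pooled model with lm() so the coefficients can be inspected. The second model, Area ~ x + Group, is an assumption about what the EDIT in the question may be after (one shared slope with a separate intercept per group); it is only one possible reading, not the asker's or answerer's method, and the names pooled, common, and shared_slope are illustrative.

```r
library(ggplot2)

set.seed(123)
df <- data.frame(
  Group = c(rep("A", 8), rep("B", 10), rep("C", 7)),
  Area  = c(1, 3, 2, 4, 3, 5, 7, 9, rnorm(10), sample(7)),
  x     = c(1:8, 1:10, 1:7)
)

# Pooled straight-line fit: ignores Group entirely.
pooled <- lm(Area ~ x, data = df)

# Assumed alternative: one slope shared by all groups,
# with a separate intercept for each group.
common <- lm(Area ~ x + Group, data = df)
shared_slope <- coef(common)[["x"]]

p <- ggplot(df, aes(x = x, y = Area)) +
  # One linear fit per group.
  geom_smooth(aes(group = Group, colour = Group),
              method = "lm", formula = y ~ x, se = FALSE) +
  # One linear fit over all points (same model as `pooled`).
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, colour = "black")
print(p)
```

Calling coef(pooled) gives the intercept and slope of the black line, while shared_slope is the within-group slope after allowing each group its own level. Which of the two counts as the 'grand' regression depends on how the groups are meant to be summarized.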