Compute and plot 'grand' regression over multiple smaller regressions

Time:06-29

I have grouped Area values, for each of which I can compute and plot regressions:

set.seed(123)
df <- data.frame(
  Group = c(rep("A",8), rep("B",10), rep("C",7)),
  Area = c(1,3,2,4,3,5,7,9, rnorm(10), sample(7)),
  x = c(1:8,1:10,1:7)
)

library(ggplot2)
ggplot(df,
       aes(x = x, y = Area, group = factor(Group))) +
  geom_smooth(method = "lm", se = FALSE)


But what I'm looking for is how to compute and plot what could be called a 'grand' regression across all Area groups. Is this possible, and if so, how?

EDIT: My guess is that it's not enough to simply drop the group variable and run a single model over all Area and x values. That would treat the groups as irrelevant, when in fact each group represents a distribution in its own right. Think of each group as collecting the values of an independent event. What I need is a model that preserves the distinction between the groups/events while at the same time summarizing over them.

Answer:

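Use two geom_smooth() layers and set the grouping aesthetic per layer rather than globally: the first layer is grouped by Group, and the second is left ungrouped so that it is fitted to all points at once. Note that the second layer, as written, uses the default loess smoother rather than a straight line (see the messages in the output).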

set.seed(123)
df <- data.frame(
  Group = c(rep("A",8), rep("B",10), rep("C",7)),
  Area = c(1,3,2,4,3,5,7,9, rnorm(10), sample(7)),
  x = c(1:8,1:10,1:7)
)

library(ggplot2)
ggplot(df, aes(x = x, y = Area)) +
  geom_smooth(aes(group = factor(Group)), method = "lm", se = FALSE) +
  geom_smooth()
#> `geom_smooth()` using formula 'y ~ x'
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
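
As the messages show, the ungrouped geom_smooth() falls back to a loess curve. If a straight 'grand' line is wanted instead, one option is to pass method = "lm" to the second layer as well. This is only a sketch of that variant; the dashed black styling is an arbitrary choice for illustration, not part of the original answer.

```r
library(ggplot2)

set.seed(123)
df <- data.frame(
  Group = c(rep("A", 8), rep("B", 10), rep("C", 7)),
  Area = c(1, 3, 2, 4, 3, 5, 7, 9, rnorm(10), sample(7)),
  x = c(1:8, 1:10, 1:7)
)

p <- ggplot(df, aes(x = x, y = Area)) +
  # one straight regression line per group
  geom_smooth(aes(group = Group), method = "lm", formula = y ~ x, se = FALSE) +
  # one straight regression line over all points, ignoring Group
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE,
              colour = "black", linetype = "dashed")
p
```

This pooled line still ignores group membership, so on its own it does not address the concern raised in the question's EDIT.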

Created on 2022-06-29
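
Regarding the EDIT in the question: one possible way to keep the groups distinct while still summarizing over them is to include Group in the model. The sketch below assumes a common slope with a separate intercept per group (lm(Area ~ x + Group)), then averages the group-wise predictions at each x to obtain a single 'grand' line. The common-slope assumption and the equal weighting of groups are choices made here for illustration; they are not taken from the question, and the names fit, grid and grand are hypothetical.

```r
library(ggplot2)

set.seed(123)
df <- data.frame(
  Group = c(rep("A", 8), rep("B", 10), rep("C", 7)),
  Area = c(1, 3, 2, 4, 3, 5, 7, 9, rnorm(10), sample(7)),
  x = c(1:8, 1:10, 1:7)
)
df$Group <- factor(df$Group)

# Assumption: one shared slope for x, one intercept per group
fit <- lm(Area ~ x + Group, data = df)

# Predict for every group at every x, then average over groups
# (each group gets equal weight, regardless of its size)
grid <- expand.grid(x = 1:10, Group = levels(df$Group))
grid$pred <- predict(fit, newdata = grid)
grand <- aggregate(pred ~ x, data = grid, FUN = mean)

ggplot(df, aes(x = x, y = Area)) +
  geom_point(aes(colour = Group)) +
  # per-group regressions, as in the question
  geom_smooth(aes(group = Group, colour = Group),
              method = "lm", formula = y ~ x, se = FALSE) +
  # group-adjusted 'grand' line
  geom_line(data = grand, aes(x = x, y = pred))
```

A mixed-effects model with a random intercept (or slope) per group would be another route, although with only three groups the between-group variance is hard to estimate reliably.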
