I have the following problem.
From the following starting matrix called "R", I want to derive a linear model for each "Territorio" (territory) that links two variables, "Morti_per_abitante" and "PL_per_abitante", across different years ("Anno").
To do this, I created an empty table called TT in which, for each territory, I can enter the values of the linear model (slope, or "Tasso", and significance, or "Significativita").
I then wrote a for loop that goes through the TT table and, for each of the 13 "Territorio" values, should fit the model and store its 2 parameters:
for (i in 1:13){
Ter<-TT[i:1]
filter(R, Territorio==Ter) %>%
Tasso[1:2]<-lm(Morti_per_abitante ~ PL_per_abitante,R)$coefficents[2]
}
The code fails starting at the filter line.
What can I do to correct the line of code?
Is there an alternative way to have the linear model for each Territory?
Thanks
CodePudding user response:
Here is a dplyr/broom approach. I am using a mock data frame, R.
library(broom)
library(dplyr)

# Mock data standing in for the real R
set.seed(123)
R <- data.frame(Territorio = rep(c("Lazio", "Puglia"), c(10, 5)),
                Anno = sample(2015:2020, size = 15, replace = TRUE),
                Morti_per_abitante = rnorm(15),
                PL_per_abitante = rnorm(15))

# Fit one model per territory, then keep only the slope and its p-value
R |>
  group_by(Territorio) |>
  group_modify(~{
    tidy(lm(Morti_per_abitante ~ PL_per_abitante, data = .x))
  }) |>
  filter(term == "PL_per_abitante") |>
  rename(Tasso = estimate, Significativita = p.value) |>
  select(Territorio, Tasso, Significativita)
#> # A tibble: 2 × 3
#> # Groups:   Territorio [2]
#>   Territorio  Tasso Significativita
#>   <chr>       <dbl>           <dbl>
#> 1 Lazio      -0.339           0.367
#> 2 Puglia      0.239           0.592
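As for the loop in the question, several things seem to go wrong at once: TT[i:1] selects columns 1 to i rather than one territory (probably TT[i, 1] was meant), the result of filter() is piped into an assignment, lm() is still fitted on the full R instead of the filtered rows, and $coefficents is misspelled, so it returns NULL. If you prefer to keep a for loop, here is a minimal base R sketch. It assumes TT has the columns Territorio, Tasso and Significativita (my guess at your layout) and reuses the same kind of mock R, so adapt the names to your real tables.

```r
# Mock data with the same columns as R in the question (values are made up).
set.seed(123)
R <- data.frame(
  Territorio = rep(c("Lazio", "Puglia"), c(10, 5)),
  Anno = sample(2015:2020, size = 15, replace = TRUE),
  Morti_per_abitante = rnorm(15),
  PL_per_abitante = rnorm(15)
)

# Assumed layout of TT: one row per territory, result columns start empty.
territori <- unique(R$Territorio)
TT <- data.frame(
  Territorio = territori,
  Tasso = NA_real_,
  Significativita = NA_real_
)

for (i in seq_along(territori)) {
  # Keep only the rows of the current territory.
  sub <- R[R$Territorio == territori[i], ]
  # Fit the model on the subset, not on the full R.
  fit <- lm(Morti_per_abitante ~ PL_per_abitante, data = sub)
  # The "PL_per_abitante" row of the coefficient table holds slope and p-value.
  co <- summary(fit)$coefficients
  TT$Tasso[i] <- co["PL_per_abitante", "Estimate"]
  TT$Significativita[i] <- co["PL_per_abitante", "Pr(>|t|)"]
}

TT
```

Using seq_along(territori) instead of a hard-coded 1:13 means the loop still works if the number of territories changes.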